False Nearest Neighbors strange behavior

Posted: Mon Jan 14, 2013 17:40
by Alveuz
Hey there,

I am currently working with CRP, trying to reconstruct the phase space of financial time series, but I am obtaining weird results. I will try to explain clearly.
To obtain a delay-embedding coordinate representation [Takens, 1981] of the financial time series, we are using Mutual Information (MI) and False Nearest Neighbors (FNN); the implementation is based on CRP, running in the MATLAB 2009b environment. As far as we understand, MI and FNN are deterministic methods: as long as the same data and the same parameters are used, they must provide exactly the same results, right? This expectation makes CRP's FNN results difficult to understand. So here are our questions:
1. First, why would the MAX DIMENSION parameter affect the embedding dimension obtained? FNN flags as a false nearest neighbor any neighbor that fails either of two criteria: the Euclidean distance between the two vectors must not grow beyond a threshold R_tol when going from one embedding dimension to the next, and their distance relative to the standard deviation of the data must not exceed a threshold R_accept. We do not understand why we do not obtain the same embedding dimension when we change the MAX DIMENSION parameter (keep in mind that we always increase this parameter, within the range 20-50).
2. Even when using exactly the same parameters (DATA, MAX DIMENSION, TIME DELAY, NEIGHBOURHOOD CRITERION, NEIGHBOURHOOD SIZE, 'EUCLIDIAN DISTANCE'), running the program several times gives several different results for the correct embedding dimension.

We would appreciate any information about this issue.
Thanks in advance

Guillermo Santamaria

Re: False Nearest Neighbors strange behavior

Posted: Tue Feb 5, 2013 18:23
by Norbert
Hi,

the answer is simple. Because Matlab code needs a long time to calculate the false nearest neighbours, not all points are used but only a subset of 200 points. These points are chosen randomly, which causes slightly different results when this function is called several times. This caveat is mentioned in the help text of the function. If you need a calculation using all points, simply specify it:
Y=fnn(X,M,T,R,S,N)
where N should then be the length of the time series. With this setting, the parameter max-dim will also no longer have any impact on the number of false nearest neighbours, as this was likewise only an effect of the randomly drawn samples.

In most cases the results do not differ much, because the overall drop in the number of false nearest neighbours with increasing dimension is largely unaffected by the sampling. For a rough estimate of the embedding dimension, the sampled calculation is sufficient.
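
The effect of the random subset can be illustrated with a minimal Python sketch. This is not the CRP toolbox code: the names fnn_fraction, henon, n_samples and seed are hypothetical, only the first (distance-ratio) test of Kennel et al. is applied, and the Hénon map merely stands in for real data. With n_samples left at None every point is used as a reference point, so repeated calls agree and the values for a given dimension do not depend on max_dim. With a random subset, the values can change from call to call.

```python
import numpy as np

def fnn_fraction(x, max_dim, tau=1, r_tol=10.0, n_samples=None, seed=None):
    """Fraction of false nearest neighbours for dimensions 1..max_dim.

    Hypothetical sketch: only the distance-ratio test is applied.
    n_samples=None uses every point as reference; otherwise a random
    subset of reference points is drawn once and reused for all dimensions.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    # Indices that are valid reference points for every dimension up to max_dim.
    n_valid = len(x) - max_dim * tau
    if n_samples is not None and n_samples < n_valid:
        sample = rng.choice(n_valid, size=n_samples, replace=False)
    else:
        sample = None
    fractions = []
    for d in range(1, max_dim + 1):
        n = len(x) - d * tau  # vectors for which x[i + d*tau] exists
        # Delay vectors [x(i), x(i+tau), ..., x(i+(d-1)*tau)] as rows.
        emb = np.column_stack([x[k * tau:k * tau + n] for k in range(d)])
        ref = np.arange(n) if sample is None else sample
        n_false = 0
        for i in ref:
            dist = np.linalg.norm(emb - emb[i], axis=1)
            dist[i] = np.inf  # exclude the point itself
            j = int(np.argmin(dist))  # nearest neighbour in dimension d
            # Extra separation contributed by the (d+1)-th coordinate.
            extra = abs(x[i + d * tau] - x[j + d * tau])
            if extra > r_tol * dist[j]:
                n_false += 1
        fractions.append(n_false / len(ref))
    return np.array(fractions)

def henon(n, a=1.4, b=0.3):
    """x-coordinate of the Henon map, used here only as test data."""
    x, y = 0.0, 0.0
    out = np.empty(n)
    for i in range(n):
        x, y = 1.0 - a * x * x + y, b * x
        out[i] = x
    return out

series = henon(400)
# All points: deterministic, and independent of max_dim for a given dimension.
print(fnn_fraction(series, max_dim=4))
print(fnn_fraction(series, max_dim=6)[:4])
# Random subsets of 50 reference points: may differ between seeds.
print(fnn_fraction(series, max_dim=4, n_samples=50, seed=1))
print(fnn_fraction(series, max_dim=4, n_samples=50, seed=2))
```

In this sketch the subset is drawn from the indices that remain valid up to max_dim, so changing max_dim also changes which points can be sampled. That is one plausible way for max_dim to influence a sampled result, under the assumptions stated above.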

If you need a faster implementation using all data points, you might think about a C implementation (e.g., TISEAN should contain one).

I hope this clarifies this issue.

Best regards
Norbert

Re: False Nearest Neighbors strange behavior

Posted: Sun Feb 10, 2013 21:32
by viniciusjdv
Prof. Norbert,
I would like to know what the parameters R and S of the function fnn mean.

Thanks

Re: False Nearest Neighbors strange behavior

Posted: Thu Feb 21, 2013 21:15
by Norbert
Please have a look at
Kennel, M. B., Brown, R., Abarbanel, H. D. I.: Determining embedding dimension for phase-space reconstruction using a geometrical construction, Phys. Rev. A, 45, 1992.
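
For orientation, the two tests described in that paper can be written as below. The notation follows the paper rather than the fnn code, and how these thresholds correspond to R and S is an assumption that should be checked against the help text of the function. Here R_d(n) is the distance between point n and its nearest neighbour (index r) in embedding dimension d, T is the time delay, and R_A is the standard deviation of the data.

```latex
% Distance after adding the (d+1)-th delay coordinate
R_{d+1}^2(n) = R_d^2(n) + \left[ x(n + dT) - x^{(r)}(n + dT) \right]^2

% Test 1: relative growth of the distance exceeds R_tol
\frac{\left| x(n + dT) - x^{(r)}(n + dT) \right|}{R_d(n)} > R_{\mathrm{tol}}

% Test 2: distance is large compared with the size of the attractor
\frac{R_{d+1}(n)}{R_A} > A_{\mathrm{tol}},
\qquad
R_A^2 = \frac{1}{N} \sum_{n=1}^{N} \left[ x(n) - \bar{x} \right]^2
```

A nearest neighbour is counted as false if either test is met.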