It makes no difference whether you consider the numbers or the balls themselves. Why would the drawn numbers be independent? Because the drawn balls are independent. Which ball is drawn is determined purely by chance. A lottery machine is not as simple a system as you supposed. It is not only the mechanical problem of the machine itself (the movement of one ball); the process of drawing a certain ball is also influenced by many other factors, such as temperature, slight differences in the composition of the balls, the time between draws, etc. This means the entire system has a very large number of degrees of freedom, which ensures the randomness of which ball, and hence which number, is drawn. A fundamental property of such complex systems is that a very small deviation in the initial state causes a very large deviation after some time. This is why such systems are not predictable, not even with a recurrence analysis. True, such systems return to the neighbourhood of the initial state after some time, but never exactly; there will always be a finite deviation. And in a lottery it makes a rather big difference if the outcome is only "almost the same", i.e. if ball "B" is drawn instead of ball "A".
By the way: systems with so many degrees of freedom need a very high embedding dimension. Let us guess that the number of degrees of freedom of such a lottery system is not below 100 (in reality it is much larger); then the embedding dimension should be at least 200, i.e. about twice the number of degrees of freedom.
And please note: delay embedding increases the correlation between the phase space vectors, because neighbouring vectors share components. If you use such an embedding for prediction, you can get wrong results.
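To make the point about embedding-induced correlation concrete, here is a minimal sketch in Python (standard library only). It is not taken from any of the posts discussed here; the helper names `delay_embed`, `pearson` and `neighbour_distance_correlation`, the series length, the seed and the offset of 50 are all assumptions chosen for illustration. It takes pure white noise, computes the distance between pairs of phase space vectors a fixed offset apart, and checks how strongly that distance is correlated with the distance of the next pair along the same diagonal.

```python
import math
import random


def delay_embed(series, dim, delay):
    """Build delay vectors (u_i, u_{i+delay}, ..., u_{i+(dim-1)*delay})."""
    n = len(series) - (dim - 1) * delay
    return [tuple(series[i + k * delay] for k in range(dim)) for i in range(n)]


def pearson(a, b):
    """Plain Pearson correlation coefficient of two equally long lists."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def neighbour_distance_correlation(series, dim, delay, offset=50):
    """Correlate d(x_i, x_{i+offset}) with d(x_{i+delay}, x_{i+delay+offset})."""
    vectors = delay_embed(series, dim, delay)
    d = [math.dist(vectors[i], vectors[i + offset])
         for i in range(len(vectors) - offset)]
    return pearson(d[:-delay], d[delay:])


rng = random.Random(0)  # fixed seed, only for reproducibility
noise = [rng.random() for _ in range(3000)]  # independent values, no structure

corr_plain = neighbour_distance_correlation(noise, dim=1, delay=1)
corr_embedded = neighbour_distance_correlation(noise, dim=3, delay=1)
print(f"no embedding:      {corr_plain:+.2f}")
print(f"dim=3, delay=1:    {corr_embedded:+.2f}")
```

Without embedding the distances along a diagonal are essentially uncorrelated. With dimension 3 and delay 1, consecutive vectors share two of their three components, so the distances become clearly correlated although the underlying data are pure noise. In a recurrence plot this shows up as spurious diagonal structures.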
I have now had a look at the posts about "Lottery Prediction by Recurrence Analysis" at
http://www.lotterypost.com and at the material on the web site
http://zarnia.250free.com. I think I should write you an email with some comments. But here I would like to say that I agree with Eugene's comments. You will not succeed in predicting lottery numbers, not even with any kind of recurrence analysis. I hope that Eugene succeeded in demonstrating this by means of the recurrence analysis.
Eugene provided a prediction algorithm based on recurrences. I do not know how this algorithm works, but I am quite sure that it cannot predict exact numbers, only an interval of numbers.
You mentioned that the best results occurred for a delay equal to the number of numbers drawn per draw. This is trivial. As Eugene already stated, the numbers within one draw are ordered. Therefore, if you plot the drawn numbers sequentially, you will see a sawtooth whose period equals the number of numbers per draw. The highest autocorrelation is found at delays that are exact multiples of this period. I guess this is also why you find a phase space reconstruction (if we can call it a phase space at all) that seems to be composed of triangles: it merely reflects the ordered structure within the data.
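The sawtooth effect is easy to reproduce with simulated draws. The following is a rough sketch in Python (standard library only), assuming a 6-out-of-49 lottery; the function names `simulate_sorted_draws` and `autocorrelation`, the number of draws and the seed are hypothetical choices for illustration, not part of anyone's actual analysis. The draws are fully random and independent; only the sorting within each draw is retained.

```python
import random


def simulate_sorted_draws(n_draws, n_balls=49, n_drawn=6, seed=0):
    """Concatenate independent random draws, each sorted in ascending order."""
    rng = random.Random(seed)
    series = []
    for _ in range(n_draws):
        series.extend(sorted(rng.sample(range(1, n_balls + 1), n_drawn)))
    return series


def autocorrelation(series, lag):
    """Sample autocorrelation at the given lag (biased estimator)."""
    n = len(series)
    mean = sum(series) / n
    var = sum((x - mean) ** 2 for x in series)
    cov = sum((series[t] - mean) * (series[t + lag] - mean)
              for t in range(n - lag))
    return cov / var


series = simulate_sorted_draws(500)
acf = {lag: autocorrelation(series, lag) for lag in range(1, 13)}
for lag, value in acf.items():
    print(f"lag {lag:2d}: {value:+.2f}")
```

The autocorrelation peaks at lags 6 and 12, i.e. at multiples of the number of numbers per draw, even though the simulated draws contain no information about each other. The "best" delay therefore only reflects the ordering of the data.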
However, if you still feel like pursuing the analysis of lottery numbers with the recurrence plot approach, let me suggest the following points, which are more interesting or more appropriate:
- use the numbers of one draw as a phase space vector; the dimension is then the number of drawn balls and delay=1 (this avoids the correlations between the phase space vectors that the embedding introduces)
- create a phase space whose dimension is the number of all available numbers (e.g. 49) and build the phase space vectors from the draws such that each component corresponding to a drawn number is set to one and all others to zero; e.g. if the first draw was 1,5,12,28,33,45, then all components of the first phase space vector are zero except components 1, 5, 12, 28, 33 and 45.
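
The two constructions suggested above could be sketched as follows in Python (standard library only). This is only an illustration under assumptions: the function names, the Euclidean distance and the thresholds `eps` are my own choices, and apart from the first draw (the example given above) the draws are made up.

```python
import math


def draw_vector(draw):
    """First suggestion: one draw is one phase space vector (sorted numbers)."""
    return tuple(sorted(draw))


def indicator_vector(draw, n_numbers=49):
    """Second suggestion: 0/1 vector; the component of number k is index k - 1."""
    vector = [0] * n_numbers
    for number in draw:
        vector[number - 1] = 1
    return vector


def recurrence_matrix(vectors, eps):
    """R[i][j] = 1 if vectors i and j are closer than eps (Euclidean), else 0."""
    return [[1 if math.dist(a, b) <= eps else 0 for b in vectors]
            for a in vectors]


draws = [
    (1, 5, 12, 28, 33, 45),   # the example draw from the text
    (2, 9, 17, 23, 40, 48),   # made-up draw sharing no number with the first
    (1, 5, 12, 28, 33, 44),   # made-up draw sharing five numbers with the first
]

# eps values are assumptions; in practice they must be tuned to the data.
R_numbers = recurrence_matrix([draw_vector(d) for d in draws], eps=5.0)
R_indicator = recurrence_matrix([indicator_vector(d) for d in draws], eps=1.5)

for row in R_indicator:
    print(row)
```

In the indicator representation the squared Euclidean distance between two draws is twice the number of numbers they do not share, so the threshold directly controls how many common numbers count as a recurrence. In both representations every vector is a separate draw, so no components are shared between vectors.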