Newsgroups: comp.lang.lisp, sci.math.num-analysis Followup-To: sci.math.num-analysis From: hoey@zogwarg.etl.army.mil (Dan Hoey) Date: 29 Jul 91 22:32:44 GMT Subject: Re: random selection w/o replacement The key point here is to manage the non-replacement by exchanging the selected elements with elements at one end of the sequence. That way the unselected elements form a contiguous subsequence, which you can index into with (random number-remaining). That way, if your sequence is a vector, you take time linear in the number selected. If you have to do this a lot of times, you can use the vector over and over again. It just gets more and more randomized. And of course if you select all the elements, you've got a shuffle. (defun select-without-replacement (n eseq &aux (len (length eseq))) (dotimes (used n (subseq eseq (- len n))) (rotatef (elt eseq (random (- len used))) (elt eseq (- len used 1))))) >if the number of items you're selecting is << than the length of the >sequence, then the following is simple and probably okay: >(defun randum[sic]-elements (n eseq) > (do ((len (length eseq)) > (result nil)) > ((= n (length result)) result) > (pushnew (elt eseq (random len)) result))) The I'th element into result will take Theta(I + I^2/LEN) time on average, so the total time will be Theta(N^2 + N^3/LEN). Provided, of course, that ESEQ is a vector. >... This can be improved by doing the member test explic[i]tly, >thereby avoiding taking the length of result on each iteration.... Pretty minimal speedup, since pushnew (or member) has to look at the whole list anyway. >A better algorithm is to copy the sequence into an array (fast, no >garbage), randomize the array (slow?, no garbage), and then take the >first N values out of the array (fast, no garbage). As I've shown, you only need to randomize the part of the array you are going to use. >I tend to swap the 0th element with a random element a large number >of times.... Now that is just wrong. Worse, that is a common beginner's mistake for shuffling a deck of cards, and you should get over it before you start giving advice on the subject. Your method will randomize the first element of the array, but you will never randomize the rest of the array (unless LEN<=2). It's easy to see you can't get a random shuffle, because the the probability of each outcome is a terminating fraction in base LEN arithmetic. The way to shuffle a deck of card is to--guess what--select without replacement (until you reach the end then stop). The above function will do it. In some languages it's easier to collect the selected items at the beginning of the array, and that's a reader's exercise. But remember: if you keep calling random with the same argument, you should start thinking about how that has got to lose. Followups belong in an algorithms group: this is not language-specific. I guess sci.math.num-analysis is about it. Dan Hoey Hoey@AIC.NRL.Navy.Mil