Date: 29 Mar 1987 19:20:53 EST (Sun)
From: Dan Hoey <hoey@nrl-aic.ARPA>
Subject: Re: Randomness
To: KFL@AI.AI.MIT.EDU

Keith,
  Sorry I didn't respond to your request for ``plane-filling''
distributions.  I have some answers for that, too.  But just in case
you missed it, this just showed up on PHYSICS....

. . .

I wonder if anyone is considering gatewaying sci.math to Internet.
Maybe the CSNET guy who recently took over the sci.math.symbolic
list....

Anyway, to your new problem:

  Date: Sun, 29 Mar 87 17:56:20 EST
  From: "Keith F. Lynch" <KFL@AI.AI.MIT.EDU>

  ...It is no longer important that the random variables have a flat
  distribution.  What is important is that:

  1) The expectation of each random variable be zero, i.e. the mean
     is zero.

This is just normalization, since the correlation coefficient of X wrt Y
is unchanged if we replace X with X-E(X).

  2) The autocorrelation of each random variable is zero, i.e. the power
     spectral density is flat out to the Nyquist limit.

I'll need to look this stuff up to understand it.  I'm working from
home, and my references are at work.  You say it works for uniform on
[-1,1], so I'll mostly just assume that.  Later, you consider uniform
discrete {-1,1} -- if that's acceptable, it may simplify the problem.

  3) The variance (expectation of the square) be as specified.  If
     you can make it 1, I can scale it.

For uniform X on [-1,1], E(X^2) = 1/3, so I guess you are scaling
already.  And come to think of it, the correlation coefficient is
invariant under (positive) scaling of either RV, so this again doesn't
make the problem harder.
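
A quick numerical check of the two normalization remarks, as a minimal
sketch: the helper corr() and the particular correlated pair (xs, ys)
are made up for illustration and are not part of the original problem.

```python
import random

def corr(xs, ys):
    """Sample correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

rng = random.Random(42)
xs = [rng.uniform(-1.0, 1.0) for _ in range(100_000)]
# Hypothetical partner variable: X plus independent uniform noise.
ys = [x + rng.uniform(-1.0, 1.0) for x in xs]

# E(X^2) for uniform X on [-1,1]; should come out near 1/3.
second_moment = sum(x * x for x in xs) / len(xs)

base = corr(xs, ys)
shifted = corr([x + 5.0 for x in xs], ys)        # replace X by X + const
scaled = corr([3.0 ** 0.5 * x for x in xs], ys)  # sqrt(3)*X has variance 1
```

Here base, shifted and scaled agree up to rounding, which is the
invariance being relied on.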

  4) There may be ANY NUMBER of random variables, and the correlations
     between them must be as specified....

This is interesting.  You might also consider the N-way correlation
coefficient, which I think is
       E(product[i=1,...,n](X[i]))
  -------------------------------------
  (product[i=1,...,n](var(X[i])))^(1/2)
but that might be difficult.

  Your solution solved the problem just fine for two random variables.
  But I can't find a way to generalize it to three or more random
  variables.

Here's another solution.  Suppose you have pairs of RV's, (X1,Y1) and
(X2,Y2), all with mean zero, such that var(X1) = var(X2) and var(Y1) =
var(Y2).  Let (X,Y) = (X1,Y1) with probability p and (X2,Y2) with
probability (1-p).  Then the correlation coefficients satisfy:
  CC(X,Y) = p CC(X1,Y1) + (1-p) CC(X2,Y2).
This could work with the original correlation-1 and correlation-(-1)
RV-pairs to solve your original problem.  And it could be adulterated
with your correlation-0 RV-pair to solve the problem of the RV-pair
covering the square.  I nearly sent you this, but I was beginning to
suspect I would then be asked for an RV-pair with a continuous, or at
least nonsingular, distribution over the square.  I played with this
some, and I think it shouldn't be too hard, but my calculus is pretty
rusty.
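
The mixing construction can be sketched as follows, assuming the
correlation-1 pair is (U,U) and the correlation-(-1) pair is (U,-U)
with U uniform on [-1,1].  The names corr() and mixed_pair() are
illustrative only.

```python
import random

def corr(pairs):
    """Sample correlation coefficient of a list of (x, y) pairs."""
    n = len(pairs)
    mx = sum(x for x, _ in pairs) / n
    my = sum(y for _, y in pairs) / n
    sxy = sum((x - mx) * (y - my) for x, y in pairs)
    sxx = sum((x - mx) ** 2 for x, _ in pairs)
    syy = sum((y - my) ** 2 for _, y in pairs)
    return sxy / (sxx * syy) ** 0.5

def mixed_pair(p, rng):
    """Draw (X, Y): (U, U) with probability p, otherwise (U, -U)."""
    u = rng.uniform(-1.0, 1.0)
    return (u, u) if rng.random() < p else (u, -u)

rng = random.Random(1987)
p = 0.8
sample = [mixed_pair(p, rng) for _ in range(200_000)]
estimate = corr(sample)
# Predicted by the mixing formula: p*(+1) + (1-p)*(-1) = 2p - 1.
target = p * 1 + (1 - p) * (-1)
```

To hit a wanted correlation r in [-1,1] this way, take p = (1+r)/2.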

Anyway, this could solve many instances of the N-variable case, and it
may be that it could solve all the ones that have solutions.  We form
a basis of RV-tuples of the form B[1],...,B[N], with B[1] uniform on
[-1,1], and B[2],...,B[N] all either B[1] or -B[1].  There are 2^(N-1)
such basis tuples.  Take the entries in the correlation matrix for such
a tuple as coordinates in N^2-dimensional Euclidean space.  Likewise
map the matrix for our target distribution.  If the target point
lies within the convex hull of the basis points, then the target point
can be expressed as an interpolant of basis points, and the target
distribution can be interpolated from the basis distributions.  (An
interpolant of A1,A2,...,Ak is R1 A1 + R2 A2 + ... + Rk Ak with all Ri
in [0,1] and the Ri summing to 1.)
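
For N = 3 the interpolation weights can be written down directly, since
there are four basis tuples (1,s2,s3) and four linear conditions.  The
sketch below covers only that special case; larger N would need a
general linear-programming step that is not shown.  The function names
are illustrative.

```python
import itertools
import random

def basis_weights(a, b, c):
    """Weights on sign patterns (1, s2, s3) that reproduce
    cor(X1,X2)=a, cor(X1,X3)=b, cor(X2,X3)=c  (N = 3 only).
    Returns None if a weight is negative, i.e. the target
    lies outside the convex hull of the basis points."""
    weights = {}
    for s2, s3 in itertools.product((1, -1), repeat=2):
        w = (1 + a * s2 + b * s3 + c * s2 * s3) / 4
        if w < -1e-12:
            return None
        weights[(1, s2, s3)] = max(w, 0.0)
    return weights

def sample_tuple(weights, rng):
    """Pick a basis tuple by weight, then return (B1, s2*B1, s3*B1)."""
    signs = rng.choices(list(weights), weights=list(weights.values()))[0]
    u = rng.uniform(-1.0, 1.0)
    return tuple(s * u for s in signs)

w = basis_weights(0.5, 0.2, 0.1)
rng = random.Random(0)
points = [sample_tuple(w, rng) for _ in range(100_000)]
```

With these weights the four conditions (sum to 1, and the three
pairwise correlations) hold exactly.  As a contrasting case,
basis_weights(-0.4, -0.4, -0.4) returns None, so that target lies
outside the hull of these four basis points.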

  I came up with a completely different method for doing this....
  I then take the square root of [the correlation coefficient]
  matrix.  If I multiply a vector of uncorrelated random variables
  which always equal -1 or +1 by the matrix, I will get a vector of
  correlated random variables with a variance of 1 and a mean of 0,
  which is just what I want.  Unfortunately this only works for two
  (or one) random variables.  I don't understand why it works at all,
  so I don't understand why it fails for three or more random variables.

I don't know why it works, either.

  It occurs to me that there must be constraints on what correlations
  three random variables can have.  For instance if cor(XY) is 1, then
  cor(YZ) must be equal to cor(XZ) (I think).

Yes, since cor(XY)=1 => prob(X=Y)=1 once X and Y are normalized to the
same mean and variance.  This is related to the inequality (Schwarz's?)
that implies that correlation coefficients lie in [-1,1].

  Is there any general rule?

Probably we need to examine the inequality more closely.  I'll do it if
I find the time, or maybe you should.  I know there is a
multiple-variable analogue of the inequality.
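
One standard constraint of this kind (whether it is the analogue meant
above is an assumption) is that the correlation matrix must be positive
semidefinite.  For three variables with pairwise correlations a, b, c
that reduces to each lying in [-1,1] and the determinant
1 + 2abc - a^2 - b^2 - c^2 being nonnegative.  A minimal sketch, with
an illustrative function name:

```python
def feasible_triple(a, b, c, tol=1e-12):
    """True if [[1,a,b],[a,1,c],[b,c,1]] is positive semidefinite,
    checked through its principal minors (3x3 case only)."""
    pairwise_ok = all(abs(r) <= 1 + tol for r in (a, b, c))
    det = 1 + 2 * a * b * c - a * a - b * b - c * c
    return pairwise_ok and det >= -tol

# With a = cor(XY) = 1 the determinant is -(b - c)^2, so b must equal c.
same = feasible_triple(1.0, 0.3, 0.3)
different = feasible_triple(1.0, 0.3, 0.5)
```

Here same is True and different is False, matching the observation
that cor(XY) = 1 forces cor(YZ) = cor(XZ).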

Let me know what you find out.  Maybe you should ask
karl%haddock.uucp@seismo.css.gov to copy you on the sci.math discussion.

Dan