Date: Wed, 19 Mar 2003 09:45:40 -0500
From: "Michael Walsh" <MJW at mail.press.jhu.edu>
To: <WSFAlist at keithlynch.net>
Subject: [WSFA] Re: Predicting who will be at WSFA meetings
Reply-To: WSFA members <WSFAlist at keithlynch.net>

>ronkean at juno.com 03/19/03 03:46AM
>>On Sat, 15 Mar 2003 14:54:27 -0500 (EST) "Keith F. Lynch"
><kfl at KeithLynch.net> writes:
>...  With 143 people who have attended
>> three or more of 192 meetings I have data for, a total of 5149 data
>> points, I finally have a fair amount of data to play with.
>
>You presumably mean that 5149 is the cumulative recorded attendance over
>that time period (but just among those 143 people), much the same way
>that an airline might report carrying a million passengers in a year,
>even though many of those million were repeat customers during the year.
>It occurs to me that an instance of non-attendance is also a data point,
>so there would be 27,456 (143 x 192) data points in all, at least in the
>sense that a tabular representation of the data would have 27,456 cells
>in the table.  Also, it seems that the average attendee who had been to
>at least three meetings during the period went to only about 19% of the
>meetings (5149 / 27,456).
>
>Doubtless you will be able to construct some formula which has predictive
>value, one that has inputs such as past attendance with time weighting,
>weather and traffic conditions, meeting location, competing events, time
>of year, time of the month and proximity to major holidays, etc.  But
>each attendee is different; for example one person might be more likely
>to come to a meeting when it is raining, and another may be less likely
>to show up when it is raining.  So it seems like you would have to come up
>with 143 formulas.  Perhaps a good way to tackle the problem would be to
>first develop a single formula to predict the number of attendees at a
>meeting.  With the large amount of data overall, that analysis would be
>most sensitive to subtle factors, much more so than the data for any one
>individual, and so would be a good way to identify and quantify at least
>some of the relevant factors.
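
For anyone who wants to check the arithmetic quoted above, here is a minimal sketch in Python. The three input figures come straight from the thread; the variable names are only illustrative, and the absence count and per-person average are derived here rather than stated in the original messages.

```python
# Figures quoted in the thread.
people = 143          # people who attended three or more meetings
meetings = 192        # meetings with recorded data
attendances = 5149    # recorded attendances among those people

cells = people * meetings            # every person/meeting pair, present or absent
absences = cells - attendances       # instances of non-attendance (derived)
rate = attendances / cells           # share of cells marked "attended"
per_person = attendances / people    # average meetings attended per person (derived)

print(cells)                 # 27456
print(absences)              # 22307
print(f"{rate:.1%}")         # 18.8%
print(f"{per_person:.1f}")   # 36.0
```

The 18.8% figure rounds to the "about 19%" given in the quoted message.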

"Paging Hari Sheldon . . . paging Hari Sheldon . . . .  you are needed . . =
."

mjw