Date: Wed, 19 Mar 2003 09:45:40 -0500
From: "Michael Walsh" <MJW at mail.press.jhu.edu>
To: <WSFAlist at keithlynch.net>
Subject: [WSFA] Re: Predicting who will be at WSFA meetings
Reply-To: WSFA members <WSFAlist at keithlynch.net>

>ronkean at juno.com 03/19/03 03:46AM >
>On Sat, 15 Mar 2003 14:54:27 -0500 (EST) "Keith F. Lynch"
><kfl at KeithLynch.net> writes:
>... With 143 people who have attended
>> three or more of 192 meetings I have data for, a total of 5149 data
>> points, I finally have a fair amount of data to play with.
>
>You presumably mean that 5149 is the cumulative recorded attendance over
>that time period, (but just among those 143 people), much the same way
>that an airline might report carrying a million passengers in a year,
>even though many of those million were repeat customers during the year.
>It occurs to me that an instance of non-attendance is also a data point,
>so there would be 27,456 (143 x 192) data points in all, at least in the
>sense that a tabular representation of the data would have 27,456 cells
>in the table. Also, it seems that the average attendee who had been to
>at least three meetings during the period, went to only about 19% of the
>meetings (5149 / 27,456).
>
>Doubtless you will be able to construct some formula which has predictive
>value, one that has inputs such as past attendance with time weighting,
>weather and traffic conditions, meeting location, competing events, time
>of year, time of the month and proximity to major holidays, etc. But
>each attendee is different; for example one person might be more likely
>to come to a meeting when it is raining, and another may be less likely
>to show up when it is raining. So it seems like you would have to come up
>with 143 formulas. Perhaps a good way to tackle the problem would be to
>first develop a single formula to predict the number of attendees at a
>meeting.
>With the large amount of data overall, that analysis would be
>most sensitive to subtle factors, much more so than the data for any one
>individual, and so would be a good way to identify and quantify at least
>some of the relevant factors.

"Paging Hari Seldon . . . paging Hari Seldon . . . . you are needed . . ."

mjw
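
The arithmetic in the quoted message can be checked with a short sketch. It assumes only the three figures given in the thread (143 people, 192 meetings, 5149 recorded attendances); the variable names are illustrative and not taken from anyone's actual analysis.

```python
# Figures quoted in the thread.
people = 143        # members who attended three or more meetings
meetings = 192      # meetings with recorded attendance
attendances = 5149  # cumulative recorded attendance among those people

# Each person/meeting pair is one cell in an attendance table,
# whether the person showed up or not.
cells = people * meetings

# Fraction of cells that are "attended", i.e. the average attendance rate.
attendance_rate = attendances / cells

print(f"{cells} cells, average attendance rate {attendance_rate:.1%}")
```

This reproduces the 27,456 cells and the "about 19%" figure; it says nothing about which factors predict attendance.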