Date: Sun, 19 Dec 2010 13:38:25 -0500
From: "Michael Walsh" <mjw at press.jhu.edu>
To: "WSFA members" <WSFAlist at KeithLynch.net>
Subject: [WSFA] gah! let's try that again - Re: google word list
Reply-To: WSFA members <WSFAlist at KeithLynch.net>

> Tamar Lindsay <dicconf at yahoo.com> 12/19/2010 12:45 PM >>>
>
>whitroth at 5-cent.us <whitroth at 5-cent.us> wrote:
>> I wish google as a whole would at least
>> offer single quotes, so that it does *not* try
>> to interpret what I want, since they're *wrong*.
>
>I wish they'd also proofread their scanning.
>Some online "books" are almost entirely illegible,
>due to typefaces the machine doesn't recognize.
>Every time the OCR scanner damages a word (as when
>"and" becomes "ancl"), that instance is lost from
>the database and the statistical analysis fails.

Rather than typos, we have scanos?

Anyway ... blame technology. Here's how some of it is done:
<http://www.npr.org/blogs/library/2009/04/the_granting_of_patent_7508978.html>

And according to: http://en.wikipedia.org/wiki/Google_Books
"Many of the books are scanned using the Elphel 323 camera at a rate of
1,000 pages per hour".

Too many pages to proofread would be their argument, I would imagine.

Eventually ... the Encyclopedia Galactica!

mjw