Date: Sun, 19 Dec 2010 13:38:25 -0500
From: "Michael Walsh" <mjw at press.jhu.edu>
To: "WSFA members" <WSFAlist at KeithLynch.net>
Subject: [WSFA] gah! let's try that again - Re: google word list
Reply-To: WSFA members <WSFAlist at KeithLynch.net>

> Tamar Lindsay <dicconf at yahoo.com> 12/19/2010 12:45 PM >>>
>
>whitroth at 5-cent.us <whitroth at 5-cent.us> wrote:
>> I wish google as a whole would at least
>> offer single quotes, so that it does *not* try
>> to interpret what I want, since they're *wrong*.
>
>I wish they'd also proofread their scanning.
>Some online "books" are almost entirely illegible,
>due to typefaces the machine doesn't recognize.
>Every time the OCR scanner damages a word (as when
>"and" becomes "ancl"), that instance is lost from
>the database and the statistical analysis fails.

Rather than typos, we have scanos?

Anyway ... blame technology.  Here's how some of it is done:
<http://www.npr.org/blogs/library/2009/04/the_granting_of_patent_7508978.html>

And according to <http://en.wikipedia.org/wiki/Google_Books>: "Many of the
books are scanned using the Elphel 323 camera at a rate of 1,000 pages per
hour".

Too many pages to proofread would be their argument, I would imagine.

Eventually ... the Encyclopedia Galactica!

mjw