Monday, September 27, 2010

Google's Director of Research on understanding data

There are a lot of sound-bite conspiracy theories about what Google is doing with the data it collects on you.

Here's another perspective: a talk given at UBC last week by Google's Director of Research, on what Google is trying to do with all that data.
http://www.youtube.com/watch?v=9vR8Vddf7-s

"In decades past, models of human language were wrought from the sweat and pencils of linguists. In the modern day, it is more common to think of language modeling as an exercise in probabilistic inference from data: we observe how words and combinations of words are used, and from that build computer models of what the phrases mean. This approach is hopeless with a small amount of data, but somewhere in the range of millions or billions of examples, we pass a threshold, and the hopeless suddenly becomes effective, and computer models sometimes meet or exceed human performance. This talk gives examples of the data available in large repositories of text, images, and videos, and shows some tasks that can be accomplished with the resulting models."
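
To make the "observe how words and combinations of words are used" idea concrete, here is a toy sketch of my own, not anything shown in the talk. It counts adjacent word pairs (bigrams) in a three-sentence corpus and turns the counts into probabilities. The function names and the corpus are made up for illustration, and a real system would use vastly more data and smoothing for unseen pairs.

```python
from collections import Counter, defaultdict

def train_bigrams(sentences):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def next_word_probability(counts, prev, nxt):
    """Estimate P(nxt | prev) as a relative frequency; 0.0 if prev was never seen."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    if total == 0:
        return 0.0
    return followers[nxt] / total

# Hypothetical toy corpus, far too small to be useful in practice.
corpus = [
    "the cat sat on the mat",
    "the cat ate the fish",
    "the dog sat on the rug",
]
model = train_bigrams(corpus)

print(next_word_probability(model, "the", "cat"))  # 2 of the 6 words following "the"
print(next_word_probability(model, "sat", "on"))   # "sat" is always followed by "on"
```

With three sentences the estimates are nearly meaningless, which is the abstract's point: the same counting only becomes effective somewhere in the millions or billions of examples.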
