Learn Tracker: Life Long Learning with MOOCs

COURSE

Web Intelligence and Big Data

Gautam Shroff, Indian Institute of Technology Delhi

https://www.coursera.org/course/bigdata

COURSE LECTURE

Search and Basic Indexing

Notes taken on June 16, 2014 by Edward Tanguay

•

Turing Test

•

based on a 1950s party game where a person receives type-written answers to questions and tries to determine whether they were written by a man or a woman

•

human tries to determine if text is written by human or computer

•

a CAPTCHA is an example of a Turing Test in reverse

•

examples of successful artificial intelligence

•

instant translations of hundreds of texts

•

object recognition, e.g. by Google Goggles

•

machines recognizing faces e.g. on Facebook

•

big data

•

billions of Facebook pages

•

hundreds of millions of tweets a day

•

millions of servers, petabytes of data

•

old-style business intelligence

•

databases, clean the data, data warehouse, more databases, statistics

•

new-style business intelligence (Google, Facebook, etc.)

•

massive parallelism

•

Map-Reduce paradigm

•

this is the heart of big data technology
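
As a rough illustration of the Map-Reduce paradigm mentioned above, here is a minimal single-machine sketch in Python that counts words. It is not from the lecture: the function names (map_phase, shuffle, reduce_phase, word_count) and the sample documents are made up for illustration, and a real system would spread the map and reduce calls across many machines.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(documents):
    # Each map_phase call is independent of the others, which is what
    # would let a cluster run them in parallel (here they run in sequence).
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical sample data, used only for this sketch.
docs = ["large bird", "small bird", "large large tree"]
counts = word_count(docs)
print(counts)
```

The point of the split is that the map step needs only one document at a time and the reduce step needs only the values for one key, so both can be distributed.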

•

relationship between data and intelligence

•

the difference between data and intelligence is that with intelligence, you can predict

•

applications that use big data to achieve predictive intelligence

•

online advertising predicting our intent and interest

•

gauging consumer sentiment and predicting behavior

•

detecting adverse events and predicting their impact

•

fires

•

floods

•

earthquakes

•

recognizing places and faces

•

personalized genomic medicine

•

medicines actually have different effects on each person

•

intelligent public services for energy, water

•

intelligent sensors

•

deep analytics

•

securing ourselves from criminals

•

big data analytics

•

fusing social intelligence with business intelligence

•

data-driven business models and processes

•

how to predict the future with artificial intelligence and big data

•

looking

•

listening

•

learning

•

connecting

•

predicting

•

correcting

•

looking

•

the purpose of looking is to find stuff

•

on the web

•

on one's computer

•

in one's memories

•

finding stuff is essential to intelligent behavior

•

on the web, finding stuff is about finding documents

•

we type in words and expect to find documents

•

if we type in "large bird" we want documents which contain "large" and "bird", sorted so that those which contain both words appear on top

•

we do this via indexing

•

in a binary tree, looking up a document takes O(log m) time

•

using a hash might be faster

•

in multiple-term queries, you would also need to sort the results

•

the time it takes is O(r q) if r = number of intermediate results in all
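
The indexing ideas in these notes can be sketched in Python as below. This is an illustrative sketch under stated assumptions, not the lecture's own code: the names build_index, lookup_sorted and search, and the three sample documents, are hypothetical. It also assumes that m in O(log m) is the number of distinct indexed words, which the notes do not state.

```python
import bisect
from collections import Counter, defaultdict


def build_index(documents):
    """Build an inverted index: word -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in text.lower().split():
            index[word].add(doc_id)
    return index


def lookup_sorted(sorted_words, postings, word):
    """Binary search over a sorted word list.

    Takes O(log m) comparisons for m words (m = distinct words is an
    assumption of this sketch).
    """
    i = bisect.bisect_left(sorted_words, word)
    if i < len(sorted_words) and sorted_words[i] == word:
        return postings[i]
    return set()


def search(index, query):
    """Rank documents by how many distinct query words they contain."""
    hits = Counter()
    for word in set(query.lower().split()):
        # Dictionary (hash) lookup: constant time on average.
        for doc_id in index.get(word, ()):
            hits[doc_id] += 1
    # Most matching words first; ties are broken by document id.
    return sorted(hits, key=lambda doc_id: (-hits[doc_id], doc_id))


# Hypothetical sample documents, used only for this sketch.
documents = {
    1: "the ostrich is a large bird",
    2: "a sparrow is a small bird",
    3: "a large truck",
}
index = build_index(documents)
results = search(index, "large bird")

# The same index laid out for binary search instead of hashing.
sorted_words = sorted(index)
postings = [index[word] for word in sorted_words]
bird_docs = lookup_sorted(sorted_words, postings, "bird")
print(results, bird_docs)
```

Document 1 contains both query words, so it ranks first, followed by the documents that contain only one of them. The extra ranking step is where the multi-term cost in the notes comes from, since every intermediate result has to be counted and ordered.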