Description of Series

The current practice of using Google (and other engines) to search the web has led to public interest in how to write “good search engines” that find relevant information and how to write “good web pages” that search engines find and rank highly. Web pages are not the only information that we might want to find with search engines; many internet sites now contain information in non-text formats such as databases, images, and sound. Most current web search engines will not find this kind of information, but many people would like to search it, and the topic is an important current research area. An alternative to placing the burden on the user or producer of the information is to do fundamental research on how information in web pages can best be “mined” so that it can be processed effectively by search engines.

This series will include lectures on the technical and cognitive issues of searching the web as outlined above, in what is now called the Semantic or Deep Web. It will involve the disciplines of computing, linguistics, and science: how humans glean meaning from text and data, and how computers assist in processing human language and assist humans in processing data. One major question we will ask is how computers can help communicate and interpret meaning and help people cope with information overload. Google does this, for example, when it returns “just” the information we seek when we search the web, as do web sites that provide useful aggregations of data or easy-to-understand queries that help reduce data.

Series speakers will explore the complexity of developing interfaces between humans and the web. They will be selected from among experts in linguistic theory and computer science, with the goal of exploring the interaction between the two areas, particularly how modern computers can be used to help humans process text and data.