Hybrid-Search and Storage of Semi-structured Information

Eytan Adar

Department of Electrical Engineering and Computer Science
Master of Engineering in Electrical Engineering and Computer Science
May 1998

David R. Karger, Associate Professor
Lynn Andrea Stein, Associate Professor
Arthur C. Smith, Chairman, Department Committee on Graduate Students

Given today's tangle of digital information, one of the hardest tasks facing users of information systems is finding anything in the mess. For a number of well-documented reasons, including the rapid growth in the Internet's popularity and the drop in the cost of storage, the amount of information on the net, as well as on a user's local computer, has increased dramatically in recent years. Although this readily available information should be extremely beneficial to computer users, paradoxically it is now much harder to find anything.

Many solutions have been proposed for the general information-seeking task, but few if any have addressed the needs of individuals or taken advantage of single-user interaction. The Haystack project is an attempt to answer the needs of the individual user. Creating such a system requires solving two problems. The first is manipulating the user's data into a queryable format. The second, once the user's information is represented in Haystack, is answering the highly varied questions a user may ask about that information. In this thesis we propose a means of representing information in a robust model within Haystack, and we describe a corresponding mechanism by which the diverse questions of the individual can be answered. This novel method works by combining existing information systems. We call the combined system a hybrid-search system.

Copyright 1998, Eytan Adar (