
A Model for Searches

The figure below illustrates our view of the query process. This feedback system abstractly models aspects of current search mechanisms and shows where this work hopes to improve on them.

Figure: The query model

Let's look at this system qualitatively for a minute. We start with a data corpus (the data in the figure) that contains some information the user would like to find. The user's first task is to combine their information need (i.e., what they are searching for) with any hints they may have. For example, if the user is searching for some document (their information need), they may also remember that Bob wrote it last year and that it was about ``apples'' (the hints). The user combines the information need with the hints to form (possibly just in their mind) some informal query in Q1: ``I want the document that Bob wrote last year about apples.'' Hints are not necessarily directly related to the user's information need. For example, suppose the book on probability that a user is looking for happens to be red. ``The book is red'' is a hint the user can provide to help the system find the book. However, the fact that the book is red has nothing to do with the user's information need, which might be ``information on Markov chains.''

Ideally, we would like a user to be able to phrase the combined query of information need and hints informally to the system. Unfortunately, systems today are incapable of handling arbitrary natural language queries, so the user must generate a formal query in the query language of the information system. This corresponds to the function Q2. If our information system were a SQL database, the previous query might then become ``SELECT * FROM docbase WHERE author = 'Bob' AND date >= '1997-01-01' AND date < '1998-01-01' AND topic = 'apples'.''
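As a minimal sketch of Q2's output, the formal query above can be run against a small in-memory SQLite database. The docbase schema, the sample rows, and the helper names (build_docbase, run_formal_query) are assumptions made for illustration; only the shape of the SELECT statement comes from the text. Dates are stored as ISO 8601 strings so that string comparison orders them correctly.

```python
import sqlite3


def build_docbase():
    """Create a toy in-memory docbase table (hypothetical schema and rows)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE docbase (title TEXT, author TEXT, date TEXT, topic TEXT)"
    )
    conn.executemany(
        "INSERT INTO docbase VALUES (?, ?, ?, ?)",
        [
            ("Apple harvest notes", "Bob", "1997-06-15", "apples"),
            ("Orange juice report", "Bob", "1997-09-01", "oranges"),
            ("Old apple survey", "Bob", "1995-03-10", "apples"),
            ("Apple pests", "Alice", "1997-04-22", "apples"),
        ],
    )
    return conn


# The formal query: author, date range, and topic are the user's hints.
FORMAL_QUERY = """
SELECT title FROM docbase
WHERE author = ? AND date >= ? AND date < ? AND topic = ?
"""


def run_formal_query(conn, author, start, end, topic):
    """Return the titles of all rows that satisfy the formal query."""
    rows = conn.execute(FORMAL_QUERY, (author, start, end, topic)).fetchall()
    return [title for (title,) in rows]


conn = build_docbase()
print(run_formal_query(conn, "Bob", "1997-01-01", "1998-01-01", "apples"))
```

Note how much the user has to know to write this: the table name, the column names, and the date format are all details of the information system rather than of the information need.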

The user then passes this formal query to the information system, I. The information system in turn generates a set of responses that satisfy the user's formal query. If it is capable of doing so, the system may also rank the responses by their likelihood of satisfying the query; we will return to ranking later. This primary result set is then passed to a post-processing function, F, which performs an additional level of computation on the initial result set, as specified by the formal query. The general idea is that sometimes a user wants only the single top matching response, sometimes a summary of all the responses, and sometimes some other representation (sorted by date, for example). In effect, we generate a new document that reports on the content (or location) of other documents. Portions of the formal query that contain post-processing instructions are passed to F for evaluation.
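The split between I and F can be sketched in a few lines. Everything here is an assumption made for illustration: the Doc record, the toy scoring rule (the fraction of a document's topics that the query asks for), and the mode names accepted by post_process. The text does not specify how a real system ranks or post-processes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Doc:
    title: str
    author: str
    year: int
    topics: frozenset


def information_system(corpus, query_topics, author):
    """I: return (score, doc) pairs for matching documents, best first.

    The score is a toy stand-in for ranking: the fraction of the
    document's topics that appear in the query.
    """
    results = []
    for doc in corpus:
        if doc.author != author:
            continue
        overlap = len(doc.topics & query_topics)
        if overlap:
            results.append((overlap / len(doc.topics), doc))
    results.sort(key=lambda pair: pair[0], reverse=True)
    return results


def post_process(ranked, mode="all"):
    """F: turn the primary result set into the secondary result set."""
    if mode == "top":
        return [ranked[0][1].title] if ranked else []
    if mode == "by_date":
        return [doc.title for _, doc in sorted(ranked, key=lambda p: p[1].year)]
    if mode == "summary":
        return [f"{len(ranked)} matching documents"]
    return [doc.title for _, doc in ranked]


CORPUS = [
    Doc("A", "Bob", 1997, frozenset({"apples"})),
    Doc("B", "Bob", 1996, frozenset({"apples", "oranges"})),
    Doc("C", "Alice", 1997, frozenset({"apples"})),
]

ranked = information_system(CORPUS, frozenset({"apples"}), "Bob")
print(post_process(ranked, "top"))
print(post_process(ranked, "by_date"))
print(post_process(ranked, "summary"))
```

Keeping F separate from I means the same primary result set can be presented in several ways without running the query again.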

Finally, once we have this secondary result set, it is necessary to turn again to the user. The user must evaluate what the system returned against their informal query. If the information satisfies their need, we are done. Otherwise, the user must form what we will call Δ (delta, the mathematical symbol for change). This Δ corresponds to some additional restrictions or modifications the user will apply to their original informal query function, Q1. Let's go back to the initial query for an example. If Bob wrote many papers last year, with a majority being about ``apples'' and ``oranges,'' the system will return many matches. If the user is interested only in the documents that are exclusively about apples, they will have to revise their query to reflect this need. The user can now say that they are interested in ``all documents Bob wrote last year about `apples' but not `oranges.' '' With a new informal query in hand, the user repeats the previous walk through the system. What we would like to do is minimize the number of iterations so that a user gets what they want much more quickly. However, before discussing our particular solution, it is valuable to understand how current solutions fit into this model, where they do the right thing, and where they fail.
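The feedback loop as a whole can be sketched as follows, with the number of rounds as the quantity we would like to minimize. This is only an illustration under stated assumptions: the query is reduced to sets of required and excluded topics, and the user is simulated by two hypothetical callbacks, satisfied (the user's evaluation of the results) and delta (the revision applied to the query).

```python
def run_query(corpus, query):
    """Return titles whose topics include every required topic and no excluded one."""
    return [
        title
        for title, topics in corpus.items()
        if query["require"] <= topics and not (query["exclude"] & topics)
    ]


def search_loop(corpus, query, satisfied, delta, max_rounds=5):
    """Repeat query, evaluate, revise until the user is satisfied.

    Returns the final results and the number of rounds used.
    """
    results = []
    for rounds in range(1, max_rounds + 1):
        results = run_query(corpus, query)
        if satisfied(results):
            return results, rounds
        query = delta(query, results)  # the user's revision of the query
    return results, max_rounds


CORPUS = {
    "Apple harvest": {"apples"},
    "Fruit comparison": {"apples", "oranges"},
    "Orange groves": {"oranges"},
}


def satisfied(results):
    # The simulated user wants documents that are exclusively about apples.
    return bool(results) and all(CORPUS[t] == {"apples"} for t in results)


def delta(query, results):
    # The simulated revision: "... about apples but not oranges".
    return {"require": query["require"], "exclude": query["exclude"] | {"oranges"}}


first_query = {"require": {"apples"}, "exclude": set()}
print(search_loop(CORPUS, first_query, satisfied, delta))
```

In this toy run the first query also returns the mixed apples-and-oranges document, so the loop needs a second round after the revision is applied.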



Copyright 1998, Eytan Adar (eytan@alum.mit.edu)