
Hybrid Emulation

A number of previous efforts have endeavoured to create a hybrid-search solution by emulating one type of system on top of another. In [38], and later in [17], a model is described that extends standard relational algebra to allow Non-First-Normal-Form (NF2) relations. Such a system allows nested information in a relational database and provides the power needed to emulate some IR behavior. This approach is, however, limited by the expressiveness of relational databases and requires new implementations and new query languages to function. Some databases allow extension by the user, and in [30] a database is extended using an abstract datatype (ADT) to allow textual indexing. However, this system only allowed boolean queries to be performed against the data.

In [21] a method is described for using standard SQL to emulate a ranking IR system. This is achieved by creating additional database tables holding TFIDF data. The argument is that such a solution is more efficient and cost-effective than either creating a new type of database or utilizing both an IR and a database system separately. Such a system, unfortunately, suffers from the database's limitations in handling text-specific heuristics (stemming, thesaurus) as well as IR system features such as relevance feedback and vector-based indexing/retrieval.
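
To make the idea of ranking through plain SQL concrete, the following is a minimal sketch and not the method of [21]. It assumes SQLite as the database, whitespace tokenization, and a simple tf * ln(N/df) weight; the table names (postings, idf), the functions build_index and search, and the helper py_ln are all hypothetical names chosen for illustration.

```python
import math
import sqlite3


def build_index(docs):
    """Load {doc_id: text} into two hypothetical tables: postings and idf."""
    conn = sqlite3.connect(":memory:")
    # Natural log is registered from Python (assumption: we do not rely on
    # SQLite's optional built-in math functions being compiled in).
    conn.create_function("py_ln", 1, math.log)
    conn.execute("CREATE TABLE postings (doc_id INTEGER, term TEXT, tf INTEGER)")
    conn.execute("CREATE TABLE idf (term TEXT PRIMARY KEY, weight REAL)")
    for doc_id, text in docs.items():
        counts = {}
        for term in text.lower().split():
            counts[term] = counts.get(term, 0) + 1
        conn.executemany(
            "INSERT INTO postings VALUES (?, ?, ?)",
            [(doc_id, term, tf) for term, tf in counts.items()],
        )
    # idf weight = ln(N / document frequency), computed inside the database.
    conn.execute(
        "INSERT INTO idf "
        "SELECT term, py_ln(1.0 * ? / COUNT(DISTINCT doc_id)) "
        "FROM postings GROUP BY term",
        (len(docs),),
    )
    conn.commit()
    return conn


def search(conn, query):
    """Return (doc_id, score) rows ranked by summed tf * idf of query terms."""
    terms = query.lower().split()
    if not terms:
        return []
    placeholders = ",".join("?" * len(terms))
    return conn.execute(
        "SELECT p.doc_id, SUM(p.tf * i.weight) AS score "
        "FROM postings p JOIN idf i ON i.term = p.term "
        f"WHERE p.term IN ({placeholders}) "
        "GROUP BY p.doc_id ORDER BY score DESC, p.doc_id",
        terms,
    ).fetchall()


docs = {1: "the cat sat", 2: "the dog sat down", 3: "the cat chased the dog"}
conn = build_index(docs)
ranked = search(conn, "chased cat")
```

Note what the sketch leaves out: stemming, thesaurus expansion, and relevance feedback have no natural place in the query, which is the limitation described above.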

A similar, albeit Oracle-specific, experiment is described in [13]. Most databases endeavour to satisfy a set of so-called ACID (atomicity, consistency, isolation, durability) properties. By utilizing an ACID-satisfying database such as Oracle, consistency is ensured between the text in the database and other structured information. Although in this specific case the database emulates an IR system (rather than using IR heuristics and algorithms), such an approach is generally referred to as a tight coupling. Tight coupling implies that consistency is maintained between the IR and database systems.
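
As an illustration of what tight coupling buys, the sketch below stores both the text index and the structured record in one transactional database, so an update either reaches both or neither. It is not the system of [13]: it assumes SQLite instead of Oracle, and the tables (docs, postings) and functions (make_db, add_document) are hypothetical.

```python
import sqlite3


def make_db():
    """Create a hypothetical schema: structured records plus a term index."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE docs (doc_id INTEGER PRIMARY KEY, author TEXT NOT NULL)"
    )
    conn.execute("CREATE TABLE postings (doc_id INTEGER, term TEXT NOT NULL)")
    conn.commit()
    return conn


def add_document(conn, doc_id, author, text):
    """Insert index entries and the structured record atomically."""
    try:
        # The connection context manager commits on success and rolls back
        # every statement in the block if an exception is raised.
        with conn:
            conn.executemany(
                "INSERT INTO postings VALUES (?, ?)",
                [(doc_id, term) for term in set(text.lower().split())],
            )
            conn.execute("INSERT INTO docs VALUES (?, ?)", (doc_id, author))
        return True
    except sqlite3.IntegrityError:
        return False


conn = make_db()
ok = add_document(conn, 1, "adar", "hybrid search")
# A missing author violates NOT NULL, so the index entries written just
# before it are rolled back as well.
failed = add_document(conn, 2, None, "orphan text")
orphans = conn.execute(
    "SELECT COUNT(*) FROM postings WHERE doc_id = 2"
).fetchone()[0]
```

The index and the structured data can never disagree here, at the price of routing every text update through the database's transaction machinery.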

Although tight coupling is achievable even when the IR and database systems are separate, there is a cost in efficiency. The hybrid-search model we propose is loosely coupled: we do not enforce atomic transactions across both the database and IR systems. Rather, a lazy updating scheme ensures that the information represented by the two systems converges over time for any given document.
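
One way to picture lazy updating is the sketch below. It is an assumption-laden illustration, not the scheme used in this work: a dictionary stands in for the database, an in-memory inverted index stands in for the IR system, and the class LooselyCoupledStore with its put, sync, and search methods is hypothetical. Writes reach the database immediately, while the index catches up only when sync runs.

```python
from collections import deque


class LooselyCoupledStore:
    """Database updated eagerly; IR index updated lazily from a queue."""

    def __init__(self):
        self.records = {}       # stand-in for the database: doc_id -> text
        self.index = {}         # stand-in for the IR system: term -> doc_ids
        self.indexed_text = {}  # text the index currently reflects per doc
        self.pending = deque()  # doc_ids whose index entries are stale

    def put(self, doc_id, text):
        # Only the database is written here; no cross-system transaction.
        self.records[doc_id] = text
        self.pending.append(doc_id)

    def sync(self, max_items=None):
        """Bring up to max_items stale documents up to date in the index."""
        done = 0
        while self.pending and (max_items is None or done < max_items):
            doc_id = self.pending.popleft()
            # Remove the terms of the previously indexed version, if any.
            old = self.indexed_text.get(doc_id, "")
            for term in set(old.lower().split()):
                self.index[term].discard(doc_id)
            # Index the version currently held by the database.
            new = self.records[doc_id]
            for term in set(new.lower().split()):
                self.index.setdefault(term, set()).add(doc_id)
            self.indexed_text[doc_id] = new
            done += 1
        return done

    def search(self, term):
        return sorted(self.index.get(term.lower(), set()))


store = LooselyCoupledStore()
store.put(1, "hybrid search")
before_sync = store.search("hybrid")   # index has not caught up yet
store.sync()
after_sync = store.search("hybrid")    # the two systems now agree
```

Between put and sync a query against the index can miss or mis-rank a document; that window of inconsistency is what is traded for avoiding atomic transactions across both systems.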



Copyright 1998, Eytan Adar (eytan@alum.mit.edu)