A Case for a Collaborative Query Management System

This is a CIDR presentation by Nodira Khoussainova of the University of Washington arguing for a collaborative repository of complex SQL database queries. It sounds like they want CoScripter for SQL.

The problem is hunting through all the stored queries to find the one you want. They want effective search and browsing, and also assistance in composing new queries. There are challenges:

  • queries are not just strings, but complex objects with inputs, outputs, and semantics. Two similar queries can have very different outputs, and two different queries can return the same results
  • typical search problem: need to avoid giving too many matches
  • efficient algorithms (this is a database conference after all)
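
The first challenge is easy to see on a toy example. The sketch below is my own illustration, not something from the talk: the `stars` table, its rows, and the three queries are hypothetical, and `difflib` string similarity stands in for whatever text comparison a real system might use. It shows two queries that differ by one character yet return different rows, and two queries that look quite different yet return the same rows.

```python
import sqlite3
from difflib import SequenceMatcher


def run(conn, sql):
    """Return the query's result rows as a sorted list, so row order is ignored."""
    return sorted(conn.execute(sql).fetchall())


def text_similarity(a, b):
    """Character-level similarity of two query strings, between 0.0 and 1.0."""
    return SequenceMatcher(None, a, b).ratio()


# Hypothetical scientific table, made up for this illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE stars (id INTEGER PRIMARY KEY, name TEXT, magnitude REAL);
    INSERT INTO stars VALUES (1, 'Sirius', -1.46), (2, 'Vega', 0.03), (3, 'Polaris', 1.98);
""")

q_a = "SELECT name FROM stars WHERE magnitude < 1.0"
# Differs from q_a by a single character, but selects the complementary rows.
q_b = "SELECT name FROM stars WHERE magnitude > 1.0"
# Looks very different from q_a, but selects the same rows on this data.
q_c = ("SELECT s.name FROM stars AS s "
       "WHERE s.id NOT IN (SELECT id FROM stars WHERE magnitude >= 1.0)")

for label, other in (("q_b", q_b), ("q_c", q_c)):
    same = run(conn, q_a) == run(conn, other)
    print(label, "text similarity:", round(text_similarity(q_a, other), 2),
          "same rows:", same)
```

A search that ranks purely on the query text would put `q_b` ahead of `q_c` as a match for `q_a`, which is the opposite of what the outputs suggest.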

An application they have in mind is scientific data management. There’s tons of data and lots of (shared) data analysis with complex queries that are frequently evolving.

Consider the scenario of a novice user trying to create a query, given a large repository of past queries by others. The user will try to find a perfect match but will probably need to take something close and then modify it. There must be a metaquery language for describing the kind of query you want. Since a matching query was probably built over time, the repository may hold many versions of it, and it can be useful to see the different versions and find the best one for the user’s purpose. It will also be useful to explain to the user how these versions are related, e.g. this one refines that one. One needs to watch out for the metaquery being more complicated to construct than the query it is meant to find. One approach is the “partial query”: the user builds as much of the query as they can, then looks for other queries that are similar.
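
To make the partial-query idea concrete, here is a minimal sketch of how such a lookup might work. It is my own guess at the shape of the thing, not the method from the talk: the repository contents are hypothetical, and the similarity measure (the fraction of the partial query's tokens that a stored query contains) is a deliberately crude stand-in for a measure that understands query structure and semantics.

```python
import re


def tokens(sql):
    """Set of lowercase keywords, identifiers, and numbers found in a SQL string."""
    return set(re.findall(r"[a-z_][a-z0-9_]*|\d+(?:\.\d+)?", sql.lower()))


def coverage(partial, candidate):
    """Fraction of the partial query's tokens that also occur in the candidate."""
    wanted = tokens(partial)
    if not wanted:
        return 0.0
    return len(wanted & tokens(candidate)) / len(wanted)


def search(partial, repository, limit=3):
    """Return up to `limit` (score, query) pairs, best match first."""
    scored = [(coverage(partial, query), query) for query in repository]
    scored = [pair for pair in scored if pair[0] > 0]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:limit]


# Hypothetical repository of past queries written by other users.
repository = [
    "SELECT name, magnitude FROM stars WHERE magnitude < 1.0 ORDER BY magnitude",
    "SELECT galaxy_id, redshift FROM galaxies WHERE redshift > 0.5",
    "SELECT s.name, o.taken_at FROM stars s JOIN observations o ON o.star_id = s.id",
]

# The novice knows the tables involved but not how to join them.
partial = "SELECT FROM stars JOIN observations"
results = search(partial, repository)

for score, query in results:
    print(f"{score:.1f}  {query}")
```

The `limit` parameter is a nod to the "too many matches" challenge above. Token overlap ignores everything the first challenge warns about, so a real system would presumably compare parsed query structure, and perhaps outputs, rather than bags of words.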
