Keynote at the European Semantic Web Conference Part 1: The State of End User Information Management
I’ve just returned from the European Semantic Web Conference, where I gave a keynote talk on “The Semantic Web for End Users”. The slides are here. My talk addressed the problem that has interested me for eighteen years: making it easier for end users to manage their information. The thesis was that
- The current state of tools for end users to capture, communicate, and manage their information is terrible (this post), and
- The Semantic Web presents a key part of the answer to building better tools (tomorrow), but
- Not enough work is being directed toward this problem by the community (Monday).
Since I had a lot to say (217 slides) I’m breaking the summary into three separate posts aligned with these three bullets. Come back tomorrow for the next.
The Situation is Dire
I began my talk by trying to convince people of how bad things currently are. For this, I didn’t rely on my own work, but on presenting Voida, Harmon and Al-Ani’s fascinating CHI 2011 talk on Homebrew Databases. Thanks to Amy Voida for sharing her slides and her script! Choosing a specific domain, the authors spent a bunch of time in volunteer-driven nonprofit organizations of varying sizes. They extensively interviewed the volunteer coordinators (the people responsible for managing information about volunteers, skills, needs, and tasks) to learn how they did their jobs. The results were painful to hear. Because there was no application specifically designed to manage the information these coordinators used, they were forced into a baroque assemblage of Excel spreadsheets, Outlook lists, paper, index cards, and binders. With this mix of tools they had terrible versioning problems, wasted inordinate amounts of time on data entry and transfer, and struggled to organize, query, and visualize their information.
The tasks these volunteer coordinators wanted to support were not complicated—they weren’t doing Big Data Analytics. Rather, they were trying to answer elementary questions like “which volunteers are available for the following activity” or “what’s a summary of all the work this volunteer has done.” Questions that would be trivial for a good database administrator with a well-maintained SQL database. Unfortunately, few users fit that profile.
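To make "trivial" concrete, here is a minimal sketch of those two questions against a small SQL database, using Python's built-in sqlite3 module. The table names, columns, and sample rows are my own assumptions for illustration; they are not taken from the Homebrew Databases study.

```python
import sqlite3

# Hypothetical schema and data, invented purely for illustration.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE volunteer (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE availability (volunteer_id INTEGER, day TEXT);
    CREATE TABLE work_log (volunteer_id INTEGER, activity TEXT, hours REAL);
""")
conn.executemany("INSERT INTO volunteer VALUES (?, ?)",
                 [(1, "Ana"), (2, "Ben"), (3, "Caro")])
conn.executemany("INSERT INTO availability VALUES (?, ?)",
                 [(1, "Sat"), (2, "Sun"), (3, "Sat")])
conn.executemany("INSERT INTO work_log VALUES (?, ?, ?)",
                 [(1, "Food drive", 3.0), (1, "Tutoring", 2.5),
                  (3, "Food drive", 4.0)])


def available_on(day):
    """Which volunteers are available on the given day?"""
    rows = conn.execute(
        "SELECT v.name FROM volunteer v "
        "JOIN availability a ON a.volunteer_id = v.id "
        "WHERE a.day = ? ORDER BY v.name",
        (day,),
    )
    return [name for (name,) in rows]


def work_summary(name):
    """Total hours per activity for one volunteer."""
    rows = conn.execute(
        "SELECT w.activity, SUM(w.hours) FROM work_log w "
        "JOIN volunteer v ON v.id = w.volunteer_id "
        "WHERE v.name = ? GROUP BY w.activity ORDER BY w.activity",
        (name,),
    )
    return rows.fetchall()


print(available_on("Sat"))
print(work_summary("Ana"))
```

Each question is a single join. The hard part for the coordinators is everything around the query: designing the tables, getting the data in, and keeping it current.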
I consider it a major embarrassment for all of us in databases (and the Semantic Web) that this is the current state of the art. This paper ought to be required reading for anyone in these fields, helping us to realize that we’ve got our heads in the clouds while people are stuck in the dirt. For those who argue that these users should “know better” and learn the right database tools for managing their data, I defer to famed designer Don Norman, who observes in The Design of Everyday Things:
When you have trouble with things – whether it’s figuring out whether to push or pull a door or the arbitrary vagaries of the modern computer and electronics industries – it’s not your fault. Don’t blame yourself: blame the designer.
The designer, of course, is us.
What’s the Problem?
The Homebrew Database paper focuses on symptoms, but I have a strong opinion about the causes. When I first showed up at MIT, I intended to do research in information retrieval. But I rapidly concluded that the real problem wasn’t retrieval. Rather, it was that our computers were actively getting in the way of people recording and organizing their information. If they can’t record it, they certainly can’t retrieve it!
In particular, in our traditional model each application is developed with a fixed schema in mind. This schema determines both what information can be stored and how it will be presented and manipulated. Any user whose information is or ought to be in a different schema is out of luck: they can’t record it properly. (My physical therapist recently told me how frustrated she was struggling to enter all the data about her patients in the electronic medical record, until she discovered she could put it all in the comments!) Thus, users who have these nonstandard schemas are generally forced into the small set of tools that can handle arbitrary schemas, most frequently spreadsheets. The Homebrew Database work highlights how severe the consequences are and even observes that schemas frequently need to change on the fly as underlying information needs change.
Fixed-schema applications also pose a severe barrier for users who want to connect information from multiple applications, for example linking a person in your address book to the soundtracks in your media player that the person composed. Since these applications are unaware of each other’s schemas, they can’t do anything with (or even refer to) each other’s data. I discuss this issue further in a paper on data unification.
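As a rough illustration of the address book and media player example, here is a small Python sketch. Both "applications", their field names, and the sample records are hypothetical, made up for this post. The point is only that neither side can refer to the other's records, so any link has to be hand-written glue by someone who knows both schemas.

```python
# Hypothetical address book: one record per person, keyed by "full_name".
address_book = [
    {"full_name": "Ada Rivera", "email": "ada@example.org"},
    {"full_name": "Ben Okoro", "email": "ben@example.org"},
]

# Hypothetical media library: the composer is a free-text tag in
# "Last, First" form, with no reference to any address book entry.
media_library = [
    {"title": "Opening Theme", "composer": "Rivera, Ada"},
    {"title": "Interlude", "composer": "Okoro, Ben"},
    {"title": "Closing Credits", "composer": "Rivera, Ada"},
]


def normalize_composer(tag):
    """Turn the media player's 'Last, First' tag into 'First Last'."""
    last, _, first = tag.partition(", ")
    return f"{first} {last}" if first else tag


def tracks_composed_by(full_name):
    """Glue code: match an address book name to composer tags by string."""
    return [
        track["title"]
        for track in media_library
        if normalize_composer(track["composer"]) == full_name
    ]


for person in address_book:
    print(person["full_name"], tracks_composed_by(person["full_name"]))
```

The match works only because the glue encodes both naming conventions, and it breaks as soon as either application changes its format or two people share a name.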
A Semantic Web Fix?
Now that I’ve argued that there’s a severe problem to be solved, the next step is to propose an approach to solving it. Tomorrow, I’ll argue that ideas at the core of the Semantic Web offer a way forward, and justify my claim with a number of example Semantic Web applications targeting end users.