Tales of a Semantic Web Skeptic

Right now the world’s premier semantic web conference is happening in Washington, D.C. As a graduate student of the fellow who’s chairing the conference this year, and working down the hall from Sir Linked Data himself, I’ve had my fair share of semantic web experiences. But my background is not in semantic web technology, so joining a group so focused on it threw some of its core tenets into sharp relief for me.

So here, from the perspective of a human-computer interaction guy, is what I’d like to see changed about the semantic web:

Stop Calling Everything ‘Semantic’

At worst, the term ‘semantic’ in a title can mean “we re-did existing research using {RDF, RDFa, N3, OWL, SPARQL, DBpedia, Semantic MediaWiki},” without a clear notion of why this would be a good thing. The strength of the semantic web is its ability to make heterogeneous data interoperate, yet the inclination is to ignore this and work on the problem, any problem, in a semantic web framework. Semantic web research papers can feel like a bunch of hammers running around in search of a nail. And there are plenty of oft-hammered nails: semantic query visualizations, semantic desktops, semantic wikis, semantic ontology alignment, and semantic web service composition, to name a few. Why do these benefit from being semantic, any more so than from taking another approach?

At best, the term is still very unclear about what it implies. ‘Semantic’ should mean more than a language or a framework. It is an idea, and the idea should drive the research. Saying that something is semantic should imply something as clearly as saying that it is a proof by reduction, or a tangible user interface, or a static code analysis technique.

Who’s the User? (And Why Would They Ever Use This?)

That’s not a jab. It combines two specific critiques:

First, to solve important problems, you need to know who your users are. What are their problems? What biases and constraints do they bring to your system? This is true whether you’re composing web services or creating a Linux desktop. Semantic web technology’s greatest strength and greatest weakness is that it is very general. Too many projects try to help everybody, but “everybody” rarely gives you a good foothold, and in practice it trends toward “semantic web-interested people”. This leads to many issues with the research, not the least of which are the Big Fat Graph solution to every semantic web problem and the requirement that I manually author RDF triples. When you’re defining the problem, define the set of users! Start with some personas or scenarios, or build systems that aim at some subset of the world. This will give you the insights necessary to generalize back out to “everybody”.

Second, there are some serious questions about user motivation. The semantic web suffers from a real cold start problem: how to get all that data into linked format. Again, no single motivator will work for everybody, so the resulting motivators are so general, or so tied to implicit semantic web assumptions, that few get off the ground. Nobody wants to sit and re-encode their data into semantic web format. But given a real problem, and the promise of a solution that just so happens to involve RDF, people will do it.

This is why I think “Semantic Web UIs” is something of a misnomer. It’s like “Java Swing UIs” or “UIs based on a relational database backend and a PHP frontend”. The critical irony of a good semantic web UI is that there should be no indication that it’s semantic. You could build the same thing using a standard database and data model, but it was easier to build because it uses semantic web technologies. Again, the interface should flow from the problems, not from the data model (flexible as it is).

I’d love to hear a semantic web researcher’s critique of human-computer interaction. Or your thoughts on my thoughts…

10 Responses to “Tales of a Semantic Web Skeptic”

  • Yes. Use cases are key. Requirements for knowledge representation are contingent on the use case.

    The term “Semantic Web” encompasses so many very disparate things that I feel it’s hard to have a discussion beyond this.

  • Max L. Wilson says:

    The Semantic Web User Interaction workshop is currently diverging over supporting SemWeb exploration (people who are working with RDF and understand the tech) and services for ‘real people’ who don’t want to know about RDF at all. So many people are focussed on the former, and it’s causing a little bit of mixed discussion. Perhaps next year we need ‘SWUI for the SW’ and ‘SWUI for your mom’ workshops.

  • Ted says:

    My initial take on this is that the Sem Web crowd, like all communities, has a wide range of members focused on a variety of things. Some of the research is great and some of the research misses the point. The same can be said for HCI, or any subject for that matter.

    Rather than a critique of the semantic web community, I think what you’ve actually written up is a nice start to a set of guidelines about how to identify and present good semantic web research topics.

    Random aside about the UI topic: I totally agree about the necessity and inevitably better quality of use-case-specific UIs, but I think one of the interesting features of self-describing data is that it does allow for super-generalized, “on demand” interfaces. Even though a lot of these don’t look very good, I think there is a lot of interesting work that can be done to make them look better. The Google search bar is a great example of this: if Google actually ran a search over a structured graph, how would it determine the right slider bars and UI elements to accompany different types of search? Ideally, this would be generated on demand from the region of the graph you were accessing.

  • Steve Ardire says:

    Michael, you make some valid points, esp. about semweb Kool-Aid drinking ;)

    In short, to become real and tangible the semweb must be interleaved with the semantic enterprise, which is a much tougher challenge ;)

  • notone says:

    The idea of a semantic web is fundamentally misguided. It began with the idea that if content were structured correctly, machines would be able to organize spidered content more meaningfully. However, this is a fallacy based on proofs that all taxonomies are inherently limited. The meaning of something is relative to the observer. The word semantic was then picked up by NMDs (new media douchebags) and used as a weapon to advance an agenda regarding particular formats and validations… (e.g., divs vs. tables, etc.) The advancers want to put the cost of classifying the information on the producer/publisher. Instead there should be the lowest possible cost on the producer/publisher to encourage publication, and because the value is relative to the viewer, the consumers of the information can bear the cost of organizing and categorizing it as best suits their use… Meaning can’t be assessed in a vacuum. Enough already with the imprecise terminology to bamboozle folks at high-priced conferences…

  • J Hendler says:

    Good article. HCI is one missing piece among many.

    My response:
    http://whatisprogress.com/en/content/defense-semantic-web-again

  • Michael Bernstein says:

    @Ted: there is some interesting debate within the HCI community about autogenerated interfaces too. Krzysztof Gajos at Harvard has done some of the most interesting work lately in this space, both adapting an interface to different devices and to people with different motor capabilities.

    I would still argue that there’s a difference between “designed for everything” and “adaptable to many uses”. Excel is adaptable to many uses, but at its core it is still fundamentally about manipulating 2D tabular data. People use emacs for all sorts of things, but it is still designed for efficient text editing first.

  • Mike Brzozowski says:

    Amen! The people genuinely interested in doing work on improving Web usability are getting drowned out by RDF fetishists and people who take for granted the notion that all information is somehow organizable under some set of universal ontologies. To be fair, the same happens in other fields like AI and computer systems, as people get caught up trying to improve their performance on synthetic benchmarks that don’t reflect real-world situations.

    The challenge incumbent upon the semweb community is to demonstrate how applications of their techniques improve user experience today, not in some utopian, RDF-defined future fairyland.

  • [...] can result in, a critical look at such dangers is important. In the post “Tales of a Semantic Web Skeptic” from his Haystack Blog, Michael Bernstein offers such constructive criticism. This [...]

  • [...] RDF is in Drupal core and the Semantic Web conference in DC, I wanted to take time to respond to “tales of a semantic web skeptic”. Healthy criticism, and a good [...]