PhD: Part 1

Fig. 1: An early version of SIEUFERD, the Schema-Independent End-User Front-End for Relational Databases.
It’s my sixth year in graduate school; my committee has been formed, my PhD thesis proposal has been submitted, and I am coding along on SIEUFERD, the research system from which I hope to squeeze my remaining research contributions. The project and [...]

Keynote at ESWC Part 2: How the Semantic Web Can Help End Users

I’ve just returned from the European Semantic Web Conference, where I gave a keynote talk on “The Semantic Web for End Users”.   My talk addressed the problem that has interested me for eighteen years: making it easier for end users to manage their information.  The thesis was that

The current state of tools for end [...]

The Semantic Web needs a MySQL

One thing was clear in the comments of many industry-facing participants of ISWC 2010: a big impediment to adoption of semantic web technologies is the lack of an off-the-shelf triplestore that “just works.”
There are many other problems, of course: RDF is an awkward format when it comes to real-world programming because the graph model [...]

Why All Your Data Should Live in One Application

A couple of days ago Adam Pash at Lifehacker posted a criticism of “everything buckets”—applications aimed at gathering every kind of information you work with into a single place.   I can’t resist responding as the article touches on some of the issues that have framed my past 15 years of research into information management.  It [...]

Rich Visualizations in your Wordpress Blog

Datapress is a Wordpress plugin that makes it easy to enhance your Wordpress blog posts with rich interactive visualizations such as maps, timelines, various charts, and sortable lists, all with interactive filtering and faceted browsing.   Datapress uses the Exhibit framework to offer a collection of rich structured data visualization elements that can be dropped [...]

On a Few Deadly Data Sins and the Entropy of Open Data

I just ran into a lovely and frustrating open-government-style map of stimulus funding put together in Colorado. The same tool is used in a number of other states, listed in Brady Forrest’s blog post at O’Reilly Radar. Lovely because it’s always nice to look at maps; frustrating because that’s all I can do. Where’s the [...]

Notes from NoSQL Live Boston

I was excited to sit in on NoSQL Live Boston yesterday. Thanks to 10gen for hosting and all of the speakers for putting the time in!
The NoSQL community is an interesting one. I was pleased to see Dwight Merriman suggest that the community look past its awkward and misleading name when figuring out how [...]

Building a Social Data Commons

Inspired by Ted’s vision of what he’d like to see happen to data.gov, I decided to try setting out my own hopes for it. Ted’s desires for data.gov are all ones that I agree would make the data more accessible. I would now like to discuss what else I might want in a world where [...]

Spreadsheets vs. Relational Databases: Bridging the Gap

For non-programmers, spreadsheets are usually the option of choice when it comes to keeping track of non-trivial amounts of structured data. This is seen in all kinds of settings ranging from the business world to public administration and academic research. Spreadsheets, however, can only capture one kind of data structure: separate tabular [...]

In Defense of a Semantic Web Wild West

A month ago Stefano Mazzocchi published an interesting article on data reconciliation (detecting when two identifiers refer to the same item, and merging them) where he advocated a more centralized “a priori” approach (trying to keep the identifiers merged at the beginning).  I posted a response arguing the value of a more anarchic “a posteriori” [...]