Notes from CHI: “Ambient Help” for AutoCAD, Photoshop etc.

Among the other talks I attended at CHI 2011 was Justin Matejka et al.’s “Ambient Help” (paper and video here). The Ambient Help system is designed for complex desktop applications like AutoCAD or Photoshop, which tend to come with a steep learning curve and to demand continued learning even from experienced users who use the product frequently. Much of the system’s novelty comes from its unobtrusive nature: it runs on a separate monitor from the application in use and uses a context-sensitive search heuristic to display articles and tutorial videos that might be relevant to whatever the user is currently doing. If one of the suggested resources looks useful, it is easy to open or play it; conversely, it is trivial to ignore the help system entirely by simply looking away from the second monitor. Contrast this with MS Office’s “Clippy”…
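To make the idea concrete, here is a minimal sketch of what a context-sensitive suggestion heuristic of this flavor might look like: candidate help resources are scored by how well their keywords match the user’s recent command stream, with newer commands weighted more heavily. The resource structure, field names, and scoring are my own illustrative assumptions, not the authors’ implementation.

```python
from collections import Counter
from dataclasses import dataclass

@dataclass
class HelpResource:
    title: str
    url: str
    keywords: set[str]  # e.g. the command names a tutorial covers

def rank_resources(recent_commands, resources, decay=0.9, top_k=5):
    """Score each help resource by overlap between its keywords and the
    user's recent commands, weighting newer commands more heavily.
    Illustrative heuristic only -- not Ambient Help's actual algorithm."""
    weights = Counter()
    w = 1.0
    for cmd in reversed(recent_commands):  # most recent command first
        weights[cmd] += w
        w *= decay
    scored = [(sum(weights[k] for k in r.keywords), r) for r in resources]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [r for score, r in scored[:top_k] if score > 0]

# Hypothetical usage: the user has just been working with fillets and chamfers.
resources = [
    HelpResource("Fillet basics", "https://example.com/fillet", {"FILLET"}),
    HelpResource("Advanced chamfers", "https://example.com/chamfer", {"CHAMFER", "FILLET"}),
    HelpResource("Plotting layouts", "https://example.com/plot", {"PLOT"}),
]
print(rank_resources(["LINE", "FILLET", "CHAMFER"], resources))
```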

The project includes a user study on AutoCAD, which shows the search heuristic to be quite successful at finding help resources that users end up pulling up for more details. In the creative, non-time-constrained task, users visited 2.6 times as many interesting resources with the real-time Ambient Help system enabled as with a corresponding manual setup (YouTube search, standard online documentation), without spending a larger portion of their time interacting with the help system.

Database papers at CHI

There is little I like more than a fine cheese and fresh-baked bread. Still, to fill the rest of my day without expanding my waistline, I go for a mix of databases and human-computer interaction. That’s why I was excited to see several database-oriented papers presented at CHI. While many papers contained some amount of data, I’ll stick to the three that are unquestionably of interest to the databases community.

The first paper was for the social scientist in all of us. Amy Voida, Ellie Harmon, and Ban Al-Ani presented Homebrew Databases: Complexities of Everyday Information Management in Nonprofit Organizations. Nonprofits are arguably some of the most difficult database users to design for. They have minimal resources, rarely employ fulltime technical staff, and solve non-core problems as they show up. This practice leads to homebrew, just-functional-enough solutions to many data management problems. The authors provide an interesting qualitative study of how nonprofits manage volunteer demographic and contact information. They provide descriptions of the homebrewed, often fractured collections of data stored in several locations. Reading this paper, I couldn’t help but think of how perfectly these homebrewed databases resembled Franklin, Halevy, and Maier’s dataspaces.

Sean Kandel presented Wrangler, a project he’s been working on with Andreas Paepcke, Joe Hellerstein, and Jeff Heer. Wrangler lets users specify transformations on datasets by example. Each time a user shows Wrangler how to modify a record (or line of unstructured text), Wrangler updates its rank-ordered list of potential transformations that could have led to this modification. Wrangler borrows concepts such as interactive transformation languages from Vijayshankar Raman and Joe Hellerstein’s Potter’s Wheel. Its interface has a taste of David Huynh and Stefano Mazzocchi’s Refine as well as Huynh’s Potluck. Wrangler’s novelty comes in combining the interfaces and transformation languages with an inference and ranking engine. Since Wrangler is hosted, it is also capable of learning which transformations users prefer and improving its rankings over time!
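To give a flavor of the inference step, here is a minimal sketch of ranking candidate transformations against a single demonstrated edit: enumerate a small transformation space, keep the transforms consistent with the user’s before/after example, and rank them by how often users have preferred them before. The toy transformation space and the ranking are my own illustrative assumptions; Wrangler’s actual language and inference engine are far richer (see the paper).

```python
import re

# A candidate transformation is a (description, string -> string) pair.
def split_on(delim, keep):
    return (f"split on {delim!r}, keep field {keep}",
            lambda s: s.split(delim)[keep] if len(s.split(delim)) > keep else s)

def strip_chars(chars):
    return (f"strip {chars!r}", lambda s: s.strip(chars))

def extract_digits():
    return ("extract digits", lambda s: "".join(re.findall(r"\d+", s)))

def candidate_transforms():
    """Enumerate a tiny space of transformations (a toy stand-in for
    Wrangler's transformation language)."""
    cands = [extract_digits()]
    for delim in [",", ";", " ", "|"]:
        for keep in range(3):
            cands.append(split_on(delim, keep))
    for chars in [" ", "$", "\""]:
        cands.append(strip_chars(chars))
    return cands

def rank_by_example(before, after, history=None):
    """Keep transforms consistent with the demonstrated edit and rank them,
    preferring transforms users have chosen before (history: name -> count)."""
    history = history or {}
    consistent = [(name, fn) for name, fn in candidate_transforms()
                  if fn(before) == after]
    consistent.sort(key=lambda nf: history.get(nf[0], 0), reverse=True)
    return [name for name, _ in consistent]

# The user demonstrates: "Cambridge, MA" -> "Cambridge"
print(rank_by_example("Cambridge, MA", "Cambridge"))
```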

The last slot goes to our own Eirik Bakke, who presented Related Worksheets along with David Karger and Rob Miller. Related Worksheets makes foreign key references first-class citizens in the world of spreadsheets. Just as spreadsheets secretly made every office worker capable of maintaining a single-user, single-table relational database, Eirik has secretly enabled those workers to make references between spreadsheets without having to program. While adding foreign key references to a spreadsheet requires only a simple user interface modification, the implications for how to display multi-valued cells in the spreadsheet are significant. Read the paper to see Eirik’s hierarchical solution to this problem!
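As a rough illustration of the idea (not Eirik’s actual data model or UI), here is a toy sketch in which one worksheet’s column holds foreign keys into another worksheet, and the rows that reference a parent row are rendered nested beneath it, roughly analogous to the hierarchical display of multi-valued cells. The sheet contents and column names are hypothetical.

```python
# Two "worksheets" as lists of row dicts; the enrollment sheet's "course"
# column holds foreign keys into the course sheet.
courses = [
    {"id": "6.813", "title": "User Interface Design"},
    {"id": "6.830", "title": "Database Systems"},
]
enrollments = [
    {"student": "Alice", "course": "6.813"},
    {"student": "Bob",   "course": "6.813"},
    {"student": "Carol", "course": "6.830"},
]

def related_rows(sheet, fk_column, key):
    """All rows in `sheet` whose foreign-key column points at `key`."""
    return [row for row in sheet if row[fk_column] == key]

def render_hierarchically(parent_sheet, child_sheet, fk_column):
    """Print each parent row once, with its multi-valued set of related
    rows nested beneath it."""
    for parent in parent_sheet:
        print(f"{parent['id']}  {parent['title']}")
        for child in related_rows(child_sheet, fk_column, parent["id"]):
            print(f"    -> {child['student']}")

render_hierarchically(courses, enrollments, "course")
```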

Keep it up, data nerds! Soon we’ll be able to start a data community at CHI!

There Are Bad Systems Papers

There’s a lot of discussion about the right way to evaluate and support systems research in SIGCHI. Maybe too much. (I’m allowed to say that because I contributed to it, right?) But for this to be a productive conversation, we need to tackle the other half: what makes for a bad systems paper?

I say bad paper, rather than bad research, because often this is about framing and not the actual work. My conversations at CHI and throughout the alt.chi process helped draw out some of the common killer problems that HCI systems papers run into. These are legitimate problems with a paper, and we need to own up to them if we want our work to be taken seriously.

Issue 1. My Contribution is the System

Pete Pirolli hit this one on the nose at the alt.chi presentation. Systems authors often frame the technological artifact they built as the entire contribution of the paper. The fact that I built a system, say one called ACRONYM, is largely immaterial. In a way, it’s part of the evaluation: ACRONYM is proof that the ideas can be instantiated. But what are the ideas driving the system design? In order to learn something from the paper, we need to focus on the ideas rather than the system when describing our contribution.

Issue 2. My Study Proves That This Is Unquestionably The Best

Many social scientists who I talked to complained that systems papers often overclaim their results based on a small study. If you read a CHI paper by one of your favorite social scientists, they are very good at clearly scoping what can and can’t be concluded from a study. CS has a way of always claiming that my ACRONYM system absolutely buries all the competition. If we are a little more careful in our claims, I think it will help many systems papers on the bubble.

Others? Suggestions?

CHI 2011’s RepliCHI Panel

This past week at CHI, our very own Michael Bernstein participated in a panel discussion about the role of replication and reproduction in the CHI community. Thanks to Max Wilson, the panel coordinator, I got the opportunity to log the event and live-tweet the whole thing; here are my notes.

Max started things off with these comments:

  • Replication is a cornerstone in some fields, in CS it’s often a benchmarking tool.
  • HCI often suffers from a lack of generalizability, but replication to fix that problem can be very time consuming.
  • We also aren’t entirely a science community – would you try to replicate art?

Wendy Mackay was the first invited panelist:

  • CHI crosses disciplines, and so do attitudes about replication.
  • We often draw from experimental psychology (start with a model, revise the model, and replicate things in between), as well as from ethnography (observations and re-observations).
  • These approaches focus on developing theories or knowledge about the world, whereas design focuses on building artifacts.
  • We also draw from engineering and computer science – engineering has repetition, but much of CS does not.

Harold Thimbleby followed:

  • He promised to share a core science background (as opposed to Wendy’s psychology framing).
  • “The only reason you’re in this room today is because you’ve got hope [...] to live [...] and hope for the future of CHI.” (maybe paraphrased a bit!)
  • CHI hopes to change the world for the better. In order to do it with confidence, we often use statistics measuring our confidence.
  • We get excited at conferences by ideas, and we go home and try to use those ideas. That’s replication.
  • Those iterations cause evolution-like improvements of ideas and knowledge.
  • Deliberate reproducibility is good science, and it can train young scientists and fix issues.
  • “Non-reproducibility is cheating” – if we don’t make the process needed to reproduce work clear in papers, we fail as authors.
  • In reality, we need to get people to use our ideas. We write papers to spread our ideas.
  • “Sadly, most of what we publish isn’t reproducible.”
  • A third of papers published in a machine learning journal weren’t reproducible (this was determined by a survey of authors in that journal).
  • HT replicated this by asking three other journals and found the same thing – this is a problem in computer science, not just in HCI.
  • We can look at post-war cargo cult examples as a parallel to our work – they built planes and other war paraphernalia hoping that it would result in cargo drops, but missed the point. Similarly, we often neglect to reproduce things at a useful level.
  • We do have reasons for not being reproducible, including business ones. A study of different Casio calculator models saw different answers to arithmetic problems, which was obviously not something Casio wanted published.
  • Being reproducible on consumer devices can be really detrimental to a business.
  • “Go forth and reproduce [create new scientists] and be reproducible [with your work].”

Next up was Ed Chi, with the point of view of industry research:

  • “There is more to replication than simply duplication.”
  • Early contributions to the field came from computer scientists and cognitive psychologists.
  • In a memo establishing HCI research at PARC, it was evident that there was a need to establish HCI as a science.
  • The intellectual heritage of HCI comes from Vannevar Bush and JCR Licklider, augmenting cognition.
  • Our background comes from psychology, where replication is the norm (echoing Wendy).
  • Psychology teaches students early on to design good studies.
  • In CHI 97, there was a browse-off – the hyperbolic browser won, but replication attempts showed no clear winner.
  • Individual differences in subjects were overwhelming anything in the design of the browser, showing the value of replication as a tool to more fully understand what was happening.
  • This first experiment at CHI 97 was just the beginning of something bigger, and that’s why replication was needed, and is still needed.

Michael spoke next on behalf of grad students everywhere:

  • He couldn’t speak for everyone but used an “unassailable, extremely scientific data collection protocol” (this is facetious) and got responses from 93 students (his social network and student volunteers).
  • 83% of grads hadn’t ever replicated a study, 62% said “hell no” they never would replicate a study or a system.
  • One response said “I’m more creative than that”, another said “New studies confirming old studies have no chance of publication.”
  • There’s a general perception that reviewers don’t feel that work is necessary, and that it isn’t novel.
  • “The grad student must conform”, and so, since no one’s publishing replication work, there isn’t any more being published.
  • He also solicited haikus – “Think analyzing / CMC is tough? Try it / reproducibly” and “Repeat to be sure / We stand on giant’s shoulders / But do so on faith.”

Dan Russell from Google, speaking with the experience of someone with access to large data sets:

  • What CHI insights can we replicate?
  • Replicating a measure should be straightforward, but it’s not in our very diverse community.
  • The knowledge needed for replication sometimes gets left out of papers.
  • Changing things slightly, such as wording or font, can dramatically change the ability to reproduce work – and so can a change on the web.
  • DR was conducting a study about finding difficult-to-locate information online, and suddenly, everyone got WAY better… because someone had posted the answer online on a Q&A site! Changes that are out of our control online can dramatically affect reproduction.
  • Google is kind of a Large Hadron Collider. We can’t reproduce the LHC studies without our own collider, so we must take them on faith. Likewise, we don’t all have access to Google’s huge data sets or user bases, and so we must take some of that on faith as well.
  • “Ultimately, we are a faith-based community. And that’s the nature of science.”

NB that the panelists posted statements beforehand on replichi.org; look there for more detailed summaries.

There were several questions and comments that prompted discussion. I’ve gotten them down here as best I could. Apologies if I’ve misquoted or misattributed anything!

  • Gary Olson, from UC Irvine – Wendy said we should replicate and extend. [...] Extension is critical.
    • Wendy – “I of course agree. But there’s a disciplinary issue.” Some things are relatively easy, depending on what their intellectual heritage is; some can’t be done.
    • Ed – We often place the responsibility of generalizability on the author. He or she must make that claim. In other fields, that burden falls on the reader.
  • Sharoda Paul from PARC – We must address the interdisciplinary nature of CHI. How can we manage the expertise and backgrounds between reviewers?
    • Ed – depending on the person, there can be a sense of “why should we waste our time on replication?” – but replication can heighten understanding.
    • Ed – part of the goal of this panel is to change the between-reviewers issue.
    • Harold – we should note that there are different types of reproducibility:
      1. Replication work done to acquire skills and to learn.
      2. Just redoing work (because of a failure to immerse oneself in literature), which is not publishable. (This is the bad kind of reproduction.)
      3. Writing papers honestly to be reproduced.
      4. Reproduction with an adaptation to a different area, or an extension on previous knowledge.
    • Wendy – part of it may also be finding ways to publish more philosophical things. PC meetings are a place where things like this are discussed as well.
  • Eric Baumer from Cornell – “Replication is not reproduction.” There are different kinds of replication; we should consider what replication means.
  • Lorrie Cranor from CMU – SOUPS gets around paper length issues by including appendices with information for reproduction.
    • Wendy – we should think about who will be reproducing the work as well – we should let people reproduce work in products, or in things that affect the real world.
    • Wendy – of course there are IP issues, but this could be part of our long term goal. We don’t pursue just science, but world-changing innovations.
    • Michael – Rebuilding systems is so, so hard. We often only have screenshots to go off of, and there might even be errors in the paper. Replication happening in Rob Miller’s HCI class led to a discovery of a constant being off by a factor of 10 in a noted paper.
    • Harold – Papers can also be about inspiring, rather than being about reproduction… or they can be entirely open-sourced.
    • Harold – we should be clear about how reproducible we intend things to be in our papers.
    • Ed – paper limits come from the publishing model, but in the digital world, we need to now change the community standard.
  • Question from an unknown person (sorry! let me know if it was you!) – When you replicate and find different results, what do we do? Some reviewers might be insulted. Do we reproduce things specifically to falsify others’ work?
    • Michael – that feeling echoes grad student opinions, and it’s worsened by the assumption that if you find errant results, you messed up, especially if it’s work by an important researcher.
    • Max – sometimes we reproduce things and it confirms surprising results though – the value of the content may change the value of reproduction.
    • Wendy – the hope is that there are multiple reviewers, and this hopefully means that any controversy is viewed very clearly.
    • Wendy – controversial findings like that are more interesting than others.
    • Michael – Unfortunately, we don’t always know why, and that causes increased skepticism.
    • Michael – It’s good when intro classes include replication of results. It can demystify things.
    • Wendy – I have more faith in program committees than to believe that good papers would disappear if they’re controversial.
  • Lora Oehlberg from Berkeley – Design research discusses failures as well as successes. Do we encourage people not to replicate pointless results, which could be considered failures?
    • Replication of results can improve the quality of data.
  • What’s the role of releasing code in systems work?
    • Ed – “Ownership of code [and data] has been a way research territory is protected. Monetization might be the root of all evil.”

Panelists shared their final thoughts:

  • Max – perhaps we need an alt.chi or similar session called repliCHI, a place for people to publish work like this.
  • Wendy – that might be possible! “I think we should encourage students to replicate in coursework” and then publish like that.
  • Harold – Think of how you can “build something that improves reproducibility” – we can change the models of publication this way.
  • Ed – We must change the HCI curriculum. It doesn’t always [though there are notable cases where it does] include stuff drawn from psychology. We can always experiment in conferences.
  • Michael – There are techniques to “replicate” systems quickly, like as part of a prototyping process, that can inform our design, and we shouldn’t neglect these.
  • Dan – I almost always ask interns to reproduce results. Perfect reproductions are boring, but they’re almost never perfect, and then we learn something.

Eating our own Crowdfood

Recently the CHI workshop on Crowdsourcing and Human Computation got some press courtesy of Jim Giles and New Scientist. Near the end of the workshop, the working group on Future Directions and Community had some interesting suggestions that I’ll echo here.

Can we take some of the crowdsourcing tools and techniques we have developed as a community and put them to use in our own publishing and review processes?

  • Use online tools to disseminate research quickly. Arxiv.org fills part of this role, but it’s more of a database than a venue.
  • Significantly shorten review periods. What if research could come back with an initial review 48 hours after submission? We have early evidence that fewer reviews may be necessary in the early stages.
  • Maintain living documents where the authors can publish errata and appendices.
  • Cross traditional disciplinary boundaries so that authors don’t need to choose between publishing in a human computation venue and a “home” venue.

The bigger question put to the group was: should crowdsourcing and crowd computing develop into their own disciplines, or continue to jump around between existing conferences in the ACM, IEEE and AAAI?

Notes from CHI: Health Care Panel

I’ve been attending the CHI conference in Vancouver this week, presenting some of my work on database user interfaces. It was interesting to attend Tuesday’s “Re-Engineering Health Care with Information Technology” panel and hear about what appears to be one of the biggest application areas for database UIs on the planet: Electronic Medical Records (EMRs). Ben Shneiderman referred to the thousands of different systems that are currently used for communications between and within health care institutions as a giant “Medical Internet” that indirectly serves more Americans (94%) than the regular Internet. US health care spending is currently far higher relative to GDP (and relative to performance metrics such as life expectancy and infant mortality) than that of any other country in the world, and it is clear that effective IT use must be at least part of the solution to this problem.

I took note of several interesting anecdotes from the panelists:

  • In many cases today, EMRs actually disrupt the workflows of health care workers. A physician may log onto her computer system in the morning, browse through several poorly adapted views of patient records to find the information she needs for the day, and then write it down on paper. At the end of the day, she (or her assistant) returns to the computer to type the handwritten changes into the various records involved.
  • Thomas Payne, MD, talked about the Computerized Patient Record System (CPRS) of the Veterans Health Administration. The CPRS has been recognized as an example of a highly successful large-scale EMR system in the US. We got to see a screenshot, and it’s actually a good old text-based DOS interface (or at least it used to be in 1997—fair enough).
  • There are between 300 and 600 vendors of EMR systems in the US, and they differentiate themselves by each having a separate architecture and user interface. Thus, a physician who might work at one hospital for three days a week and another for two will need training in two completely different systems.

Which CHI talks should I see?

Although I’ve been going to CHI for a few years, I still feel like something of a foreigner, not certain which talks to attend.  Many of my friends and colleagues probably have a much better idea than I of which talks are given by speakers I would like and which offer insights I would find particularly valuable.  So I try to ask around, but I often get the information too late.

So I convinced my students, Michael Bernstein and Adam Marcus, to build a system to help me out.   We connected our FeedMe recommender system (presented in a paper at CHI last year) to the CHI program presented in Danny Soroker’s Eventmap.  As you build your own personal program of talks to attend, you can also recommend any you think I (or any of your other friends) will be interested in.    I hope you will.

The Details

Eventmap already lets you browse for talks you might attend (I suggest using the table view, which shows all the abstracts) and click on them to add them to your own schedule.  Now you’ll also get a “recommend using FeedMe” button.  If you click it, you’ll be able to specify email addresses of friends who’ll be interested.  FeedMe will take care of notifying them of your recommendation and incorporating it into their personal eventmaps—if they log into FeedMe, they’ll see a little green bubble over each talk that’s been recommended to them. A convenience of FeedMe is that after a little bit of practice, it will start to guess which of your friends you’re going to recommend a particular talk to, and let you do so with a single click instead of typing in email addresses.
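Purely to illustrate the flavor of that last feature, here is a minimal sketch of guessing recipients from keyword overlap with items previously recommended to each friend. The data layout, the example addresses, and the scoring are my own assumptions, not FeedMe’s actual learned model.

```python
from collections import defaultdict

# Past recommendations: (friend, keywords of the item you sent them).
history = [
    ("alice@example.com", {"crowdsourcing", "mturk"}),
    ("alice@example.com", {"crowdsourcing", "wikipedia"}),
    ("bob@example.com",   {"databases", "spreadsheets"}),
]

def suggest_recipients(item_keywords, history, top_k=2):
    """Score each friend by keyword overlap between the new item and the
    items previously recommended to them (a toy stand-in for FeedMe's
    recommender), and return the highest-scoring friends."""
    scores = defaultdict(int)
    for friend, kws in history:
        scores[friend] += len(item_keywords & kws)
    ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
    return [friend for friend, score in ranked[:top_k] if score > 0]

print(suggest_recipients({"crowdsourcing", "human computation"}, history))
```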

FeedMe also works as a standalone system; you can use it to recommend any Google Reader story, or any arbitrary webpage, to any of your friends.  You can find details on the FeedMe site.

The Motivation

FeedMe reflects our interest in friendsourcing—getting your friends to help you in crowdsourcing workflows that rely on their knowledge of you.   While I wouldn’t expect random crowds to do a very good job recommending information to me, I can hope that people who know me and my interests well can do a better job than any pure-computer (e.g. machine learning) system.  So please, while you’re looking over the CHI program to plan your attendance, if you see a paper you think I’ll like, fire off a recommendation to me using FeedMe, and I’ll thank you for it (every FeedMe notification includes a one-click thank-you button).  And if you’d like to receive some recommendations, tell your knowledgeable friends about FeedMe!

Who’s answering your questions?

Over the course of the last year or so, I’ve been looking at the way people ask and answer questions on Facebook. Much of this work happened with the phenomenal Merrie Morris and Jaime Teevan (haystack alum!) at Microsoft Research.

I’ve been interested in the ad hoc way people ask questions as their status messages (not using the Facebook Questions app), and how it signals some unmet needs in the search (and perhaps social) space.

We’ve identified the types and topics of questions (people ask recommendation, opinion, and factual knowledge questions about things like technology, entertainment, and home & family), motivations for asking questions (in order: trust, subjective questions, belief a search would fail, specific audience, connecting socially, and faster answers), how phrasing affects answers (the more explicitly you ask, and the more you scope your question to a specific group or even to anyone, the more and better responses you get), and how asking questions in status messages compares to searching (it’s not as fast, but it’s a great supplement).

Lately, I’ve been looking at who answers Facebook questions. For instance, in the question I posted above, how close are these friends?

The thing that prompted this direction was work done by famed sociologist Mark Granovetter. He’s known, among other things, for his seminal work on the strength of weak ties, which identified that a lot of useful information came from weak ties, acquaintances, rather than strong ties, or close friends. I’ve been interested in finding out if that’s the case when asking Facebook questions as well.

Thus far, I’ve done a very interesting pilot study – results from the study indicate that, instead of weak ties giving you more useful and helpful responses, strong ties do. We’re currently in the process of testing this hypothesis with a bigger and better study – please email me if you’re local and are interested in participating.

One of the interesting topics in designing this has been understanding what makes an answer to a question useful or helpful – in some of our prior work, people thought answers that were unconstructive were helpful, even though they don’t seem to be on the surface. An example of answers like this for the question I linked to above would be something like “Oh Joanna Newsom! Cool!” – it’s supportive, and on-topic, but certainly not an answer to the question I asked.

We’ve thought about whether or not you might have seen the answer before as a marker of helpfulness, how on topic it is, how much you trust it, how careful you’ll be about verifying it, and other things like that. What makes an answer useful or helpful to you?

A Proposal for Increasing Evaluation in CS Research Publication

I attended the VISSW 2011 workshop last Sunday.  It was fun, but a few of the papers exhibited a painfully familiar pattern: they put together a plausible-seeming user interface but didn’t evaluate it with a user study.  I left frustrated, with no sense of whether the ideas in the interfaces would be good or bad to incorporate into my own work.  With the system already implemented, other researchers are disincentivized from reimplementing it (it wouldn’t be novel), so they can’t evaluate it.  Thus, if the original researchers don’t do the evaluation, nobody will.   This is a not uncommon complaint in computer science—our field doesn’t seem committed to following through with evaluations of the ideas it invents and implements.   Some faculty at Stanford have even created a course aimed at teaching students how to properly evaluate their research systems.

So here’s a proposal for improving the incentives a little bit.  Change the submission requirements for conference papers: they have to contain the system description and the hypothesis to be tested, along with a detailed evaluation plan.   Papers are then evaluated and accepted on the basis of a commitment to execute the evaluation plan (and update the paper with results) before the conference but after acceptance.

This approach would have several benefits.

  1. Researchers could defer the work of evaluation until their submission is accepted.   Once it’s accepted, they have strong motivation to do the evaluation (else the paper cannot be presented).   For work that turns out not to be publishable, the evaluation work is not wasted.
  2. The evaluations would take place after the submission deadline, meaning work on the system could continue right up to that deadline.  This gives us something to do in the “dead space” between acceptance and presentation (which is forced upon us by the long lead time required for travel planning).  The work presented at the conference would be “fresher”; the long lead time on conference submission would have less impact on the publication of timely results.
  3. This approach would also address the recently popularized problem of a bias towards positive-outcome evaluation that may lead to incorrect claims of statistical significance in outcomes.   If reviewers consider a paper that contains only the system and evaluation procedure, they will be forced to assess the paper purely on the grounds of whether the proposed system is interesting enough to be worthy of evaluation.  If it is, then the paper should be accepted regardless of whether the outcome of that evaluation is positive or negative.   If it is not, then the inclusion of a positive evaluation should not change the rejection decision.

Turning to logistical concerns, this approach means that the paper is not finalized until shortly before the conference (a couple weeks, to give reviewers a chance to confirm that the evaluation plan was followed).  But as more conferences move towards electronic-only publication, this schedule becomes feasible.  And this scheme wouldn’t cover e.g. multi-year longitudinal evaluations.  But it would certainly cover a large number of the papers with short (inadequate?) user studies appearing in our HCI conferences.

Of course, there’s the simpler approach of requiring evaluations at submission.  This meets the primary goal of having systems evaluated, but loses the three benefits I’ve outlined above: researchers invest energy evaluating systems that would be rejected independent of the evaluation; the evaluation work will be older and staler by the time of the conference; and the bias of reviewers toward accepting positive results would continue.

The Trouble with {The Trouble with Social Computing Systems Research}

A few weeks ago, I finished writing a thought piece with Mark Ackerman, Ed Chi, and Rob Miller about the state of systems research in social computing. It grew out of conversations with a lot of researchers in the area, and examines questions of novelty, evaluation, and the industry/academia question in the field.

I submitted the paper to alt.chi, where it generated quite a bit of discussion in the alt.chi open reviewing process: twenty-two reviews (twenty-one “very high interest”s and one “high interest”). To be honest, I was really blown away by the positive response. I chose alt.chi as a venue because I wanted to get a lot of feedback, and that worked out in spades.

In the spirit of alt.chi’s open process, I’m now going to open up some of those reviews back to the community so that I can make the paper even better. (While I wouldn’t do this kind of thing for a typical paper, I think that alt.chi reviews are written with a higher expectation of openness, so it’s OK in this case.) These are some of the most cogent points I took away from the feedback, and they are what I’m going to try to address before Friday’s final deadline. I’d love to see continued discussion here in the comments if you have thoughts. There are a lot of them, but I’ve tried to highlight the main points.

Here’s the submitted PDF, and the original abstract:

Social computing has led to an explosion of research in understanding users, and has the potential to similarly revolutionize systems research. However, the number of papers designing and building new sociotechnical systems has not kept pace. In this paper we analyze the reasons for this disparity, ranging from misaligned methodological incentives, evaluation expectations and research relevance compared to industry. We suggest improvements for the community to consider and evolve so that we can chart the future of our field.

Here we go — these are my favorite comments, both from the reviewing process and out-of-band emails I got. Please share any thoughts or reactions! (I’ve stripped reviewer names and affiliations for privacy reasons.)

  • Is it a rant?: “The paper felt a bit too much like a list of particular criticisms ACs have raised against your papers in the past. It was unclear how principled and complete of an exploration of the problems of  social computing systems research it is. How pervasive are criticisms of exponential growth and snowball sampling, really? Aren’t they just easy stand-ins for ACs to sidestep underlying, thornier problems?”
  • Discussion of industry vs. academia: “[It] was too simplistic. I think a third way that should be explored much more is to what extent academia can partner with either large industry or small startups. See Joel Brandt’s collaboration w/ Adobe, Niki Kittur’s with Wikimedia, etc.”
  • Distinction between spread and steady state: “In a ‘living’ social computing system, there is no simple steady state. To maintain the appearance of continuity, the system itself has to be constantly updated, changed, tweaked to respond to the changing balance and makeup of the user community; to keep up the arms race against spammers, etc. Steady state is an illusion created by the never-ending work of the maintainers of social computing systems.”
  • The 4:1 submission ratio: “There is an implicit claim that the number of papers submitted, or accepted, is roughly equivalent to the impact of a particular type of research. The ratio of “understanding users” to “systems” was 4:1 – so what?  Is this a declining trend or steady state? Most papers end up being read (and cited) infrequently. This may be especially true about papers that study and describe populations in systems with half-lives of 2-3 years. How many study papers that are 10+ years old do you still consider worthwhile? How many systems papers? Is there a real imbalance at that scale?”
  • Snowball sampling disagreement: “I think that, in most cases, this is undesirable, except in cases where the target user demographic is the same as our social networks (e.g., highly educated tech early adopters)”
  • Field study difficulty: There is an unnecessary slam on lab research as being too easy. We need to be more balanced here.
  • Arguments aren’t particularly “controversial”: we’re not taking a stand that’s horribly divisive. (That’s fine with me. I’m OK with just drawing out the issues.)
  • Generalizability: Some reviewers felt that these results could generalize beyond social computing to other areas. Others felt that we should broaden even to traditional CSCW topics like small group collaboration and communication. Many people felt that these arguments resonated even outside of our direct community. I’m honestly less sure of my footing here; I don’t want to overclaim.
  • Stronger argument why academia matters: “The argument could be made stronger for why should social computing systems should have a place at CHI or in academia if they can be done in industry with more access to data and better resources. The authors mention market incentives that can be avoided in academia. However, the majority of researchers have to find funding from the NSF or from industry so there are markets in both cases.”
  • Why do social computing systems matter?: “This submission could be stronger, especially for young PhD researchers, if it clearly outlined what contributions social computing systems research brings to the table. Why is it important that it be done?” “More discussion on the goals and the assessment of quality of social computing research would be extremely helpful.”
  • Qualitative studiers: Don’t forget about Studiers in anthropology, cultural and media studies. “These qualitative studiers often ask for research to a) engage in actual conversations  with users and b) discuss the larger cultural and societal implications of one’s system.”
  • Big Data vs. Industry: “It does not, however speak to the so-called Big Data movement we have seen in Social Computing (and that has been addressed in various forms by myself, Scott Golder, d boyd, and others.  While this is a bit orthogonal, it does address the sampling questions also detailed in the article.”
  • Builder/Studiers too simplistic?: “I think that there’s continually the problem in CHI that it’s a conference of minorities, and it’s a case of 20% builders, 20% studiers, 20% designers, 20% usability people studying Fitts Law until their socks fall off and so on. I’m not sure I agree with their characterization that ‘the prevalence of Studiers in social computing means that Studiers are often the most available reviewers for a systems paper on a social computing topic’. My experience is that whoever I want to review my paper – studiers for a study, builders for a technical system – I’ll end up with someone from the wrong place who can’t understand.”
  • Replication: “If replication isn’t highly valued in our community, then one possible outcome is that the expectations for a social computing systems paper become quite high. The paper would have to not only introduce the system, but also provide a solid evaluation of it, because the bias against replication implies that future evaluations aren’t likely to be forthcoming.”