Sunday, October 28, 2012

Success in Natural Language Processing is Human-Level Intelligence

There's a lot of talk about Natural Language Processing, NLP, using computers to deal with lots of text. The current state of the art is like cooking with only a colander and whatever ingredients fall out of the tree in your back yard. People are entirely too excited about a collection of weak-sauce results.

  • Sentiment analysis is a joke on the unpopular kids (brands) desperate to know if people like them. "How many times did they say the words Nike and Love in the same sentence? Huh? How many?" AKA let's-reduce-all-human-discourse-to-one-linear-scale.
  • More general word frequency analysis can be fun, just like the index of an arbitrarily long book. And you hit problems with grammatical changes right away, so you start using some clever stemming approach, and that either makes things better or worse, and the machine is sure as heck not going to know which it is.
  • Co-occurrence? That's the best you got?
  • Okay, Google's machine translation is pretty cool, but a Chinese Room is not real understanding or analysis.
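To make the weak sauce concrete, here's a toy sketch of the sort of sentiment-by-co-occurrence counting described above. Everything here - the data, the function, the word choices - is invented for illustration:

```python
import re

def naive_sentiment_hits(text, brand="nike", word="love"):
    """Count sentences that mention both the brand and the sentiment word.

    This is roughly the level of analysis being mocked above: no grammar,
    no negation, no sarcasm - just string matching per "sentence."
    """
    sentences = re.split(r"[.!?]+", text.lower())
    return sum(1 for s in sentences if brand in s and word in s)

reviews = "I love my Nike shoes! The weather is bad. I love to hate Nike."
print(naive_sentiment_hits(reviews))  # 2 - the sarcastic sentence counts too
```

Note that the sarcastic third sentence is happily counted as a hit, which is exactly the problem.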
Take a look at this public service ad on the NYC subway:


Here's what it says:

MTA
.info
What's next?
Poetry is back
in Motion.
Many of you felt parting was not such sweet sorrow.
So we're bringing poetry back in a very artful way.
Hopefully, you'll feel transported.
Improving, non-stop.

When I was in Korea and studying Korean, I tried to read and understand signs that I came across. If nothing else, I should be able to read the signs in the subway, right? Even if you speak English as a first language, if you don't regularly ride the subway in New York, you may not know what this sign is really saying. If you're clever, you can sort of guess, but probably not perfectly.

MTA is the Metropolitan Transportation Authority. This isn't stated in the ad, but humans can guess that this public-service-announcement-looking sign on the subway is probably from the subway people.

"What's next?" is a rhetorical question, and in fact part of a sort of MTA advertising series in which they tell us what updates and changes are happening now or in the near future.

"Poetry is back in Motion" refers to the MTA "Poetry in Motion" series, a separate initiative that puts short poems in subway ad slots. Apparently they stopped doing this for a while, and now they're going to be back. Or maybe they've just failed to sell all the ad slots. Who knows.

Next our friends the MTA allude to Juliet's parting words to Romeo from her balcony. Apparently by the end of the play both Poetry in Motion and all subway riders will be dead. But seriously, this allusion just doesn't make sense. Does no one understand the feeling Juliet was conveying? Honestly?

The "feel transported" is actually kind of a fun double-meaning pun. The "non-stop" is less fun, but would probably be better if it weren't following a bunch of other junk just like it.

I suppose you could say that the MTA has really done a noble job in making a dull message a little more fun. No doubt. The point is that really understanding even this fairly simple message is not so easy. What if your NLP doesn't have the NYC-subway-rider plug-in? Or the dual-meaning-pun plug-in? The Shakespeare plug-in? I suspect that until machine text analysis is done by an embodied learning computer with human-equivalent intelligence, we will be limited to the frankly unimpressive kinds of tools that we have so far. To make my suggestion even less helpful, I suspect that as soon as such technologies exist, they will have the same drawbacks that humans do. Perhaps computer users will all have to be managers. Will I have to give my computer the weekend off? Maybe I should have... ramble mode OFF

Saturday, October 27, 2012

metric-driven vs. data-driven


Bitsy Bentley, director of data visualization at GfK Custom Research, gave a good talk on Monday at Pivotal. It was mostly about visualization, but the part that resonated most with me was a related point she made about the difference between being metric-driven vs. data-driven. Here's how she illustrated where most groups currently are and the direction she thinks they need to move:

[image: Bitsy Bentley's diagram of the shift from metric-driven to data-driven]

Some people in the audience on Monday were confused about the distinction between metrics and data. I think it's an absolutely vital distinction, and one that hits close to home when I think about work that I'm often asked to do. A metric is a particular reduction of some subset of your data. It can be reported, rewarded, punished, used for other decisions... Metrics can certainly have value, but being focused on some metrics is not the same thing as being data-driven. The stories in data - the real information - frequently resist reduction to metrics, certainly to the limited collection of metrics you happen to already have. And metrics frequently obscure rather than elucidate. At best they give you a rough what - rarely a useful why or how.
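As a made-up illustration of "a particular reduction of some subset of your data" - the event log and the metric below are both invented for the sketch:

```python
# Hypothetical event log; each record is richer than any metric made from it.
events = [
    {"user": "a", "action": "view"},
    {"user": "a", "action": "buy"},
    {"user": "b", "action": "view"},
    {"user": "c", "action": "view"},
]

def conversion_rate(events):
    """Reduce the whole log to one number - who, when, and why are gone."""
    viewers = {e["user"] for e in events if e["action"] == "view"}
    buyers = {e["user"] for e in events if e["action"] == "buy"}
    return len(buyers & viewers) / len(viewers)

print(conversion_rate(events))  # 0.333... - reportable, rewardable, and opaque
```

The number is easy to report and track; the discarded detail is where the why and how lived.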

As a hypothetical example, you might feel good watching a metric march in the right direction for a number of years - but if it then moves in the wrong direction, that metric doesn't tell you why, or what to do about it. If it was evidence of success before, is it evidence of failure now? What if you aren't doing anything differently? Really making decisions based on data requires more than just monitoring metrics. And I don't just mean you need the right metrics rather than the wrong ones - metrics are necessarily reductive, and even if you have the best metric perspectives on your data, they are still perspectives, not the data itself.

One conclusion is that it's often better to plot all of your data, as much as possible, to have a chance at understanding it before you start reducing it to numeric summaries. A corollary might be that we should question metrics that don't show the whole picture. Another conclusion is that we need to spend more time dealing with the data itself: understanding its nature, identifying which metrics aid understanding and which effectively stymie it, and determining what is signal and what is noise. Vitally, we should spend more time looking for new things than recreating old ones, quickly close the loop by acting on new insight (reporting, changing policy, etc.) - and then go back to looking for the next thing.
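The "plot all of your data" point is the moral of Anscombe's quartet: its datasets share nearly identical summary statistics but look completely different when plotted. A quick check with two of the published series (the numbers are Anscombe's, not mine):

```python
from statistics import mean, variance

# Two of Anscombe's quartet y-series (the shared x values omitted for brevity):
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]  # roughly linear scatter
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]   # a clean parabola

for y in (y1, y2):
    print(round(mean(y), 2), round(variance(y), 2))  # both print: 7.5 4.13
```

Same mean, same variance - completely different stories, visible only if you actually look at the data.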

Something worth asking: are we data-driven, or are we merely metric-driven?

Ponder the divine wisdom of data cat:

[image: data cat]