Iraq 2006: a bag of words

How to make sense of Wikileaks data? One way is visual analysis, as we see here, via Jonathan Stray of the Associated Press:


Stray and Julian Burgess created a visualization using data from December 2006 Iraq Significant Action (SIGACT) reports from Wikileaks. That was the bloodiest month of the war, and the central (blue) point on the visualization represents homicides, i.e. clusters of reports that are “criminal events” and include the word “corpse.” These merge into green “enemy action” reports, and at the interface we have “civ, killed, shot,” civilians killed in battle. Stray tells how this was done, with some interesting notes, e.g.

…by turning each document into a list of numbers, the order of the words is lost. Once we crunch the text in this way, “the insurgents fired on the civilians” and “the civilians fired on the insurgents” are indistinguishable. Both will appear in the same cluster. This is why a vector of TF-IDF numbers is called a “bag of words” model; it’s as if we cut out all the individual words and put them in a bag, losing their relationships before further processing.

As a result, he warns that “any visualization based on a bag-of-words model cannot show distinctions that depend on word order.” (Much more explanation and detail in Stray’s original post; if you’re interested in data visualization and its relevance to the future of journalism, be sure to read it.)
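To make the word-order point concrete, here is a minimal Python sketch. It is not Stray and Burgess's actual pipeline, which isn't shown here; the function names, the third sample sentence, and the plain tf × log(N/df) weighting are my own assumptions, chosen only for illustration. It shows that the two sentences Stray quotes collapse to the same bag, and therefore to the same TF-IDF vector.

```python
import math
from collections import Counter

def bag_of_words(text):
    """Lowercase, split on whitespace, and count words; word order is discarded."""
    return Counter(text.lower().split())

def tf_idf(docs):
    """Return one {word: weight} dict per document.

    Assumption: a textbook weighting, (count / doc length) * log(N / doc frequency).
    Real toolkits use various smoothed variants.
    """
    bags = [bag_of_words(d) for d in docs]
    n = len(bags)
    # Document frequency: in how many documents does each word appear?
    df = Counter(word for bag in bags for word in bag)
    return [
        {w: (c / sum(bag.values())) * math.log(n / df[w]) for w, c in bag.items()}
        for bag in bags
    ]

a = "the insurgents fired on the civilians"
b = "the civilians fired on the insurgents"
c = "corpse found near the checkpoint"  # made-up third report, for contrast

vec_a, vec_b, vec_c = tf_idf([a, b, c])

print(bag_of_words(a) == bag_of_words(b))  # same words, same counts
print(vec_a == vec_b)                      # so the TF-IDF vectors match too
print(vec_a == vec_c)                      # a different report gets a different vector
```

Any clustering step that only sees these vectors has no way to tell the first two reports apart, which is exactly the limitation Stray describes.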

Thanks to Charles Knickerbocker for pointing out the Stray post.

The impact of “social” on organizations

Austin’s Dachis Group talks about social business design, defined as “the intentional creation of dynamic and socially calibrated systems, process, and culture. The goal: improving value exchange among constituents.” I find the Dachis overview (pdf) interesting, if a bit scattered. David Armistead and I at Social Web Strategies had been having conceptually similar conversations for the last couple of years, looking at the potential culture change associated with social technology and new media (with Craig Clark), the need for business process re-engineering (with Charles Knickerbocker), and the power of value networks. This morning while sitting on my zafu, I had a flash of insight that I quickly wrote down as five thoughts that came to me pretty much at once…

  1. Organizations are already using software internally and have been for some time – email lists, groupware and internal forums, various Sharepoint constructions, aspects of Basecamp, internal wikis and blogs, etc. What’s changed? I think a key difference is high adoption outside work – more and more of the employees of a company or nonprofit are having lifestyle experiences with Facebook, Twitter, YouTube, Flickr et al. The way we’re using social media changes as more of us use it (network effect) and our uses become more diverse.
  2. Organizations see knowledge management as storage, basically, but we can see the potential to capture and use knowledge in new and innovative ways, e.g. using multimodal systems (Google Wave, for example) to capture and sort knowledge as it’s created, with annotations and some sense of the creative process stored with its product – knowing more about how knowledge is produced improves our sense of its applicability. (It’s exciting to be a librarian/information specialist these days.)
  3. Organizations will increasingly have to consider the balance of competition and cooperation with internal teams. I’ve seen firsthand how a culture of competition can stifle creativity by creating a disincentive to share knowledge. I’m thinking we’ll see more “coopetition.”
  4. Who are the internal champions within an organization? There will be more interest at the C-level as social technology is better understood and success stories emerge from early adopters. It would be interesting to know what current champions of social media are seeing and what they’re saying. Also – how much of the move toward “social” will come from the bottom up, and how will that flow of new thinking occur?
  5. How does the new world of social business (design) relate to marketing? Operations? Human resources? To what extent do the lines between departments blur? How will the blurring of the lines and potential cross-pollination transform business disciplines?

A final thought: all the minds in your organization have a perspective on your business, and each perspective is potentially valuable. How do you capture that value? Do you have a culture that can support a real alignment of minds/perspectives/intentions?