Visualizing crowd sourcing data

At the closing keynote of the recent Strata Summit in New York, O’Reilly Media founder Tim O’Reilly showed a representation of crowd sourced data on Wikipedia of the 2011 Tōhoku earthquake and tsunami , showing a before and after picture of the page. While interesting, it did not represent what could be shown with the data.

Using the Wikipedia API’s, some features of my VisMarks startup I was able to create a better representation showing an animation of the article over time. While this Wikipedia Earthquake Animation (on a different page for loading) shows a representation of the first 1,000 revisions it highlights one cool way visualize crowd sourced data. Pay particular attention to the new language articles introduced, the images and table of contents as different types of data being added.

While the likes of Twitter and Facebook can provide a stream of information on an emerging event, Wikipedia is unique in that individuals contribute to a single source of combined information. This removes all the noise of duplication. It does not remove the CRUD, however as seen in this article this is quickly removed by others in the community.

It is also cool to see the size of the article grow over time. Below is a graph of the first 24 hours.

These are simple examples of using public API’s and simple tools (in this case, imagemagick,gnuplot and some shell scripts) to tell a story with data visualization.

Tagged with: Data Visualization

Related Posts

AWS CLI support for Aurora DSQL and S3 Tables

If you were following the AWS Re:invent keynote yesterday there were several data specific announcements including Aurora DSQL and S3 Tables . Wanting to check them out, I downloaded the latest AWS CLI 2.

Read more

Migrating off of WordPress - A Simplified Stack

The ongoing drama between Wordpress v WP Engine continues to cross my reading list, but I have permanently removed WordPress from my website. I have finally transitioned away from the complex Linux/Apache/MySQL/PHP (LAMP) stack required for self-hosting WordPress on my professional website.

Read more

WeSQL Introduction – MySQL running on S3

I recently became aware of WeSQL . A MySQL-compatible database that separates compute and storage, using S3 as the storage layer. The product uses a columnar format by default which is significantly more space-efficient than InnoDB.

Read more