More on Big Data

I've been following "big data" posts and articles -- as well as pushing the opportunities provided by big data to my MBA students and anyone else who will listen (for example). Today, Erick Schonfeld reported on Google's efforts towards making big data and visualization (three older posts on visualization: 1, 2, 3) available to us all.

The topic of big data intrigues me, perhaps because it feels like a new frontier. What used to be the bastion of researchers with the time and money to collate data is now open to the public. We can do interesting analyses -- without taking months or years to put together data that may by then be out of date. The topic also intrigues me because of the processes and capabilities that make big data analytics possible. This is a combination of:

  • Instrumentation - People are building instrumentation into their systems from the start -- systems are being designed to informate (Zuboff's term), rather than having instrumentation added as an afterthought. Example from the minerals industry.
  • Access - We've come to expect data to be available to us, and many organizations are supporting this expectation. The U.S. government is working in this direction with economic data, and even requiring some government-supported research to be available to the public for free (see arguments for and against here -- journals are arguing that they need to cover costs of review and publication). W3C talks about "Seamless Integration of Data" in their report "Improving Access to Government through Better Use of the Web."
  • Interaction - We can ask our own questions. Wikirank, Google Trends, and Twist (trends in Twitter) each let you set up your own comparisons.

But with data comes responsibility. These tools highlight the need for a solid understanding of how the data is collected, possible biases, how to ask good questions, and the like. For learning to think about data, we have some key foundations: Thomke on enlightened organizational experimentation, Pfeffer & Sutton's Hard Facts, and even Google's own help pages and blogs for their analytics tools. Reading, writing, arithmetic, and analytics?


The Wall Street Journal's Science Journal (in honor of the 400th anniversary of Galileo's first observations with the telescope) quotes Dr. Pompea of the National Optical Astronomy Observatory in Tucson: "Science is fundamentally about establishing truth for yourself." "People can make observations, take data and establish for themselves the nature of the universe. They don't have to take it from someone else or read it in a book." Like Galileo, "they can see it."