The Rise of Big Data... From Moneyball to the Classroom and Back Again
This is the first of a three-part series of posts discussing the importance of data analysis and introducing readers to the value of statistical and econometric analysis.
Last Sunday, Steve Lohr wrote a great piece in the New York Times explaining the importance of “big data” in today’s society. He explained that increasingly “businesses make sense of an explosion of data- Web traffic and social network comments, as well as software and sensors that monitor shipments, suppliers and customers- to guide decisions, trim costs and lift sales.”
The explosion of data Lohr refers to is available to any field, not just profit-maximizing businesses, and when used properly, that data can enhance and enrich any institution. Businesses clearly use data to inform their decisions, but increasingly, political campaigns, public health officials and advertising agencies are rethinking their traditional practices by developing methods and metrics based on data analysis.
The most celebrated example is “Moneyball,” the book by Michael Lewis and the recent film based on it, which describe the revolution in baseball led by Billy Beane and the Oakland Athletics. The short story is that the team began to evaluate players using complex statistical analyses instead of traditional benchmarks. Billy Beane is not the only front-office executive to develop and exploit new statistical methods; Daryl Morey, the general manager of the Houston Rockets, wrote a piece for Grantland.com about the “stats movement in sports” and how the success of Moneyball has transcended sports and impacted countless industries.
Morey briefly describes how statistical analysis has entered the realm of education: the Gates Foundation is gathering data to evaluate teachers. But Morey and the Gates Foundation are only scratching the surface. Education at all levels is ripe for a takeover by objective data analysis.
Statistics currently used within schools to evaluate programs or students rely on static data. Static data consists of the most basic statistics we remember from high school: averages and percentages. New data, big data, is about how information, records and numbers move over time, and how a fact or figure can be broken down to find the relationships and meaning behind the numbers.
Consider a static piece of data such as: in a specific district, 28% of parents are unhappy with their child’s school. This does not tell an administrator much, probably only something she already knows. But a deeper look into the data could reveal important information such as “of the 28% of parents who are unhappy, 70% have children who play a varsity sport.” This has more value: it points to a pattern among unhappy parents, particularly if far fewer than 70% of all students play a varsity sport.
The next two posts will dig deeper into the “why” and “how” of using data to improve decision making. At the most basic level, data can help shape expectations, specifically conditional expectations given some observed trend in the data. I will explain the concept of conditional expectation in part two.
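For readers who like to see the arithmetic, here is a rough sketch of the parent-survey breakdown described above. Everything in it is made up for illustration: the `SurveyResponse` record, the `share` helper and the 1,000 survey responses are hypothetical, chosen only so that the totals match the 28% and 70% figures in the example. The sketch also computes the varsity share among all parents, since the 70% figure is only informative when compared against that baseline.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SurveyResponse:
    """One hypothetical parent survey response (illustrative only)."""
    unhappy: bool   # parent reports being unhappy with the school
    varsity: bool   # parent's child plays a varsity sport


def share(responses, predicate):
    """Return the fraction of responses for which predicate is true."""
    responses = list(responses)
    if not responses:
        return 0.0
    return sum(1 for r in responses if predicate(r)) / len(responses)


# Made-up data: 1,000 parents, constructed to reproduce the example's numbers.
responses = (
    [SurveyResponse(unhappy=True, varsity=True)] * 196
    + [SurveyResponse(unhappy=True, varsity=False)] * 84
    + [SurveyResponse(unhappy=False, varsity=True)] * 144
    + [SurveyResponse(unhappy=False, varsity=False)] * 576
)

# The "static" statistic: the share of all parents who are unhappy.
unhappy_share = share(responses, lambda r: r.unhappy)

# The breakdown: among unhappy parents only, the share with a varsity athlete.
unhappy_parents = [r for r in responses if r.unhappy]
varsity_given_unhappy = share(unhappy_parents, lambda r: r.varsity)

# The baseline for comparison: the varsity share among all parents.
varsity_overall = share(responses, lambda r: r.varsity)

print(f"Unhappy parents: {unhappy_share:.0%}")
print(f"Varsity share among unhappy parents: {varsity_given_unhappy:.0%}")
print(f"Varsity share among all parents: {varsity_overall:.0%}")
```

In this invented data set the varsity share among unhappy parents is roughly double the share among all parents, which is the kind of gap that would make the breakdown worth an administrator's attention. With real survey data the gap could just as easily be small, in which case the 70% figure would say very little.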