Abstract—The conversation between theory and practice is more subtle than it appears at first glance. The Web extra at http://youtu.be/pd20DQJH36o is a video in which author David Alan Grier expands on his Errant Hashtag column, discussing the interplay between theoretical research and practical business as it applies to big data.
Keywords—Errant Hashtag; big data; industry; academia
Most recently, James and I parted ways over big data, but it wasn't the first time that we had disagreed. We've had fights since we were kids, but they never did any lasting damage. Our interests and outlooks are either too close or not close enough. James went into business and I went into research. Over the years, we have found that those two things are less connected than we thought.
If we accept conventional wisdom, we believe that new ideas begin in universities. These ideas get tested in the rigors of a laboratory, are published in academic journals, and then move forth into the world where they become the foundation for products and services. Should you desire a bit of evidence to support this view, visit the basement of the Smithsonian's National Museum of American History, where you'll find a plexiglass case holding the first Google server, the system that Sergey Brin and Larry Page built in a Stanford University laboratory so many years ago.
However, conventional wisdom isn't as true as we might like it to be. The conversation between theory and practice is more subtle than it appears at first glance. It “proves extraordinarily difficult,” wrote Charles Gillispie in 1957, “to trace the course of any significant theoretical concept from abstract formulation to actual use in industrial operations.”
Gillispie's thesis is seductive. Subsequent writers have noted that new products rarely begin in laboratories but start in business or even marketing offices.
Yet, this thesis has its own problems. In computing, many new technologies, such as the basic Google algorithm, began their lives in university environments. A team of more recent scholars noted that while university research appears to have substantial impacts upon industry, this influence is “almost always difficult to separate from other parts of organizational life.”
When James and I were arguing about big data, we were trying to better understand Gillispie's thesis about the impact of research on practical business. Our discussions began, suitably enough, when he was sitting in his office and I was attending a big data conference some miles away. Our messages went back and forth as James took the position that the term “big data” was merely a marketing label for any kind of statistical or data analytic procedure that might be applied to a large dataset. He even resorted to the common joke that “everyone claims they're doing it,” but “no one knows what it is.”
From where I sat, conference-goers had a very different point of view. To them, “big data” referred to a specific set of techniques that could be used to do a kind of statistical inference on datasets that were massive, changing moment by moment, and highly distributed. Big data essentially combines certain elements of computer science with a branch of statistical practice.
James and I had to end our discussion before we found a resolution. He needed to go to work; I needed to go to bed. When I left the meeting hall, I pondered Gillispie's final observations about the value of scientific research to industry. He argued that research was actually an educational force. It categorized phenomena, developed concepts, and gave those concepts names. The “scientific development of an industry,” he concluded, “is measured, not by the degree to which new theory is used to change it, but by the extent to which science can explain it theoretically.”
This will be the starting point for my next argument with James, whenever that may be.