On Big Data and the IoT: Interview with Bill Franks (Part 1)
MAR 18, 2015 06:24 AM
A+ A A-
On Big Data and the IoT: Interview with Bill Franks (Part 1)
Robert ZicariRoberto V. Zicari, editor of ODBMS.org, spoke with Bill Franks, Chief Analytics Officer for Teradata, about data warehouses, Hadoop, the Internet of Things, and Teradata`s perspective on the world of big data. Roberto is Full Professor of Database and Information Systems at Frankfurt University. He was for over 15 years the representative of the OMG in Europe. 
What is Teradata`s perspective on the world of Big Data?
Bill FranksBill Franks: Our perspective has not really changed with regard to big data. The primary mission of Teradata for decades has been helping organizations utilize and analyze large volumes of data to produce insight for business value.
Note that our Teradata database was originally designed exclusively for analytics, then called ‘decision support’ – unlike most other platforms, which were designed for general computing – then later adapted for analytic uses. As a result, the Teradata analytic engine is – and has always been – uniquely architected for large – ‘big data’ – volume and complexity aimed at producing actionable intelligence.
Of course, the amount of data that’s considered ‘big’ and thus a challenge – has changed, and we have a lot of novel data sources in recent times. However, we believe that companies which have always focused on analyzing and acting upon data intelligently can adapt to the new world of big data. After all, big data is just more data and the analysis of big data is still analysis. There are as many similarities as differences from the past.
Teradata has engineered further analytic enhancements over the years to create a diverse portfolio of products, partnerships, and services to allow our customers to continue to get the most from their data assets. The pace of change is very rapid today and we expect that to continue. We believe our strength is in our experience, expertise and our ability to help organizations navigate the changing landscape and continue to derive new, useful insights from their increasingly large and diverse data sources.
Most data warehousing projects consolidate data from different source systems. What is different in the world of big data?
Bill Franks: By definition, if you want to look at two different data sources together, you must either move one set of data to the other or move them both to a third location. If data is truly disparate, you can’t use it effectively. That is what drove data warehousing to prominence. One huge difference between data warehousing practices years back and then today is that previously, all data that was captured in the business world met three criteria almost 100 percent of the time:
1) It was immensely important; given the cost to capture and store it,
2) The data was well structured, and
3) The data was generated by an organization’s internal business processes.
Therefore, it was mostly placed in relational databases or on a mainframe since those technologies easily handle that type of data. Data warehousing solved the problem of many structured data platforms being spread out–by consolidating the sources for analytic purposes into a single structured platform.
What is different with big data is that today, the data often violates all of the rules.
1) Much of it is not important, or has not yet been proven to be important,
2) The data is not structured in the classic fashion at the outset (though most can and must be structured for analytical purposes), and
3) The data is often from sources external to an organization.
As a result, we now have disparate data platforms that each serve different functions. Some focus on one type of data, while others focus on flexibility. However, the downside is that these platforms don’t integrate well and it isn’t as easy to tie everything together. That’s a problem Teradata is working diligently to solve with our Unified Data Architecture–our pioneering version of the visionary Gartner Logical Data Warehouse.
Will data warehouses become obsolete soon and be replaced by Hadoop?
Bill Franks: Absolutely not. A few years ago, that was a common claim. That claim is rarely heard today. In fact, all of the big Hadoop vendors partner with Teradata.
This is because our data warehousing platforms provide some important things Hadoop does not — just as Hadoop provides some things a data warehouse does not. Each platform has its strengths and weaknesses, but when positioned together, additional value is added. Part of the issue is that people mistake policy decisions for technology limitations.
There is no reason you can’t place untested, raw, unclean data of unknown value on a data warehousing platform; it’s the corporate policies that often forbid it. It is true that once data is critical and is leveraged by many applications and business users, you have to keep some control and consistency over it. This is what a data warehouse does for an organization.
But that doesn’t mean you can’t experiment with new sources freely using the technology that supports formal data warehouses.
A colleague of mine mentioned a conversation he had with a Hadoop user. That user was boasting about how he could with a single command change the data type of information on Hadoop, for instance, if it would help him more easily solve his next problem. My friend then asked him what would happen to the prior dozen or two processes that were built expecting the data to be in the original data type format. 
Wouldn’t they all then break? The user had a blank stare for a moment and then realized his error. As you develop more processes, you must implement security, consistency, and controls on the underlying data. This is why data warehousing – as Gartner defines it, is going to be around for a long time.
With the increased need of tools for combining data together, are we going to see a “federated”- Big Data architecture?
Bill Franks: A form of that is exactly what we are pursuing with Teradata’s Unified Data Architecture. Again – we refer to Gartner’s vision of the “Logical Data Warehouse.” What we are doing is putting in place a layer of architecture that connects multiple disparate data stores. 
This architecture includes – and connects – relational databases like Teradata and Oracle, discovery platforms like our Teradata Aster offering, Hadoop, and other platforms such as MongoDB. The idea is that we make information available to users about data throughout the ecosystem, not just the data on the platform they are operating from. So, I see a data dictionary that includes a “table” called “Sensor Feed.”
I can see the data elements available and write analytic logic against those elements. 
However, I don’t need to be aware of whether the data is a database table, or a Hadoop file, or is in MongoDB. Users can simply build analytics instead of worrying about where data resides, how to log on to various systems, and how to move data. We’ll handle that for them.
We are also beginning to push processing across the various platforms to optimize performance. Just like with a ‘table’ versus a ‘view’ in a database, making a process enterprise-ready might require moving data around the architecture permanently. But now, users are free to discover where that is required. And, the technical team behind the curtain can worry about the details just as they do with traditional data warehousing. We are very bullish on our approach and think we are well positioned to maintain our leadership position in the analytics space.
Teradata made several acquisitions lately. How do the tools that Teradata acquired fit the current Teradata Data Architectural Framework?
Bill Franks: I believe this in general was addressed. However, in addition I would point out that we acquired Revelytix in 2014 to obtain Loom: an open platform for discovering, profiling, preparing, and tracking data lineage for data in Hadoop. Likewise, we acquired Hadapt, which created a big data analytic platform natively integrating SQL with Apache Hadoop. Plus our recent RainStor acquisition strengthens Teradata’s enterprise-grade Hadoop solutions and enables organizations to add archival data store capabilities for their entire enterprise, including data stored in OLTP, data warehouses, and applications.
Bill Franks is the Chief Analytics Officer for Teradata, where he provides insight on trends in the analytics and big data space and helps clients understand how Teradata and its analytic partners can support their efforts. His focus is to translate complex analytics into terms that business users can understand and work with organizations to implement their analytics effectively. His work has spanned many industries for companies ranging from Fortune 100 companies to small non-profits. Franks also helps determine Teradata’s strategies in the areas of analytics and big data.
Prof. Roberto V. Zicari is editor of ODBMS.ORG (www.odbms.org), which is designed to meet the fast-growing need for resources focusing on Big Data, Analytical Data Platforms, Scalable Cloud platforms, NewSQL databases, NoSQL datastores, Object databases, Object-relational bindings, and new approaches to concurrency control. Roberto is Full Professor of Database and Information Systems at Frankfurt University. He was for over 15 years the representative of the OMG in Europe. Previously, Roberto served as associate professor at Politecnico di Milano, Italy; Visiting scientist at IBM Almaden Research Center, USA; the University of California at Berkeley, USA; Visiting professor at EPFL in Lausanne, Switzerland; the National University of Mexico City, Mexico; and the Copenhagen Business School, Denmark.
[%= name %]
[%= createDate %]
[%= comment %]
Share this:
Please login to enter a comment:

Computing Now Blogs
Business Intelligence
by Keith Peterson
Cloud Computing
A Cloud Blog: by Irena Bojanova
The Clear Cloud: by STC Cloud Computing
Computing Careers: by Lori Cameron
Display Technologies
Enterprise Solutions
Enterprise Thinking: by Josh Greenbaum
Healthcare Technologies
The Doctor Is In: Dr. Keith W. Vrbicky
Heterogeneous Systems
Hot Topics
NealNotes: by Neal Leavitt
Industry Trends
Internet Of Things
Sensing IoT: by Irena Bojanova