2014: Predictive Analytics Tools to the Rescue

The big data predictions for 2014 are coming in, and it appears the focus next year will increasingly be on analytics. Throughout 2013, much ink (and many pixels) was devoted to the dearth of data scientists. Colleges and universities have been scrambling to roll out programs to train people to work in this growing field. But the prospect of trained data scientists emerging from these programs, ready to join the workforce, is still several years out.

However, there is an easier and more immediate solution to the problem at hand. As Roger Barga, group program manager for Microsoft’s Azure Data Platform, noted at the Microsoft Faculty Research Summit in Redmond back in July, a quicker solution might be to develop analytics applications that enable those without high-level data analysis skills to make sense of large streams of unstructured data. Such applications would also enable the highly trained to come up with results more efficiently.

Greg Todd, chief technology officer at predictive analytics platform provider Revolution Analytics, said in a recent ZDNet post that analyzing large data sets to understand customer behavior used to be the province of statisticians and management scientists. However, with the emergence of predictive analytics communities, vendor applications are becoming more "visual, easier to use, and geared more towards business analysts vs. the hard-to-find statisticians and management scientists of the world."

Steven Hillion, Chief Product Officer at web analytics platform Alpine Data Labs, also foresees increasing emphasis on collaborative and web-based solutions for data science and advanced analytics. Furthermore, he expects that machines will eventually take over decision-making based on analytics.

Most agree that 2014 will see a burst of new and better analytics tools to join those already in use. And spending less time collecting and analyzing information will have the powerful result of leaving more time for understanding it and acting on it, says Radhika Subramanian, CEO of pattern-discovery provider Emcien.

Activity in the big data world is frenetic. Industry verticals are still learning how to leverage its power. Investors are supporting increasing numbers of new startups. Established solutions providers are rushing to improve their products and introduce new ones. And enterprises are struggling with which solutions to use.

The lack of trained data scientists has the potential to slow down progress in the entire ecosystem. However, if predictive analytics platforms and tools can be designed to be used and understood by business analysts, that would certainly help the industry overall.

Hundreds Turn Out for Rock Stars of Big Data; Event Returns to Silicon Valley Next Year

Rock Stars of Big Data at the Computer History Museum on October 29 was a sold-out event, attracting more than 325 attendees, nine speakers, and nine sponsors. Presentations from industry speakers covered the ethics, logistics, and challenges of big data, as well as future opportunities.

"We are very pleased with the turnout at Rock Stars of Big Data, and are already beginning preparation for another Rock Stars event in the Silicon Valley next fall, this one focusing more closely on data analytics," said Chris Jensen, director of marketing and sales at IEEE Computer Society, producer of the event. "We thank everyone, from attendees to speakers and sponsors, for their support and interest in making the event such a tremendous success."

Those interested in attending next year's event can visit http://www.computer.org/Big-Data in the coming weeks for more information.

Participants in the 2013 event came from a variety of sectors, including computing, telecom, retail, and entertainment. Among the companies represented were: Amazon, Bank of America, Broadcom, China Mobile USA, Cisco, Cray, Ericsson, Fujitsu, GE Software, Google, Hewlett-Packard, Hitachi, IBM, Intel, Microsoft, Qualcomm, Siemens, Sony, VMware, and Wells Fargo.

Among the sponsors were Cray Inc., GE Software, AMAX, FairCom, Aerospike, and MarkLogic.

Featuring big data experts from Google, Netflix, IBM, Kaiser Permanente, GE Software, and other companies, Rock Stars of Big Data was intended to empower attendees to understand the potential for big data in their business, create a big data culture, make big data projects succeed, and use big data analytics to make the right decisions.

How to Build a Data Science Team from Scratch

According to numerous news reports, data scientists are in high demand. Chris Pouliot, Netflix director of algorithms and analytics, knows that only too well. When he started at Netflix five years ago, he was the company’s lone data scientist. Nowadays, he has responsibility for managing and building entire teams.

Who are these data scientists? How do you find them? How do you interview them, and structure them in the organization? These are some of the questions Pouliot answered for attendees of Rock Stars of Big Data at the Computer History Museum Tuesday.

First and foremost, he said, is understanding what a data scientist is, since there are varying ideas on the definition. Data scientist has been called today’s sexiest job—a label that is endlessly repeated. “This is the data scientist’s favorite quote,” said Pouliot. “We like to be called sexy. I’ll take that compliment every day.”

In reality, becoming a data scientist is more about years of hard work than innate sex appeal. Pouliot said the data scientists Netflix hires typically have a master’s degree or PhD in a quantitative discipline. Those with only an undergraduate degree shouldn’t be ruled out entirely, but they would need to be exceptional.

They should have experience with regression and time-series analysis, as well as hands-on experience in gathering data. “That’s really hard to learn on the job. It takes years and years of school to learn this,” Pouliot said. “We’re really looking for creative data scientists who understand how each component of the algorithm works.”

Experience in gathering data using tools such as SQL, Hive, Pig, or Python is also important, he said, since gathering their own data also helps them manipulate the data and understand potential problems.

Pouliot said it’s not a good idea to build a team with clones of the best data scientist you currently employ. Rather, team members should have diverse backgrounds and approaches. Still, it’s important that they have some common base of knowledge. Team members may come from electrical engineering, statistics and math, or physics and use different tools and processes. Yet it’s important for them to possess a common understanding of programming, deep math, and predictive modeling.

If team members have some commonalities and differences, it enables better creative brainstorming. “When these people brainstorm on a white board, it’s really a beautiful thing,” Pouliot said.

Horizontal data science teams lead to better brainstorming and better career paths for participants. In addition, they make it easier to shift people around to meet demand and manage the team. Vertical teams, meanwhile, provide deep business context. They also tend to produce less friction.

In terms of hiring data scientists, Pouliot suggests starting with resumes, then drilling down by asking in-depth questions. Frequently, applicants may have worked on interesting projects but don’t possess a very deep understanding of the technology. Next, it’s a good idea to get them to engage in brainstorming to gauge their ability to think creatively.

For data scientists just graduating from college, Pouliot advises entering contests to prove your abilities or gaining in-depth knowledge on several topics. “The first job’s the hardest to get. You just have to get that first job,” he said.

Big Data: It’s Not a Technology Problem

Bill Franks, who studied statistics in college, used to be “the guy in the corner who everyone ignored.” He’s now chief analytics officer of Teradata and an acknowledged data analytics expert and author.

“It’s amazing to me how analytics has actually become popular. It used to be I would talk to somebody who would talk to somebody who made the decisions. Now, analytics people are sitting very close to the decision-makers and in some cases are the decision-makers,” said Franks. “Who knew that doing analytics would come to be known as sexy?”

Franks dispelled some of the hype surrounding big data, such as that big data will replace the need for analysis. Rather, he said, “big data requires big judgment. More data can lead to the need for more judgment, not less.”

In addition, data by itself isn’t of much use. “The real power that I see in big data is not the data, it’s about the information,” said Franks. “It’s in the new information it provides to analytic processes.”

Franks also argued that the technology and tools to handle all aspects of big data are available today. He recalled a conversation he had at a conference last year with the networking manager from a major company. The networking manager was talking about challenges he was facing. “He threw out the comment, ‘How could it not be a technology issue when the network can’t handle the data?’”

Franks asked the manager if he could implement new network protocols to solve the problem, and the manager had to admit that he could. “In the absolute sense, it’s not a technology issue,” said Franks. “What he didn’t have was buy-in from top executives. That has nothing to do with the technology.”

Franks said part of the challenge of big data is in choosing which tools to use for what purposes. He advised people not to get into debates and to understand that experts can make almost anything work. “The real question is, ‘How can you most efficiently solve the problem you’re faced with?’” he said.

Those working in big data also need to consider what’s legal, what’s ethical, and what customers expect. But Franks acknowledged that this remains a murky area. He recounted a customer worrying that the genetic information he was gathering could make him liable or get him into legal trouble.

“He was afraid to store genetic data because he felt there could be some obligation to report it.  In the long run, I think it’s something a lot of you have to think through—because it’s very dicey—more dicey than you would expect.”

Enabling discovery in an agile environment is another major challenge. “You have to enable discovery in a very agile environment with limited constraints. If we’re going to deploy something, it’s got to be rock solid. You’ve got to have an integrated system that can provide discovery and agility,” Franks said.

Franks compared the growth in the big data industry to the Industrial Revolution. “A year ago, most of my conversations were about, ‘Should I be hiring people like this?’ Now it’s about, ‘How do I organize these people, what do I do with them?’”

Franks said starting small is a good idea. “You don’t need every sensor from every automobile for an entire year before you can identify trends,” he said. “Do a prototype and start small.”

He also said companies shouldn’t wait to see how the big data field shakes out. “Don’t think about waiting to get into big data,” he urged, “because it’s only going to get bigger. You’ve got to start getting in front of it now.” 

GE Software’s Bill Ruh Talks about Fundamental Changes Big Data is Bringing to Industry

GE may not be the first brand you think of when it comes to big data. But the industrial stalwart—a producer of turbines, locomotives, and other massive machines—is putting considerable investment into analyzing how big data is changing the way that GE, its customers, and others, operate.

Bill Ruh, vice president of GE Software, described the changes big data is bringing to industry as “the most interesting thing happening to our machines in a long time.”

Previously, data collected by large industrial machines was limited. In addition, much of it was discarded as useless. But the advent of cheap sensors has made possible an explosion of data collection. And industry must now evaluate how to handle it, how much and what types of data to save, and how to develop useful analytics to prevent data from becoming “a boat anchor.”

Speaking at Rock Stars of Big Data at the Computer History Museum, Ruh said developing quality analytics is the biggest challenge. “The ability to capture this opportunity and do something with it is dependent on your ability to develop useful analytics,” he said.

As recently as five years ago, jet engines had as few as two sensors to gather information on key metrics such as average takeoff, cruise, and landing data. Now that number is likely to exceed 20, producing 100 terabytes of data per day from every aircraft. And in the next generation of engines, the number of sensors is expected to increase by an order of magnitude.

“When you look at the kinds of things you can begin to measure, it’s starting to change things about the way you operate an aircraft. Every product we have is producing more data than we were prepared to consume,” Ruh said.

Understanding the implications of the changes big data is bringing is a work in progress. “We’re being driven by a change that is happening at a rate that we didn’t predict several years ago,” said Ruh.

The right sensors are important, he said, as is understanding the physics of those sensors.

Some interesting startups are emerging to seize the opportunities and tackle the challenges. But there remain many questions to be answered. “We’re not sure how everybody’s going to make money. We’re not sure how customers are going to consume this. We’re trying to learn about what this is going to mean for the industries we serve.”

There is one thing most people agree on, said Ruh: “It does mean that there is going to be transformational change.”
