IBM, G.E. and Others Create Big Data Alliance - NYTimes.com

A key element of the big data business is getting what much of computer technology secretly craves: Normality.

On Tuesday, several companies involved in analyzing digital information announced a common set of standards for Hadoop, perhaps the most widespread framework for big data analysis.

The companies, including General Electric, Hortonworks, IBM, Pivotal and Verizon, said they would develop their products and services on a common core of Hadoop’s key components.

Common standards often follow early development of software and hardware. When more companies build on the same components, it usually helps with things like learning and certification, application development, and new products.

“What we’re seeing is the rise of algorithms in new customer engagement models,” said Paul Maritz, the chief executive of Pivotal, a company that builds software for other companies and offers products for online software development.

Hadoop is a method for distributing, managing and processing very large and often disparate amounts of data. It is open-source software, and comes out of research at Yahoo and Google, among other places.

Those two companies had businesses that involved collecting huge numbers of online clicks and other behavioral data, which made them among the first companies that had to handle big files of so-called unstructured data (as opposed to more conventional data, like payrolls). As more people, companies, and sensors move online, the need to handle unstructured data has spread from those two companies to everyone, and Hadoop has flourished.

The technology has been somewhat difficult to use, however, and there are concerns that the growing use of different kinds of Hadoop, even ones with slight variations, could slow down the market.

“This is consistent with moving the market along,” said Herb Cunitz, the president of Hortonworks, a major provider of Hadoop technology. “It’s an initiative everyone is welcome to join.”

Standards have historically been a way for big technology companies to gain an edge over the competition by ensuring their knowledge is put to maximum use. Open source was considered a way around that, as well as around the slowdowns caused by things like patent disputes.

Hadoop is getting to be big business. Hortonworks went public in December, and currently has a market capitalization just under $1 billion. Cloudera, the largest Hadoop vendor, took a huge funding round last March, including $740 million from Intel for an 18 percent stake.

Cloudera was notably absent from Tuesday’s announcement, which took place at a Pivotal event in San Francisco. Down the road in San Jose, Cloudera was participating in its own big data event.

Pivotal, a company primarily spun out of assets of EMC and VMware in 2013, also announced that it had revenue of over $100 million in 2014. Over $40 million of that, Mr. Maritz said, was subscription revenue from Pivotal’s big data analysis product.

General Electric has invested in a data analysis platform called Predix. In December it announced a partnership project with Japan’s SoftBank to sell the product in Japan. IBM has been selling Hadoop for several years, but has redoubled efforts as it styles itself as a cloud computing company.

Mr. Maritz said Cloudera was “looking” at the common standard. “They have been invited,” he said.

Mike Olson, co-founder and chief strategy officer of Cloudera, said his company thought the initiative was at minimum redundant. “We believe the Apache Software Foundation is where the discussion should take place,” he said, referring to the open-source group that supports Hadoop and many other open-source projects. “The open-source world is a level playing field.”

http://mobile.nytimes.com/blogs/bits/2015/02/17/ibm-g-e-and-others-create-big-data-alliance/