Under Seattle’s Cloud, a Big Data Cluster Grows

It’s a good time to be doing big data in Seattle.

So says Ed Lazowska, the University of Washington computer science professor who played host and tour guide to the region’s big data lineup during the Washington Innovation Summit last week. He points to the region’s strengths in cloud computing and to a steady stream of big data achievers marching forth from the UW.

While Amazon (NASDAQ: AMZN) may be among the most recognizable of these players, the work being done here now is not just about finding a better way to get you to buy exactly what you didn’t know you always wanted. From startups and investors to giants like Microsoft (NASDAQ: MSFT) and Google (NASDAQ: GOOG), the Seattle-area big data cluster is at work on novel solutions to major problems in healthcare, transportation, energy, communication, and yes, commerce, too. They’re also building tools to democratize big data and improve upon Hadoop, an underlying piece of big data computing infrastructure.

Christian Chabot, co-founder and chief executive of Seattle-based Tableau Software, a big data visualization company, has a refreshing take on the opportunities emerging from all this investment. Essentially, big data could put a lot of people to work on meaningful things.

“One of the great tragedies of the modern technology industry is that a majority of the world’s most brilliant and talented people, many of them computer scientists, have spent the last 15 years working on projects that are primarily about getting people to click on more ads, or put more stuff in their shopping cart,” he says. “I just don’t find this very inspiring.”

Thankfully now, Chabot continues, this technology is finding its way into education, science, developing countries, and “every single industry you can name.”

Lazowska notes the work of Eric Horvitz at Microsoft Research toward helping doctors make better discharge decisions by analyzing thousands of factors and predicting which patients are most likely to be readmitted, and the efforts at P4 Medicine (“predictive, preventive, personalized, and participatory”) from the Institute for Systems Biology. There’s also INRIX, the traffic information provider; 3TIER, which parses loads of weather and climate data for the renewable energy industry; Decide, co-founded by UW computer science professor Oren Etzioni, which predicts consumer electronics pricing; and many other companies that have been built to sift through, and make some sense out of, big sets of data.

“It’s all driven by big data, and it’s all tied to the cloud,” Lazowska says. “And the Seattle area is really kind of the owner of the cloud.”

Opportunity for innovators and entrepreneurs abounds.

Bellevue, WA-based venture capital firm Ignition Partners has investments in big data companies including Splunk, Cloudera, Continuuity, and Couchbase. “And, we will be investing in more big data companies in the future,” founding partner Cameron Myhrvold says.

Myhrvold says Splunk, which helps companies capture and make sense of machine-generated data in the datacenter, has about 20 people in its Seattle office, which was opened to tap engineering expertise, particularly in developer tools and application program interfaces (APIs).

There’s a “critical cluster” forming in Seattle, says Ruben Ortega, who spent nine years at Amazon before moving to Google where he’s an engineering director. Companies small and large benefit from a pool of talented people who are “familiar with the nouns and the verbs” of big data. This growing pool of math and data-driven people realizes “that a petabyte is a relatively small amount of information when you’re computing against the exabytes,” Ortega says.

But the talent pool is only so large. Tableau has more than doubled its headcount in 2012 to 720, and as it continues hiring, has confronted “a massive, worldwide shortage of STEM talent,” Chabot says.

Chabot and Tableau’s other co-founders are Stanford grads. They saw no drawback to moving their company to the Pacific Northwest from Silicon Valley. While it was a personal decision at first, Chabot says basing Tableau in Seattle was “one of the best things for the business.”

Now, however, Tableau must look beyond the mountains, coffee, and salmon. “We have been forced, like any company before us that goes through a high-growth phase, to open offices in other locations if for no other reason to tap into other pools of talent,” Chabot says. Tableau has opened offices in Menlo Park, CA; Austin, TX; London; and Singapore.

UW helps refresh the Seattle pool.

About three quarters of the 30-person Decide team have a UW background, says Decide chief executive Mike Fridgen.

Lazowska notes that UW alumni have had a hand in two foundational big data systems: Hadoop, the open-source system for managing distributed computation across thousands of machines, and MapReduce, the Google system on which Hadoop is based.

There is wide acknowledgement that Hadoop needs improvement.

Google’s Ortega says that although it’s still “the workhorse within the company, even we find that difficult to use.”

Chabot took Hadoop to task for being “highly specialized,” and “understood by a small priesthood of people.” “As long as the big data revolution looks like that, I can tell you it’s not going to go very far,” he says.

Myhrvold says the difficulty of Hadoop creates an opening for big data startups to help companies use it. Ignition invested in Cloudera, one of several attacking this problem. (The company was co-founded by UW computer science alum and Gig Harbor, WA, product Christophe Bisciglia.)

Myhrvold says he sees the big data story unfolding in much the same way as previous technology trends. The movement is starting with core infrastructure companies, and is being followed by big data application providers. “There will be a phase after that that will focus on monitoring and performance management, systems management,” he says.

And, Myhrvold adds, the work is really just beginning.

“I think big data is going to give us at least a 15-year investment horizon,” Myhrvold says. “We’re probably in year three today.”

Photo by JD Hancock via Flickr.

[Note: This story was updated Dec. 4 at 8:39 p.m. PT to correct the spelling of Eric Horvitz’s surname.]
