Tamr, Inc. has announced it is bringing scalable data preparation to Google Cloud Platform via its integration with Google Cloud Dataflow, which was opened for general availability today by Google. Running within Google Cloud Dataflow, the Tamr system will enable business analysts to independently build and publish new datasets across the enterprise. Google Cloud Dataflow is a simple, flexible and powerful service that can be used to perform data processing tasks of any size.

The Tamr integration with Google Cloud Dataflow will give business analysts a quick way to find, gather and format data from across diverse data silos, whether located on premises or in the cloud, without much coding or direct IT support. By using Google Cloud Dataflow to publish the resulting datasets, analysts will be able to move and transform even massive datasets efficiently.

“Google Cloud Platform is tailor-made for the kinds of highly distributed big data analytics and applications customers want to deploy today,” said Andy Palmer, Tamr co-founder and CEO. “Tamr with Google Cloud Dataflow breaks the final logjam, by letting business analysts independently build and share new, high-quality datasets. This creates a real multiplier effect for data-driven enterprises.”

“Enterprises are increasingly combining operational data with a nearly limitless pool of external data to derive valuable business insights. Tamr goes straight to the core of solving the difficult problems in this space today, merging proven techniques for understanding enterprise data with the leading-edge capabilities delivered by modern cloud platforms,” said Meredith Knowles, director, Partner Development, for Cloud Technology Partners, a leading cloud professional services firm.

The Tamr solution:

  • Allows analysts to find and access raw and unified data, resolve quality problems, apply standard and custom formats, and enrich the data via “fuzzy” joins and aggregation. A WYSIWYG interface provides interactive visual feedback and push-button publishing to applications such as Google BigQuery and Excel, and enables transparent data sharing and collaboration.
  • Eliminates the need for analysts to know everything about the source data in order to find and use it. Tamr’s core technology, machine learning with human guidance, lets analysts build up this knowledge by “expert sourcing” from the people who know the data. In addition, business analysts can take advantage of existing transformations written by IT to clean the data so they don’t have to do this themselves.
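
For readers unfamiliar with the term, a “fuzzy” join matches records whose keys are similar rather than identical. The sketch below is purely illustrative and is not Tamr's implementation or API: it uses Python's standard-library `difflib`, and the function name `fuzzy_join`, the sample records, and the 0.8 similarity cutoff are all assumptions made for the example.

```python
import difflib

def fuzzy_join(left, right, key, cutoff=0.8):
    """Pair each left record with its closest right record on `key`.

    Illustrative only: matching is case-insensitive string similarity,
    and records with no match above `cutoff` are dropped.
    """
    # Index right-hand records by their lowercased key value.
    right_by_key = {rec[key].lower(): rec for rec in right}
    pairs = []
    for rec in left:
        # Ask difflib for the single best candidate above the cutoff.
        best = difflib.get_close_matches(
            rec[key].lower(), list(right_by_key), n=1, cutoff=cutoff
        )
        if best:
            pairs.append((rec, right_by_key[best[0]]))
    return pairs

# Hypothetical records from two silos that spell the same customer differently.
crm = [{"name": "Acme Corp.", "region": "East"},
       {"name": "Globex", "region": "West"}]
billing = [{"name": "ACME Corp", "balance": 1200},
           {"name": "Initech", "balance": 300}]

pairs = fuzzy_join(crm, billing, "name")
```

Here “Acme Corp.” and “ACME Corp” are close enough to be paired, while “Globex” has no sufficiently similar counterpart and is left out. A production system would likely use richer similarity measures and human review of uncertain matches.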

“The end result is faster time to value for business analytics and a continuously growing inventory of high-quality, unified data throughout the organization,” said Palmer.

This article was originally posted as “Tamr Brings Scalable Data Preparation To The Cloud Via Google Cloud Platform” in Cloud Strategy Magazine.