Google’s new cloud service eases data preparation for machine learning


One of the challenges that data scientists face when running machine learning workloads is getting raw information into a usable state before training can begin. Google unveiled a new cloud service Thursday aimed at easing that pain.

Google Cloud Dataprep will automatically detect data schemas, joins, and anomalies such as missing or duplicate values, without requiring coding. After that, it will help users build a set of rules for processing the information. Those rules are then built in Apache Beam format and can be run in products like Google’s Cloud Dataflow, which processes information as it’s loaded into services like the BigQuery data warehouse.
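To make the kind of anomaly described above concrete, here is a minimal sketch of flagging missing and duplicate values in a small table by hand. It is not Dataprep's implementation, which Google has not detailed here; the function name `profile_rows` and the sample rows are hypothetical and exist only for illustration.

```python
from collections import Counter


def profile_rows(rows, columns):
    """Count missing values per column and list fully duplicated rows.

    Hypothetical helper for illustration only; not part of any Google API.
    """
    missing = {col: 0 for col in columns}
    for row in rows:
        for col in columns:
            value = row.get(col)
            # Treat None and blank strings as missing.
            if value is None or (isinstance(value, str) and value.strip() == ""):
                missing[col] += 1

    # Two rows are duplicates when every listed column holds the same value.
    counts = Counter(tuple(row.get(col) for col in columns) for row in rows)
    duplicates = {key: n for key, n in counts.items() if n > 1}
    return {"missing": missing, "duplicates": duplicates}


# Made-up sample data.
rows = [
    {"city": "Austin", "temp_f": 71},
    {"city": "Austin", "temp_f": 71},
    {"city": "Boston", "temp_f": None},
    {"city": "", "temp_f": 64},
]
report = profile_rows(rows, ["city", "temp_f"])
print(report)
```

A service like the one described would presumably surface findings of this sort as suggestions, then let the user turn them into processing rules instead of writing such checks by hand.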

While Cloud Dataprep is built to prepare data for machine learning, the system also uses machine learning itself to try to determine which rules will be most useful for customers. As of Thursday, it’s available in private beta.

BigQuery is receiving a number of enhancements as well, including a new Commercial Datasets program that’s now available in public beta. It will let users take information from AccuWeather, Dow Jones, Xignite, HouseCanary, and Remine and directly feed it into BigQuery for further processing.
