https://cloud.google.com/data-fusion/

Smarter data integration for smarter analytics

Cloud Data Fusion is a fully managed, cloud-native data integration service that helps users efficiently build and manage ETL/ELT data pipelines. With a graphical interface and a broad open source library of preconfigured connectors and transformations, Cloud Data Fusion shifts an organization’s focus away from code and integration to insights and action.

Code-free deployment of data pipelines

Cloud Data Fusion features a visual point-and-click interface that enables the code-free development of ETL pipelines. When combined with its broad library of data transformation blueprints, Cloud Data Fusion empowers a self-service model of data integration that removes expertise-based bottlenecks and accelerates time to insight.

An open core, delivering hybrid and multi-cloud integration

Cloud Data Fusion is built on the open source project CDAP, and this open core ensures data pipeline portability for users. CDAP’s broad integration with on-premises and public cloud platforms gives Cloud Data Fusion users the ability to break down silos and deliver insights that were previously inaccessible.
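To make the idea of pipeline portability concrete, here is a minimal sketch, assuming a pipeline can be described as plain data (stages plus the connections between them) and moved between environments as text. The field names (`stages`, `connections`, `from`, `to`) and the helpers `validate`, `export_pipeline`, and `import_pipeline` are hypothetical and chosen for illustration; they are not the actual CDAP export schema or API.

```python
import json

# Hypothetical, simplified pipeline description. The field names are
# illustrative assumptions and are not the actual CDAP export schema.
pipeline = {
    "name": "orders_to_warehouse",
    "stages": [
        {"name": "read_orders", "type": "source"},
        {"name": "clean_orders", "type": "transform"},
        {"name": "write_orders", "type": "sink"},
    ],
    "connections": [
        {"from": "read_orders", "to": "clean_orders"},
        {"from": "clean_orders", "to": "write_orders"},
    ],
}


def validate(spec):
    """Check that every connection refers to a declared stage."""
    names = {stage["name"] for stage in spec["stages"]}
    for edge in spec["connections"]:
        if edge["from"] not in names or edge["to"] not in names:
            raise ValueError(f"connection references unknown stage: {edge}")
    return True


def export_pipeline(spec):
    """Serialize the description to text that can be moved between environments."""
    validate(spec)
    return json.dumps(spec, indent=2, sort_keys=True)


def import_pipeline(text):
    """Load a description exported elsewhere and re-validate it."""
    spec = json.loads(text)
    validate(spec)
    return spec


exported = export_pipeline(pipeline)
restored = import_pipeline(exported)
```

Because the description is plain data rather than code tied to one runtime, the same text could in principle be loaded wherever a compatible engine is available, which is the property the paragraph above refers to as portability.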

Get more from Google’s industry-leading big data tools

Cloud Data Fusion’s native integration with Google Cloud simplifies data security and ensures your data is immediately available for analysis. Whether you’re curating a data lake with Cloud Storage and Cloud Dataproc, moving data into BigQuery for data warehousing, or transforming data to land it in a relational store like Cloud Spanner, Cloud Data Fusion’s integration makes development and iteration fast and easy.

Robust data engineering through collaboration and standardization

Cloud Data Fusion offers both preconfigured transformations from an OSS library and the ability to create an internal library of custom connections and transformations that can be validated, shared, and reused across an organization. It lays the foundation for collaborative data engineering and improves productivity. That means less waiting for data engineers and, importantly, less sweating about code quality.

Built-in connectors to a variety of modern and legacy systems, code-free transformations, conditionals and pre/post processing, alerting and notifications, and error processing provide a comprehensive data integration experience.
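For readers less familiar with these building blocks, the following is a conceptual sketch in plain Python of what a transformation, a conditional, error processing, and a post-run notification do within a pipeline. It does not use Cloud Data Fusion or CDAP; the function names (`parse_amount`, `run_pipeline`), the `notify` callback, the `amount` field, and the threshold of 100 are all assumptions made for illustration.

```python
def parse_amount(record):
    """Transformation: convert the 'amount' field from text to a number."""
    return {**record, "amount": float(record["amount"])}


def run_pipeline(records, notify):
    """Run a tiny in-memory pipeline and return (large, small, errors)."""
    large, small, errors = [], [], []
    for record in records:
        try:
            row = parse_amount(record)
        except (KeyError, ValueError) as exc:
            # Error processing: keep the bad record and the reason, do not abort.
            errors.append({"record": record, "error": str(exc)})
            continue
        # Conditional: route each row by a (hypothetical) business rule.
        if row["amount"] >= 100:
            large.append(row)
        else:
            small.append(row)
    # Post-processing and alerting: report a summary once the run finishes.
    notify(f"processed={len(large) + len(small)} failed={len(errors)}")
    return large, small, errors


messages = []
large, small, errors = run_pipeline(
    [
        {"id": 1, "amount": "250.0"},
        {"id": 2, "amount": "12.5"},
        {"id": 3, "amount": "n/a"},
    ],
    notify=messages.append,
)
```

In this sketch the malformed third record is routed to `errors` instead of stopping the run, and `messages` receives a single summary line at the end. A visual pipeline tool expresses the same steps as configurable nodes rather than hand-written code.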

Cloud Data Fusion helps users build scalable, distributed data lakes on GCP by migrating data from siloed on-premises platforms. Customers can leverage the scale of the cloud to centralize their data and drive more value out of it. The self-service capabilities of Cloud Data Fusion increase process visibility and lower the overall cost of operational support.

Many users today want to establish a unified analytics environment across a myriad of expensive, on-premises data marts. Integrating data from all these sources using a wide range of disconnected tools and stop-gap measures creates data quality and security challenges. Cloud Data Fusion’s wide variety of connectors, visual interfaces, and abstractions centered on business logic help lower total cost of ownership (TCO), promote self-service and standardization, and reduce repetitive work.