Overview
Dataform is a fully managed, serverless orchestration service in Google Cloud for building and operationalizing SQL-based data transformation pipelines in BigQuery. It gives data analysts and engineers a single environment in which to collaborate using software development practices such as version control, testing, and documentation. With Dataform's open-source, SQL-based language, users define tables, declare dependencies between them, add column descriptions, and write data quality assertions. Because Dataform derives the execution order from those declarations, analysts can build, test, and orchestrate complex pipelines using SQL, without hand-writing the surrounding orchestration logic. Dataform integrates with GitHub, GitLab, Cloud Composer, Workflows, and BigQuery Studio.
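To make the idea of declared dependencies concrete, the following is a minimal Python sketch of how a run order can be derived from a dependency graph. It is an illustration of the general technique (topological sorting), not Dataform's implementation or API. The table and assertion names are hypothetical, and the `execution_order` helper is an assumption introduced only for this example.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each action maps to the set of actions it depends on.
# In a real project these relationships would come from the table definitions;
# here they are written out by hand purely for illustration.
dependencies = {
    "raw_orders": set(),
    "raw_customers": set(),
    "stg_orders": {"raw_orders"},
    "stg_customers": {"raw_customers"},
    "customer_revenue": {"stg_orders", "stg_customers"},
    # A data quality assertion runs after the table it checks.
    "assert_customer_revenue_not_null": {"customer_revenue"},
}


def execution_order(deps):
    """Return one valid run order in which every action follows its dependencies."""
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

Several orderings can be valid for the same graph, so the sketch guarantees only that each action appears after everything it depends on. If the declarations contained a cycle, `static_order()` would raise `graphlib.CycleError` rather than return an order.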