Combine Looker and Dataform for the complete analytics solution
Looker has revolutionised the way data teams operate: rather than building one-off dashboards, they use LookML to define the relationships between datasets, enabling business teams to explore data directly. Dataform helps teams prepare the datasets that form the foundation of a successful Looker implementation.
Looker PDTs are a simple tool for transforming data. But for a growing team, or an organisation with complex data needs, PDTs are missing some key features: data tests, fine-grained scheduling, integrated SQL developer tools and incremental table builds. Dataform is the data modelling tool built for enterprise scale.
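Incremental builds, for example, are a one-line declaration in Dataform's SQLX. Here is a minimal sketch, assuming a hypothetical source table of raw events with an event_timestamp column:

```sqlx
config {
  type: "incremental"  // only new rows are processed on each run
}

select
  date(event_timestamp) as event_date,
  count(*) as event_count
from ${ref("events")}  -- ref() declares the dependency for the graph
${when(incremental(),
  `where date(event_timestamp) > (select max(event_date) from ${self()})`)}
group by 1
```

On the first run Dataform builds the full table; on subsequent runs the when(incremental(), ...) clause restricts the scan to rows newer than what the table already contains.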
With Dataform you can write queries in a SQL IDE, preview and inspect the results, and validate your SQL, all from a rich web interface. Queries are automatically deployed to BigQuery, creating tables and views that can be accessed in LookML.
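As a sketch of that workflow, a simple SQLX file (dataset and column names are hypothetical) that Dataform publishes to BigQuery as a queryable table:

```sqlx
config {
  type: "table",
  schema: "analytics",  // BigQuery dataset to publish into
  description: "One row per customer order, cleaned for reporting"
}

select
  order_id,
  customer_id,
  order_total
from ${ref("raw_orders")}  -- resolved to the full BigQuery table name at compile time
where order_total is not null
```

Once the run completes, LookML can reference the result like any other BigQuery table, e.g. with sql_table_name: analytics.orders if the file is named orders.sqlx.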
After defining your transformations in Dataform, add a few extra lines of code to ensure your datasets are tested by Dataform’s automated CI/CD platform. Errors will be picked up before being released to your production environment; incorrect data making its way to Looker reports will be a thing of the past.
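Those extra lines are assertions in the table's config block. A minimal sketch, reusing the hypothetical orders model from above:

```sqlx
config {
  type: "table",
  assertions: {
    nonNull: ["order_id", "customer_id"],  // fail the run if any of these are null
    uniqueKey: ["order_id"],               // fail if order_id is duplicated
    rowConditions: ["order_total >= 0"]    // fail if any row violates this predicate
  }
}

select order_id, customer_id, order_total
from ${ref("raw_orders")}
```

Each assertion compiles to a query that returns offending rows; a non-empty result fails the run, so bad data never lands in the tables Looker reads.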
Being able to produce analytics tables whose output we're confident in (because of assertions), and that are as up to date as we need them to be (because of scheduling), makes our lives really easy. The UI is incredibly easy and intuitive to use, meaning we spend little of our time setting these things up and most of our time writing SQL!
I love the dependency tree in Dataform. For me this is a central place for sanity-checking my data flows, understanding if I'm reimplementing a dataset which already exists, and verifying logic. Secondly, I love SQLX for generating SQL of a similar structure again and again; it really speeds up development and lets you abstract away logic.
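To illustrate that kind of reuse, a minimal sketch (metric and table names are hypothetical) using a js block to stamp out the same aggregation pattern for several columns:

```sqlx
config { type: "table" }

js {
  // Hypothetical list of metrics that all follow the same shape.
  const metrics = ["clicks", "impressions", "conversions"];
}

select
  user_id,
  ${metrics.map(m => `sum(${m}) as total_${m}`).join(",\n  ")}
from ${ref("ad_events")}
group by user_id
```

Adding a fourth metric is a one-word change to the array rather than another copy-pasted sum().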
Having modeled data using other tools in the past, I've found this a much simpler and easier environment to code in. The code compiles in real time and lets you know if there are errors in the syntax. It also generates a dependency graph for the data pipeline, which is insanely useful.