dbt Integration
Overview
Omni’s integration with dbt (both dbt Core and Cloud) embraces the idea that models should be layered, and it should be easy to move between the BI layer and dbt.
These are the features the dbt integration supports (and more coming soon!):
- Schema refreshes to sync dbt changes
- Pull dbt context and metadata into Omni
- Author dbt models from Omni queries
- Dynamic schema switching
Steps for Connecting dbt to Omni
- In Omni, navigate to Settings > Connections.
- Select the connection you want to add dbt to.
- Within the selected connection, click on the dbt tab.
- Add your git repository configuration details:
- Git SSH URL can be found from your git provider (make sure to use the SSH one)
- Git branch is typically main or master
- Target schema is the schema that you previously configured dbt to use.
By default, dbt uses this schema as a prefix; for example, if your
warehouse has schemas
dbt
,dbt_team1
,dbt_team2
, then your Target Schema isdbt
.- If you're using dbt Cloud, you can find this under Deployment Environments > Production
Prod
> Settings > Deployment Credentials > Schema - If you're using dbt Core, you can often find this from the profiles.yml file under 'Schema' Note, for BigQuery, 'schema' is called 'dataset' instead.
- If you're using dbt Cloud, you can find this under Deployment Environments > Production
- Hit ‘Submit’ and Omni will provide you a public key.
-
Follow the instructions from dbt to add a deploy key to your git provider. If you want to use Omni’s ‘push to dbt’ integration, make sure you allow write access. You’ve now connected Omni to your dbt repo.
-
Back on the Omni connection page, hit ‘Refresh Schema.’ This will sync metadata from your dbt repository into your Omni model. Most of the information pulled in comes from the
schema.yml
files that live alongside your dbt models. Example dbt schema.yml file:Behind the scenes, we are compiling your dbt code and using the resulting dbt manifest, which includes information about the final schema names used, dependencies between models, and more.
-
dbt model descriptions will be added to the matching Omni view files. These are also exposed in Omni’s field picker
-
If metadata changes in dbt, you’ll need to refresh the schema again in Omni (either from the connection page, or the model) to resync. We will be automating this in the future.
Sync dbt changes with schema refresh
Schema refresh syncs downstream changes from dbt into your Omni model, pulling in field changes, new fields, new tables, etc. This eliminates the need to manually update the Omni model every time something changes in the layer below. To kick off a refresh, select 'Refresh Schema' from the IDE model menu, or use our API endpoint. Learn more about schema refreshes here.
Pull dbt context and metadata into
Note: When column descriptions differ between your dbt instance and your data warehouse's catalog, Omni prioritizes dbt's column descriptions.
Pulling metadata from dbt into Omni makes it easier for analytics engineers, analysts, and business users to share context about what fields and tables mean, ensuring they're using the right things in their queries (and making tracing logic throughout the data stack very easy!). Omni pulls the following information from dbt:
- dbt model and field descriptions are brought into the corresponding Omni view files. Descriptions also become viewable in the workbook field picker
- dbt SQL code is brought into the corresponding Omni view files. This makes it easy to trace logic between the tools
- dbt dependencies are also brought into the corresponding Omni view files
Author dbt models from Omni queries
Omni is great for both building analyses and data models. Every time you construct a query - select columns, aggregate, filter, add joins - you're creating the building blocks of a model. While not every query needs to be pushed down to dbt, when you do create a query where you want to promote the logic into dbt, you can easily do so from within Omni.
- Within a workbook tab, there is an option under the Model menu to ‘Push to dbt’. This will open a new pull request in your dbt git repository with the SQL used to generate the query.
- Once merged in dbt, the new model will show up inside Omni upon refreshing schema from the Omni IDE.
Dynamic schemas
Dynamic schemas allow users to switch the environment that their dbt integration is pointing to within Omni. Users can easily switch between development and production environments to test changes made in the model and how they impact content. Users can...
- Run the content validator while pointing at a dbt development schema to determine the possible impact ahead of merging changes.
- Create new queries or create visualizations against the dbt dev schema to validate the model changes work as expected.
To switch schemas, users must be in branch mode. To switch schemas, modify the Target Schema in the branch.
Getting Started with Dynamic Schemas and dbt Environments
When we hook up the dbt integration, Omni will create at least one omni_dbt schema. If you have schema overrides in place, are using custom schemas, or a generate_schema_name dbt macro, you can likely expect multiple omni_dbt*
variants corresponding to the config schemas that you have set up in dbt.
The intended use of the dbt integration, is that you build all content and modeling in Omni on the omni_dbt schemas, while using the ignored_schemas parameter in the model file to hide the physical schemas that the omni_dbt schemas are referencing. The benefit of building on these omni_dbt schemas is that we can point them to different dbt environments underneath the hood. These should be configured in Omni to mirror dbt (e.g. Production environment with a target name of “prod”, a development environment for each developer schema, etc.). This allows you to point Omni to production tables and schemas when in production, while also easily swapping to a development environment view to see how tables look in that context and how your associated content may be impacted. This can be done in a branch within the model, and will then point the omni_dbt schemas there.
The core difference here is that you build on just one set of tables (with the omni_dbt prefix), rather than building two copies of a dashboard (one on public, the other on a development schema) and you can swap across the two seamlessly within a branch to see how things change and eliminate the duplication of work.
As a result, to use the environment switching you have to be built on something dynamic (the omni_dbt*
schemas) to allow for that swapping under the hood rather than the static schemas in the db.
If you’ve already built a bit of content on your physical production schemas, you can run the dbt migrator. The dbt migrator will copy the existing logic you may have added in Omni from those physical schemas to the omni_dbt schemas, as well as replicate the relationships in the relationship file. Users will still need to update the model file with the corresponding tables, as well as use the content validator to cut over content successfully.
Common Troubleshooting:
I connected to dbt, but I don’t see the omni_dbt schemas / I see the omni_dbt schemas, but they are empty
When in a branch for a given environment, go to Model > Refresh Schema. If the schema folders are still empty, you should go and add a #
to a table in each schema and check again.
I see tables in my dbt schemas, but no dbt metadata
This is typically due to your dbt environment either not having the proper target schema name that aligns to dbt, or the proper target name for the environment.
I have a schema that has some tables from dbt and others from a different source. Why are these non-dbt tables in omni_dbt?
Essentially, since some parts of your schema are from dbt and others are not, we are picking up that dbt metadata exists in there, and then we are creating those omni_dbt*
schemas for all tables within there.
The way that this works, is if there are config level schemas, they get dropped into omni_dbt_<schema_name>
. If they don't have a config level schema, they just go into omni_dbt.
So, in this case we see there is dbt metadata in public, we grab the tables where there's the override and put them in their new omni_dbt_public schema, and then when we go to get the other source tables we notice that their dbt metadata is blank. This makes sense, because they didn't come from dbt and have no concept of it. So, since they still need a home, they get plopped into omni_dbt which is the default.