Skip to main content

Documentation Index

Fetch the complete documentation index at: https://docs.omni.co/llms.txt

Use this file to discover all available pages before exploring further.

Aggregate awareness lets you optimize query performance by routing queries to pre-aggregated tables in your data warehouse. End users can access data at different levels of granularity, as Omni will pick the most efficient table based on the query’s structure.

Requirements

To follow the steps in this guide, you’ll need Querier, Modeler, or Connection Admin permissions.

How it works

You build pre-aggregated (or rollup) tables in your preferred transformation tool, such as dbt, and register them with a materialized_query parameter on the aggregate’s view. This gives you full control over how the aggregation is defined and refreshed. At query runtime, Omni inspects the requested fields and rewrites the SQL to use the aggregate table when it contains everything the query needs. Because the aggregate is smaller than the base table, the rewritten query runs faster and uses fewer warehouse resources. The rewrite is automatic, meaning end users don’t need to choose which table to query. Omni uses the aggregate:
  • For single-table queries — All queried dimensions and measures are mapped in the aggregate’s materialized_query.fields.
  • For joined queries within a topic — All queried fields from the base view are mapped in the aggregate’s materialized_query.fields, and the aggregate also includes the join keys to the other views. Omni then joins the aggregate to the dimension views instead of the base table.
If the aggregate is missing any required field, including a join key, Omni falls back to the base tables.

Setting up aggregate awareness

Once you’ve identified the query you’d like to optimize:
1

Build an aggregate or rollup table

Build an aggregate or rollup table within Omni or some other transformation layer such as dbt.If you do this outside of Omni and don’t see your table, refresh the schema to have Omni build it.To demonstrate, let’s say you have a table with the following definition and you want to optimize the performance of the daily aggregations:
order_items table
dimensions:
  created_at: {}
  sale_price: {}
  user_id: {}

measures:
  count:
    aggregate_type: count
  sale_price_sum:
    aggregate_type: sum
    sql: ${sale_price}
  user_id_count_distinct:
    aggregate_type: count_distinct
In your transformation layer, you defined the following materialized table, which has pre-aggregated daily metrics:
Pre-aggregated table definition
dimensions:
  date: {}
  total_sale_price: {}
  order_items_count: {}
  user_distinct_count: {}
2

Add a materialized_query parameter

In Omni’s model IDE, add a materialized_query parameter to the optimized view to enable Omni to match the aggregate table to the underlying base views:
Pre-aggregated table daily_sales
materialized_query:
  fields:
    order_items.created_at: date
    order_items.sale_price_sum: total_sale_price
    order_items.count: order_items_count
    order_items.user_id_count_distinct: user_distinct_count
  base_view: order_items
The fields parameter maps each field from the base view to the corresponding column name in the aggregate table. You can also use the array format, though the object format shown here is recommended for clarity.
To apply aggregate awareness across joins in a topic, include the join keys to the other views in fields. See the materialized_query reference for an example.
3

Verify the setup

Verify that aggregate awareness is working by inspecting the generated SQL using Omni’s SQL inspector. The inspector will show whether Omni is querying the aggregate table or the granular table.
Query rewritten to use the aggregate table
-- Query rewritten to use materialized view "daily_sales".
-- The original unoptimized SQL is commented-out below
-- SELECT DATE_TRUNC('DAY', "created_at") AS "order_items.created_at[date]__raw",
--     COALESCE(SUM("sale_price"), 0) AS "order_items.sale_price_sum",
--     TO_CHAR(DATE_TRUNC('DAY', "created_at"), 'YYYY-MM-DD') AS "order_items.created_at[date]"
-- FROM "order_items"
-- GROUP BY 1
-- ORDER BY 1 NULLS FIRST
-- LIMIT 999

SELECT 
    DATE_TRUNC('DAY', "date") AS "order_items.created_at[date]__raw",
    COALESCE(SUM(CAST("total_sale_price" AS DOUBLE PRECISION)), 0) AS "order_items.sale_price_sum",
    TO_CHAR(DATE_TRUNC('DAY', "date"), 'YYYY-MM-DD') AS "order_items.created_at[date]"
FROM (  
  SELECT 
      date,
      total_sale_price,
      order_items_count,
      user_distinct_count
  FROM "daily_sales"
    ) AS "t"
GROUP BY 1
ORDER BY 1 NULLS FIRST
LIMIT 999

Troubleshooting

If you expect Omni to use the aggregated table but it’s not, consider the following:
  • Missing fields: Ensure that all the fields being queried are present in the aggregated table and correctly mapped in the materialized_query parameter.
  • Missing join keys: If your query joins the base view to other tables in a topic, the materialized view must include the join keys to those tables. Without the join keys, Omni cannot optimize queries that involve joins.
  • Incompatible queries: count_distinct over a different level of aggregation than the aggregate table cannot use aggregate awareness, because the count cannot be re-aggregated from the pre-aggregated values.
  • Cache: If the query results are being retrieved from the cache, Omni may not need to rewrite the SQL.
    • Try clearing the cache to test if aggregate awareness is working as expected.
    • You can also test by setting the cache_policy to 0 so that it never uses cache.
If a query cannot be answered entirely by the materialized_query, aggregate awareness is not likely to be used on that query.

Next steps