Optimizing query performance with aggregate awareness
Aggregate awareness enables you to optimize query performance by intelligently routing queries to pre-aggregated tables in your data warehouse. This allows end users to access data at different levels of granularity without needing to manually specify which table to query. Omni automatically determines the most efficient table to use based on the query’s structure.
Omni understands the level of aggregation required by a query and dynamically selects the appropriate table (granular or aggregated) in the model to retrieve the data.
Bring your own aggregates
You, the user, create the aggregated tables using your preferred transformation tool. This provides flexibility and control over the aggregation process. Omni then leverages these tables for optimized query performance.
Once you’ve identified the query you’d like to optimize, follow these steps to implement aggregate awareness:
Build an aggregate or rollup table within Omni or some other transformation layer such as dbt.If you do this outside of Omni and don’t see your table, refresh the schema to have Omni build it. You can also navigate to Model > Refresh schema in the model IDE to trigger a refresh for specific tables.
From the model IDE, add a materialized_query parameter to the optimized view to enable Omni to match the aggregate table to the underlying views.Note: The fields parameter your materialized query definition should be in the same order as the dimensions in the optimized table (see example below).
To demonstrate, let’s say you have a table with the following definition and you want to optimize the performance of the daily aggregations:Table with aggregation you want to optimize
In your transformation layer, you have defined the following materialized table, which has pre-aggregated the metrics daily:Pre-aggregated table definition
The last step is to add the materialized_query parameter so that Omni knows when to reference the preaggregated table.Implementation of aggregate awareness
Copy
Ask AI
materialized_query: # note that the order of fields below matches the order of the dimensions fields: [ order_items.created_at, order_items.sale_price_sum, order_items.count, order_items.user_id_count_distinct ] base_view: order_items
Verifying aggregate awareness Now, at query runtime, Omni will dynamically swap in the aggregated table for the underlying table if it contains all the necessary fields to execute the query. Because the aggregated table is smaller than the underlying table, queries running against it will be faster and more efficient.You can verify that aggregate awareness is working by inspecting the generated SQL using Omni’s SQL inspector. The SQL inspector will show whether Omni is querying the aggregate table or the granular table.Example of fallback to pre-aggregated view
Copy
Ask AI
-- Query rewritten to use materialized view "daily_sales".-- The original unoptimized SQL is commented-out below-- SELECT DATE_TRUNC('DAY', "created_at") AS "order_items.created_at[date]__raw",-- COALESCE(SUM("sale_price"), 0) AS "order_items.sale_price_sum",-- TO_CHAR(DATE_TRUNC('DAY', "created_at"), 'YYYY-MM-DD') AS "order_items.created_at[date]"-- FROM "order_items"-- GROUP BY 1-- ORDER BY 1 NULLS FIRST-- LIMIT 999SELECT DATE_TRUNC('DAY', "date") AS "order_items.created_at[date]__raw", COALESCE(SUM(CAST("total_sale_price" AS DOUBLE PRECISION)), 0) AS "order_items.sale_price_sum", TO_CHAR(DATE_TRUNC('DAY', "date"), 'YYYY-MM-DD') AS "order_items.created_at[date]"FROM ( SELECT date, total_sale_price, order_items_count, users_distinct_count FROM "daily_orders" ) AS "t"GROUP BY 1ORDER BY 1 NULLS FIRSTLIMIT 999
If you expect Omni to use the aggregated table but it’s not, consider the following:
Missing fields: Ensure that all the fields being queried are present in the aggregated table and correctly mapped in the materialized_query parameter.
Incompatible queries: Certain queries, such as count_distinct over a different level of aggregation than the aggregate table, may not be compatible with aggregate awareness.
Cache: If the query results are being retrieved from the cache, Omni may not need to rewrite the SQL.
Try clearing the cache to test if aggregate awareness is working as expected.
You can also test by setting the cache_policy to 0 so that it never uses cache.
If a query cannot be answered entirely by the materialized_query, aggregate awareness is not likely to be used on that query.