Skip to main content

Understanding table fan out

To demonstrate why symmetric aggregates are so useful, let’s start with explaining what table fan out means. Fan out occurs when a table containing metrics joins to a dimension table in a way that results in multiple matching rows, leading to inflated results during aggregation. This example uses two tables - customers and orders. Each row in the customers table represents a unique customer:
customer_idnameloyalty_points
1Bloberta Smith100
2Blobzilla Lee200
3Blobina Chen150
And each row in the orders table represents a unique order placed by a customer:
order_idcustomer_idorder_dateamount
10112025-03-25$50
10212025-03-26$30
10322025-03-27$80
10432025-03-28$20
10532025-03-29$40
Because one customer can have many orders, this relationship is one-to-many. The fan out of data occurs when the tables are joined together. The following table demonstrates what a joined table might look like - note that the customer’s loyalty_points are repeated for each order:
customer_idnameloyalty_pointsorder_idamount
1Bloberta Smith100101$50
1Bloberta Smith100102$30
2Blobzilla Lee200103$80
3Blobina Chen150104$20
3Blobina Chen150104$20
3Blobina Chen150105$40

Handling fan out from joins

If you were to sum the total loyalty_points for each customer from the join table, you’d end up overstating Bloberta’s and Blobina’s points by x the number of orders. To prevent incorrect results like these, it’s important to declare a primary_key for tables and, when defining joins, a relationship_type. Before Omni calculates an aggregation on a fanned out table, Omni will check the table’s primary key - the unique key per row - to ensure values aren’t double-counted in the aggregation. If Omni detects potential for fan out, it will prompt you to define primary keys within your views. To handle the example from the previous section, you could define the following in the model:
customers table
customer_id:
  primary_key: true
orders table
order_id:
  primary_key: true
Relationship between customers and orders
- join_from_view: customers
  join_to_view: orders
  join_type: always_left
  relationship_type: one_to_many
  on_sql: ${customers.customer_id} = ${orders.customer_id}

Handling fan out in materialized tables

Let’s say you pushed the logic to join customers and orders down into your transformations layer. As this is now a single materialized table, Omni no longer has the context of the relationship between the underlying models. In this scenario, you can leverage the aggregate_type and the custom_primary_key_sql parameters to tell Omni how to properly aggregate on nested fields:
Symmetric aggregates on a materialized table
dimensions:
  order_id:
    primary_key: true # Defines the table's primary key
  customer_id: {}
  loyalty_points: {}
  amount: {}

measures:
  loyalty_points_sum:
    sql: ${loyalty_points}
    aggregate_type: sum_distinct_on
    custom_primary_key_sql: ${customer_id} # Dedupe on customer_id
  amount_sum:
    sql: ${amount}
    aggregate_type: sum

Handling fan out from unnested columns

While similar to the join scenario, this example performs an unnest join on an array instead of joining two tables together. For example, this could occur when you have an orders table and an array of objects that represent the individual order items. You can leverage the custom_primary_key_sql_for_quick_aggs parameter to handle this scenario:
Symmetric aggregates on an unnested column
dimensions:
  order_id:
    primary_key: true # Defines the table's primary key
  order_items: {} # Repeated field
  order_item_price:
    parent_field: order_items
    nested_on_field: order_items
    custom_primary_key_sql_for_quick_aggs: ${order_item_id}
    sql: price

measures:
  total_price:
    sql: ${order_item_price}
    aggregate_type: sum_distinct_on
    custom_primary_key_sql: ${order_item_id}
Quick aggregates for which a primary key is applicable - such as SUM but not MIN or MAX - based on the order_item_price dimension will be assigned a custom primary key of order_item_id. In this example, that will apply to the total_price measure.