# Tuning Omni AI for cost and quality

> Balance AI cost and quality for your organization by combining ai_settings parameters into profiles that fit your use cases.

export const categoryIcons = {
  'administration': 'lock',
  'api': 'terminal',
  'connections': 'database',
  'dashboards': 'table-columns',
  'embed': 'code',
  'errors': 'exclamation',
  'modeling': 'wrench',
  'patterns': 'plus',
  'schedules & alerts': 'envelope',
  'visualizations': 'chart-column',
  'workbooks': 'book'
};

export const GuideSidebar = ({category, relatedLinks, updatedDate}) => {
  const [progress, setProgress] = React.useState(0);
  React.useEffect(() => {
    const sidebar = document.querySelector('.guide-sidebar');
    if (!sidebar) return;
    let container = sidebar.parentElement;
    while (container && !container.querySelector('.guide-header')) {
      container = container.parentElement;
    }
    if (container && !container.classList.contains('guide-page-layout')) {
      container.classList.add('guide-page-layout');
    }
  }, []);
  React.useEffect(() => {
    const handleScroll = () => {
      const scrollTop = window.scrollY;
      const docHeight = document.documentElement.scrollHeight - window.innerHeight;
      const scrollPercent = docHeight > 0 ? scrollTop / docHeight * 100 : 0;
      setProgress(Math.min(100, Math.max(0, scrollPercent)));
    };
    window.addEventListener('scroll', handleScroll, {
      passive: true
    });
    handleScroll();
    return () => window.removeEventListener('scroll', handleScroll);
  }, []);
  const icon = category ? categoryIcons[category.toLowerCase()] || 'book' : 'book';
  return <aside className="guide-sidebar">
      <div className="guide-sidebar-content">
        <a href="/guides" className="guide-sidebar-back">
          <Icon icon="arrow-left" iconType="solid" size={14} />
          <span>All guides</span>
        </a>

        <div className="guide-sidebar-section">
          <div className="guide-sidebar-label">Progress</div>
          <div className="guide-sidebar-progress">
            <div className="guide-mascot">
              <svg viewBox="0 0 450 450" width="48" height="48">
                <defs>
                  <clipPath id="progressClip">
                    <rect x="0" y={450 - progress * 4.5} width="450" height={progress * 4.5} />
                  </clipPath>
                  <linearGradient id="blobbyGradient" x1="55.9753" y1="0" x2="492.197" y2="169.724" gradientUnits="userSpaceOnUse">
                    <stop stopColor="#BCA2F3" />
                    <stop offset="0.572917" stopColor="#FF7AA2" />
                    <stop offset="1" stopColor="#F3D4A2" />
                  </linearGradient>
                </defs>

                {}
                <circle cx="223.901" cy="223.901" r="213.901" transform="matrix(-0.999988 -0.0049013 0.00491945 -0.999988 447.797 449.992)" fill="#FAFAFA" stroke="#480B38" strokeWidth="20" />

                {}
                <circle cx="223.901" cy="223.901" r="213.901" transform="matrix(-0.999988 -0.0049013 0.00491945 -0.999988 447.797 449.992)" fill="url(#blobbyGradient)" stroke="#480B38" strokeWidth="20" clipPath="url(#progressClip)" />

                {}
                <path d="M310.41 195.084C310.41 200.052 301.362 212.472 284.328 212.472C266.585 212.472 258.246 201.294 258.246 195.912" stroke="#480B38" strokeWidth="17.3883" strokeMiterlimit="1.33344" strokeLinecap="round" />
                <circle cx="21.168" cy="21.168" r="21.168" transform="matrix(-1 0 0 1 388.658 169.001)" fill="#480B38" />
                <circle cx="21.168" cy="21.168" r="21.168" transform="matrix(-1 0 0 1 223.467 169.001)" fill="#480B38" />
              </svg>
            </div>
            <span className="guide-sidebar-progress-text">{Math.round(progress)}%</span>
          </div>
        </div>

        {category && <div className="guide-sidebar-section">
            <div className="guide-sidebar-label">Category</div>
            <div className="guide-sidebar-category">
              <Icon icon={icon} iconType="solid" size={14} />
              <span>{category}</span>
            </div>
          </div>}

        {updatedDate && <div className="guide-sidebar-section">
            <div className="guide-sidebar-label">Last updated</div>
            <div className="guide-sidebar-date">{updatedDate}</div>
          </div>}

        {relatedLinks && relatedLinks.length > 0 && <div className="guide-sidebar-section">
            <div className="guide-sidebar-label">Related</div>
            <ul className="guide-sidebar-links">
              {relatedLinks.map((link, index) => <li key={index}>
                  <a href={link.href}>{link.title}</a>
                </li>)}
            </ul>
          </div>}
      </div>
    </aside>;
};

export const GuideTitle = ({title}) => {
  return <div className="guide-header">
      <h1 className="guide-title">{title}</h1>
    </div>;
};

<GuideSidebar
  categoryIcons={categoryIcons}
  category="ai"
  updatedDate="April 2026"
  relatedLinks={[
{ title: "Improving AI answer quality", href: "/guides/ai/improve-ai-answer-quality" },
{ title: "External AI context", href: "/integrations/ai" },
{ title: "AI settings reference", href: "/modeling/models/ai-settings" },
{ title: "AI credit tracking", href: "/administration/token-tracking" },
]}
/>

<GuideTitle title="Tuning Omni AI for cost and quality" />

The behavior of Omni's Agent on a model is controlled by the [`ai_settings`](/modeling/models/ai-settings) parameter. Several of its sub-parameters compose into a spectrum between "cheap and fast" and "accurate but expensive" — which means you can deliberately tune a model for the workloads it actually serves, rather than accepting defaults everywhere.

This guide maps those knobs to three starting profiles — cost-optimized, balanced, and max-quality — that you can adopt directly or adapt to your organization.

## Requirements

To follow this guide, you'll need:

* **Connection Admin** or **Modeler** permissions on the model you want to work with
* An understanding of the parameters documented on the [`ai_settings` reference](/modeling/models/ai-settings)

## How AI settings affect cost

Every AI turn in Omni costs LLM tokens. Different `ai_settings` parameters affect token usage in different ways, and some affect cost more than others.

<table>
  <colgroup>
    <col style={{ width: '340px' }} />

    <col />
  </colgroup>

  <thead>
    <tr>
      <th>Parameter</th>
      <th>Cost</th>
    </tr>
  </thead>

  <tbody>
    <tr>
      <td style={{ whiteSpace: 'nowrap' }}>[`analyze_configuration.model`][analyze-config]</td>
      <td>**High.** `smartest` is multiple times more expensive per LLM token than `standard`.</td>
    </tr>

    <tr>
      <td style={{ whiteSpace: 'nowrap' }}>[`analyze_configuration.thinking`][analyze-config]</td>
      <td>**High.** Each level adds reasoning LLM tokens to every turn; `high` on `analyze` compounds fastest.</td>
    </tr>

    <tr>
      <td style={{ whiteSpace: 'nowrap' }}>[`validate_analysis`][validate-analysis]</td>
      <td>**High.** Adds extra model turns per analytical turn, each at the analyze tier and thinking level.</td>
    </tr>

    <tr>
      <td style={{ whiteSpace: 'nowrap' }}>[`conversation_prune_length`][conversation-prune]</td>
      <td>**Medium. Claude/Anthropic models only.** Higher thresholds mean more LLM tokens billed per turn on long conversations.</td>
    </tr>

    <tr>
      <td style={{ whiteSpace: 'nowrap' }}>[`query_all_views_and_fields`][query-views-fields]</td>
      <td>**Low–medium.** A larger search surface means more views considered per query, which costs LLM tokens.</td>
    </tr>

    <tr>
      <td style={{ whiteSpace: 'nowrap' }}>[`build_configuration.*`][build-config]</td>
      <td>**Low in aggregate.** Model-building tasks run less often than analysis.</td>
    </tr>

    <tr>
      <td style={{ whiteSpace: 'nowrap' }}>[`simple_summarize_configuration.*`][summarize-config]</td>
      <td>**Low per call, but high volume.** Keep at `fastest` / `none` unless you have a specific reason to upgrade.</td>
    </tr>
  </tbody>
</table>

## Tuning profiles

Each profile below is a recommended starting point — not a rigid recipe. You can mix parameters across profiles and override settings on specific models when individual use cases warrant it. See [Mix and override profiles](#mix-and-override-profiles) below.

| Profile                           | Best for                                                                                                                | Tradeoffs                                                                                           |
| --------------------------------- | ----------------------------------------------------------------------------------------------------------------------- | --------------------------------------------------------------------------------------------------- |
| [Cost-optimized](#cost-optimized) | High-volume AI usage, recurring questions against a well-curated topic set                                              | Smaller query surface, no query validation, earlier context pruning, no extended reasoning          |
| [Balanced](#balanced)             | Most organizations, most of the time. Mirrors Omni's default behavior.                                                  |                                                                                                     |
| [Max quality](#max-quality)       | Highly complex questions and datasets, optimizing for depth over speed, debugging accuracy issues with default settings | Higher cost from query validation and from `smartest` models with `thinking: high`                  |

<Note>
  The [`conversation_prune_length`][conversation-prune] parameter is only supported for Claude/Anthropic models.
</Note>

<h3 id="cost-optimized">
  Cost-optimized
</h3>

This profile is best for high-volume AI usage where per-query cost matters more than marginal accuracy, or for models where end users do bounded, familiar analysis, such as recurring questions over a well-curated topic set.

```yaml Cost-optimized ai_settings profile theme={null}
ai_settings:
  query_all_views_and_fields: disabled
  validate_analysis: disabled
  conversation_prune_length: short
  analyze_configuration:
    model: standard
    thinking: none
  build_configuration:
    model: standard
    thinking: none
  simple_summarize_configuration:
    model: fastest
    thinking: none
```

The tradeoffs of this approach are:

* A smaller query surface (topic-scoped only), so ad-hoc views aren't available to the AI ([`query_all_views_and_fields: disabled`][query-views-fields])
* No self-correction before users see results ([`validate_analysis: disabled`][validate-analysis])
* Earlier context pruning on long chats (75,000 LLM tokens vs. 175,000 at the default) ([`conversation_prune_length: short`][conversation-prune])
* No extended reasoning — every turn responds from the raw prompt (`*.thinking: none`)

<h3 id="balanced">
  Balanced (default)
</h3>

This profile fits most organizations, most of the time. It mirrors Omni's out-of-the-box behavior: it's what you get if you don't set `ai_settings` at all. You can copy it into your model to make the intent explicit, or omit `ai_settings` entirely.

```yaml Balanced (default) ai_settings profile theme={null}
ai_settings:
  query_all_views_and_fields: enabled
  validate_analysis: disabled
  conversation_prune_length: max
  analyze_configuration:
    model: standard
    thinking: none
  build_configuration:
    model: smartest
    thinking: none
  simple_summarize_configuration:
    model: fastest
    thinking: none
```

This approach is the default because:

* [`analyze_configuration.model: standard`][analyze-config] is strong enough for most analytical tasks without `smartest`-tier pricing
* [`simple_summarize_configuration.model: fastest`][summarize-config] is almost always the right call for short summarization work
* [`conversation_prune_length: max`][conversation-prune] preserves context while opting you into future >200k context sizes as they become available

<Note>
  If your instance was created on or before March 5, 2026, [`query_all_views_and_fields`][query-views-fields] defaults to `disabled` rather than `enabled`. Setting it explicitly in your model overrides the instance default either way.

  Similarly, if your instance was created on or before April 23, 2026, [`build_configuration.model`][build-config] defaults to `standard` rather than `smartest`. Setting it explicitly in your model overrides the instance default.
</Note>
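
If you want a model to behave the same way regardless of when its instance was created, one option is to pin both date-dependent defaults explicitly. The following is a minimal sketch that uses only the values described in the note above and assumes every other `ai_settings` parameter is left at its instance default:

```yaml
ai_settings:
  # Defaults to `disabled` on instances created on or before March 5, 2026
  query_all_views_and_fields: enabled
  build_configuration:
    # Defaults to `standard` on instances created on or before April 23, 2026
    model: smartest
```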

<h3 id="max-quality">
  Max quality
</h3>

This profile is best for highly complex questions and datasets, and when you want to optimize for exploration depth and answer quality over speed. It can also be useful when you've seen accuracy issues with the defaults and want to raise the floor before debugging further.

```yaml Max quality ai_settings profile theme={null}
ai_settings:
  query_all_views_and_fields: enabled
  validate_analysis: enabled
  conversation_prune_length: max
  analyze_configuration:
    model: smartest
    thinking: high
  build_configuration:
    model: smartest
    thinking: medium
  simple_summarize_configuration:
    model: standard
    thinking: low
```

While this approach can yield higher quality results, there are a few tradeoffs:

* [`validate_analysis: enabled`][validate-analysis] adds extra model turns to each analytical turn
* `smartest` + `thinking: high` on [`analyze_configuration`][analyze-config] is the single biggest cost multiplier — only use it if the accuracy lift is worth it for your workload
* [Summarization][summarize-config] is intentionally kept lighter (`standard` / `low`) because it's high-volume and low-leverage; pushing it to `smartest` / `high` rarely changes subtitle or description output meaningfully

## Mix and override profiles

The profiles above are starting points — you don't have to adopt one in its entirety. A few common compositions:

* **Mostly default with quality insurance.** Start from [**Balanced**](#balanced) and flip [`validate_analysis`][validate-analysis] to `enabled`. This adds self-correction without upgrading model tier or thinking level, which is often the cheapest way to raise answer quality.

  ```yaml theme={null}
  ai_settings:
    validate_analysis: enabled
  ```

* **Max quality analysis, default everything else.** Upgrade only [`analyze_configuration`][analyze-config] and leave build and summarize at defaults. This concentrates spend where it matters most — user-facing analytical output — without paying for it on lower-value tasks.

  ```yaml theme={null}
  ai_settings:
    analyze_configuration:
      model: smartest
      thinking: high
  ```

* **Cost-optimized with fuller context.** Start from [**Cost-optimized**](#cost-optimized) but set [`conversation_prune_length: long`][conversation-prune] if your users tend to have long chat sessions where losing earlier context would cost more than the LLM tokens saved.

  ```yaml theme={null}
  ai_settings:
    query_all_views_and_fields: disabled
    validate_analysis: disabled
    conversation_prune_length: long
    analyze_configuration:
      model: standard
      thinking: none
    build_configuration:
      model: standard
      thinking: none
    simple_summarize_configuration:
      model: fastest
      thinking: none
  ```

<h2 id="apply-profile">
  Apply a profile to your model
</h2>

<Steps>
  <Step noAnchor>
    Navigate to the model IDE and open the model settings (`model`) file.
  </Step>

  <Step noAnchor>
    Add the `ai_settings` block from the profile you want to use to the model file.

    <Tip>
      **Not sure which profile to use?** Start with [**Balanced**](#balanced) and only adjust once you see specific cost or accuracy issues you want to address.
    </Tip>
  </Step>

  <Step noAnchor>
    Save your changes.
  </Step>

  <Step noAnchor>
    Promote the changes to the shared model.
  </Step>
</Steps>

## Monitor and iterate

After applying a profile, check the [**Analytics > Credit tracking** dashboard](/administration/token-tracking) periodically to see how your users are interacting with Omni AI. Look for:

* **Question volume and types** — Helps you judge whether the profile still fits the actual workload
* **Turn counts per conversation** — Long conversations interact with `conversation_prune_length`; if pruning is kicking in frequently, consider raising the threshold or investigating why conversations run long

Tuning is iterative. It's common to start balanced, upgrade a single dimension (such as `validate_analysis`), observe the change, and continue from there rather than jumping straight to max quality.
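
As an illustration of a single-dimension change, suppose a model on the Cost-optimized profile is pruning context more often than you'd like. A sketch of the adjustment might look like the following. It assumes `long` is the right next step for your conversation lengths and that the model uses a Claude/Anthropic model, since `conversation_prune_length` only applies there:

```yaml
ai_settings:
  # Raised from `short` so long chats keep more context before pruning
  conversation_prune_length: long
  # ...leave the rest of your existing ai_settings block unchanged
```

Changing one parameter at a time makes it easier to attribute any shift in cost or quality on the credit tracking dashboard to that change.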

## Next steps

* [AI credit tracking](/administration/token-tracking) - Track your organization's AI usage over time
* [`ai_settings` reference](/modeling/models/ai-settings) — Full parameter documentation
* [Optimize models for Omni AI](/modeling/develop/ai-optimization) — Curate the context Omni AI sees, which can help improve quality instead of making model-tier upgrades
* [Learn from conversation](/ai/learn-from-conversation) — Use real user questions to improve the model over time

[analyze-config]: /modeling/models/ai-settings#param-analyze-configuration

[build-config]: /modeling/models/ai-settings#param-build-configuration

[summarize-config]: /modeling/models/ai-settings#param-simple-summarize-configuration

[conversation-prune]: /modeling/models/ai-settings#param-conversation-prune-length

[validate-analysis]: /modeling/models/ai-settings#param-validate-analysis

[query-views-fields]: /modeling/models/ai-settings#param-query-all-views-and-fields
