> ## Documentation Index
>
> Fetch the complete documentation index at: https://docs.omni.co/llms.txt
>
> Use this file to discover all available pages before exploring further.

# Tuning Omni AI for cost and quality

> Balance AI cost and quality for your organization by combining ai_settings parameters into profiles that fit your use cases.

The behavior of Omni's Agent on a model is controlled by the [`ai_settings`](/modeling/models/ai-settings) parameter. Several of its sub-parameters compose into a spectrum between "cheap and fast" and "accurate but expensive", which means you can deliberately tune a model for the workloads it actually serves rather than accepting defaults everywhere.

This guide maps those knobs to three starting profiles (cost-optimized, balanced, and max-quality) that you can adopt directly or adapt to your organization.

## Requirements

To follow this guide, you'll need:

* **Connection Admin** or **Modeler** permissions on the model you want to work with
* An understanding of the parameters documented on the [`ai_settings` reference](/modeling/models/ai-settings)

## How AI settings affect cost

Every AI turn in Omni costs LLM tokens. Different `ai_settings` parameters affect token usage in different ways, and some affect cost more than others.

| Parameter | Cost |
| --- | --- |
| [`analyze_configuration.model`][analyze-config] | High. `smartest` is multiple times more expensive per LLM token than `standard`. |
| [`analyze_configuration.thinking`][analyze-config] | High. Each level adds reasoning LLM tokens to every turn; `high` on `analyze` compounds fastest. |
| [`validate_analysis`][validate-analysis] | High. Adds additional model turns per analytical turn, each at the analyze tier and thinking level. |
| [`conversation_prune_length`][conversation-prune] | Medium. Claude/Anthropic models only. Higher thresholds mean more LLM tokens billed per turn on long conversations. |
| [`query_all_views_and_fields`][query-views-fields] | Low–medium. A larger search surface means more views considered per query, which costs LLM tokens. |
| [`build_configuration.*`][build-config] | Low in aggregate. Model-building tasks run less often than analysis. |
| [`simple_summarize_configuration.*`][summarize-config] | Low per call, but high volume. Keep at `fastest` / `none` unless you have a specific reason to upgrade. |
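As a rough orientation, the sketch below annotates an `ai_settings` block with the cost ratings from the table above. The values shown are the Balanced defaults described later in this guide and are used here only for illustration; the comments paraphrase the table and are not additional guarantees about billing.

```yaml
ai_settings:
  # Low-medium: a larger search surface means more views considered per query
  query_all_views_and_fields: enabled
  # High when enabled: adds model turns per analytical turn
  validate_analysis: disabled
  # Medium: Claude/Anthropic models only; higher thresholds bill more tokens on long chats
  conversation_prune_length: max
  analyze_configuration:
    model: standard   # High: smartest costs multiple times more per token
    thinking: none    # High: each level adds reasoning tokens to every turn
  build_configuration:
    # Low in aggregate: model-building tasks run less often than analysis
    model: smartest
    thinking: none
  simple_summarize_configuration:
    # Low per call, but high volume
    model: fastest
    thinking: none
```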

## Tuning profiles

Each profile below is a recommended starting point, not a rigid recipe. You can mix parameters across profiles and override settings on specific models when individual use cases warrant it. See [Mix and override profiles](#mix-and-override-profiles) below.

| Profile | Best for | Tradeoffs |
| --- | --- | --- |
| [Cost-optimized](#cost-optimized) | High-volume AI usage, recurring questions against a well-curated topic set | Smaller query surface, no query validation, earlier context pruning, no extended reasoning |
| [Balanced](#balanced) | Most organizations, most of the time. Mirrors Omni's default behavior. | |
| [Max quality](#max-quality) | Highly complex questions and datasets, optimizing for depth over speed, debugging accuracy issues with default settings | Higher costs from query validation and from using the `smartest` model with `high` thinking |

The [`conversation_prune_length`][conversation-prune] parameter is only supported for Claude/Anthropic models.

### Cost-optimized

This profile is best for high-volume AI usage where per-query cost matters more than marginal accuracy, or for models where end users are doing bounded, familiar analysis, such as recurring questions over a well-curated topic set.

```yaml Cost-optimized ai_settings profile theme={null}
ai_settings:
  query_all_views_and_fields: disabled
  validate_analysis: disabled
  conversation_prune_length: short
  analyze_configuration:
    model: standard
    thinking: none
  build_configuration:
    model: standard
    thinking: none
  simple_summarize_configuration:
    model: fastest
    thinking: none
```

The tradeoffs of this approach are:

* A smaller query surface (topic-scoped only), so ad-hoc views aren't available to the AI ([`query_all_views_and_fields: disabled`][query-views-fields])
* No self-correction before users see results ([`validate_analysis: disabled`][validate-analysis])
* Earlier context pruning on long chats (75,000 LLM tokens vs. 175,000 at the default) ([`conversation_prune_length: short`][conversation-prune])
* No extended reasoning: every turn responds from the raw prompt (`*.thinking: none`)

### Balanced (default)

This profile is best for most organizations, most of the time. It mirrors Omni's out-of-the-box behavior: it's what you get if you don't set `ai_settings` at all. You can copy it into your model to make the intent explicit, or omit `ai_settings` entirely.

```yaml Balanced (default) ai_settings profile theme={null}
ai_settings:
  query_all_views_and_fields: enabled
  validate_analysis: disabled
  conversation_prune_length: max
  analyze_configuration:
    model: standard
    thinking: none
  build_configuration:
    model: smartest
    thinking: none
  simple_summarize_configuration:
    model: fastest
    thinking: none
```

This approach is the default because:

* The `standard` model tier is strong enough for most analytical tasks without `smartest`-tier pricing
* `simple_summarize_configuration.model: fastest` is almost always the right call for short summarization work
* [`conversation_prune_length: max`][conversation-prune] preserves context while opting you into future >200k context sizes as they become available

If your instance was created on or before March 5, 2026, [`query_all_views_and_fields`][query-views-fields] defaults to `disabled` rather than `enabled`. Setting it explicitly in your model overrides the instance default either way.

Similarly, if your instance was created on or before April 23, 2026, [`build_configuration.model`][build-config] defaults to `standard` rather than `smartest`. Setting it explicitly in your model overrides the instance default.
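Because these two defaults depend on when your instance was created, one way to remove the ambiguity is to pin just those parameters. This is a minimal sketch, assuming you want the current Balanced values regardless of instance age and that parameters you leave out keep their instance defaults:

```yaml
ai_settings:
  # Defaults to disabled on instances created on or before March 5, 2026
  query_all_views_and_fields: enabled
  build_configuration:
    # Defaults to standard on instances created on or before April 23, 2026
    model: smartest
    thinking: none
```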

### Max quality

This profile is best for highly complex questions and datasets, and when you want to optimize for exploration depth and answer quality over speed. It can also be useful when you've seen accuracy issues with the defaults and want to raise the floor before debugging further.

```yaml Max quality ai_settings profile theme={null}
ai_settings:
  query_all_views_and_fields: enabled
  validate_analysis: enabled
  conversation_prune_length: max
  analyze_configuration:
    model: smartest
    thinking: high
  build_configuration:
    model: smartest
    thinking: medium
  simple_summarize_configuration:
    model: standard
    thinking: low
```

While this approach can yield higher quality results, there are a few tradeoffs:

* [`validate_analysis: enabled`][validate-analysis] adds additional model turns per analytical turn
* `smartest` + `thinking: high` on [`analyze_configuration`][analyze-config] is the single biggest cost multiplier, so only use it if the accuracy lift is worth it for your workload
* [Summarization][summarize-config] is intentionally kept lighter (`standard` / `low`) because it's high-volume and low-leverage; pushing it to `smartest` / `high` rarely changes subtitle or description output meaningfully

## Mix and override profiles

The profiles above are starting points; you don't have to adopt one in its entirety. A few common compositions:

* **Mostly default with quality insurance.** Start from [**Balanced**](#balanced) and flip [`validate_analysis`][validate-analysis] to `enabled`. This adds self-correction without upgrading model tier or thinking level, which is often the cheapest way to raise answer quality.

  ```yaml theme={null}
  ai_settings:
    validate_analysis: enabled
  ```

* **Max quality analysis, default everything else.** Upgrade only [`analyze_configuration`][analyze-config] and leave build and summarize at defaults. This concentrates spend where it matters most (user-facing analytical output) without paying for it on lower-value tasks.

  ```yaml theme={null}
  ai_settings:
    analyze_configuration:
      model: smartest
      thinking: high
  ```

* **Cost-optimized with fuller context.** Start from [**Cost-optimized**](#cost-optimized) but set [`conversation_prune_length: long`][conversation-prune] if your users tend to have long chat sessions where losing earlier context would hurt more than the LLM token savings.

  ```yaml theme={null}
  ai_settings:
    query_all_views_and_fields: disabled
    validate_analysis: disabled
    conversation_prune_length: long
    analyze_configuration:
      model: standard
      thinking: none
    build_configuration:
      model: standard
      thinking: none
    simple_summarize_configuration:
      model: fastest
      thinking: none
  ```

## Apply a profile to your model

1. Navigate to the model IDE and open the model settings (`model`) file.
2. Add the `ai_settings` block from the profile you want to use to the model file.

   **Not sure which profile to use?** Start with [**Balanced**](#balanced) and only adjust once you see specific cost or accuracy issues you want to address.
3. Save your changes.
4. Promote the changes to the shared model.

## Monitor and iterate

After applying a profile, check the [**Analytics > Credit tracking** dashboard](/ai/settings/usage) periodically to see how your users are interacting with Omni AI. Look for:

* **Question volume and types**: Helps you judge whether the profile still fits the actual workload
* **Turn counts per conversation**: Long conversations interact with `conversation_prune_length`; if pruning is kicking in frequently, consider raising the threshold or investigating why conversations run long

Tuning is iterative. It's common to start balanced, upgrade a single dimension (such as `validate_analysis`), observe the change, and continue from there rather than jumping straight to max quality.
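If the dashboard suggests pruning is cutting into long conversations, one incremental change is to raise only the prune threshold and leave everything else alone. This is a minimal sketch, assuming the model currently uses `conversation_prune_length: short`; if your model already has a full `ai_settings` block, change only this line within it.

```yaml
ai_settings:
  # Raised from short; only applies to Claude/Anthropic models
  conversation_prune_length: long
```

Changing one parameter at a time makes it easier to attribute any shift in credit usage or answer quality to that change.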
## Next steps

* [AI credit tracking](/ai/settings/usage) - Track your organization's AI usage over time
* [`ai_settings` reference](/modeling/models/ai-settings) - Full parameter documentation
* [Optimize models for Omni AI](/modeling/develop/ai-optimization) - Curate the context Omni AI sees, which can help improve quality instead of making model-tier upgrades
* [Learn from conversation](/ai/learn-from-conversation) - Use real user questions to improve the model over time

[analyze-config]: /modeling/models/ai-settings#param-analyze-configuration
[build-config]: /modeling/models/ai-settings#param-build-configuration
[summarize-config]: /modeling/models/ai-settings#param-simple-summarize-configuration
[conversation-prune]: /modeling/models/ai-settings#param-conversation-prune-length
[validate-analysis]: /modeling/models/ai-settings#param-validate-analysis
[query-views-fields]: /modeling/models/ai-settings#param-query-all-views-and-fields