Cost Optimization

Reduce AI costs without sacrificing quality through intelligent optimization strategies.

Optimization Strategies

1. Model Selection

Use the right model for each task:

GPT-4 for complex reasoning
GPT-3.5 for simple tasks
Smaller models for classification
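
The mapping above can be expressed as a small routing table. This is only a sketch: `TaskKind`, `MODEL_FOR_TASK`, and `pick_model` are hypothetical names introduced for illustration, and `small-classifier` is a placeholder rather than a real model ID.

```python
from enum import Enum


class TaskKind(Enum):
    COMPLEX_REASONING = "complex_reasoning"
    SIMPLE = "simple"
    CLASSIFICATION = "classification"


# Hypothetical routing table; substitute the model identifiers your provider exposes.
MODEL_FOR_TASK = {
    TaskKind.COMPLEX_REASONING: "gpt-4",
    TaskKind.SIMPLE: "gpt-3.5-turbo",
    TaskKind.CLASSIFICATION: "small-classifier",  # placeholder, not a real model ID
}


def pick_model(kind: TaskKind) -> str:
    """Return the model assigned to this kind of task."""
    return MODEL_FOR_TASK[kind]
```

Keeping the choice in one table makes it easy to move a task type to a cheaper model later without touching call sites.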

2. Response Caching

Eliminate redundant LLM calls by caching responses for identical inputs. When your OpenAI calls are wrapped with observeOpenAI, every call is traced automatically, so you can measure cache hit rates and the spend you avoid in the dashboard at https://app.agenticants.ai.
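
As an illustration of the idea in Python, the sketch below caches responses in memory, keyed on the exact model and prompt. It does not use the platform SDK: `ResponseCache` and the `call_model` callable are hypothetical, and `call_model` stands in for whatever function performs the real LLM request.

```python
import hashlib
import json
from typing import Callable, Dict


class ResponseCache:
    """In-memory cache keyed on the exact (model, prompt) pair."""

    def __init__(self, call_model: Callable[[str, str], str]) -> None:
        # call_model is a hypothetical function that performs the real LLM call.
        self._call_model = call_model
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, prompt: str) -> str:
        # Serialize deterministically so identical inputs always hash to the same key.
        payload = json.dumps({"model": model, "prompt": prompt}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def complete(self, model: str, prompt: str) -> str:
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: pay for one real call, then remember the answer.
        self.misses += 1
        response = self._call_model(model, prompt)
        self._store[key] = response
        return response

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

Exact-match caching of this kind fits best when the same input should always produce the same answer, for example with deterministic sampling settings. A production setup would likely also want expiry and a shared store rather than a per-process dictionary.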

3. Prompt Optimization

Shorter prompts mean lower costs, because usage is billed per token:

Remove unnecessary context
Use concise instructions
Optimize system messages
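
To get a rough sense of what a trim saves, a sketch like the following can help. It assumes about four characters per token, which is only a rule of thumb for English text; use your provider's tokenizer for exact counts. All function names here are hypothetical.

```python
import re

# Rough rule of thumb for English text; not a substitute for a real tokenizer.
CHARS_PER_TOKEN = 4


def normalize_prompt(prompt: str) -> str:
    """Collapse runs of whitespace into single spaces and trim the ends."""
    return re.sub(r"\s+", " ", prompt).strip()


def estimate_tokens(text: str) -> int:
    """Approximate the token count from character length (rounded up)."""
    return -(-len(text) // CHARS_PER_TOKEN)


def estimated_saving(before: str, after: str) -> int:
    """Approximate number of tokens saved by replacing `before` with `after`."""
    return estimate_tokens(before) - estimate_tokens(after)
```

Whitespace collapsing is safe for plain instructions, but skip it for prompts that embed code or other layout-sensitive text.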

4. Smart Sampling

Don't trace everything. Wrap only the requests you want to observe so you keep ingestion volume (and cost) down:

python

from ants_platform import get_client

# Configured via ANTS_PLATFORM_PUBLIC_KEY / ANTS_PLATFORM_SECRET_KEY / ANTS_PLATFORM_HOST
client = get_client()

def handle(request):
    if should_trace(request):
        with client.start_as_current_span(name="handle-request") as span:
            span.update(input=request)
            result = run_model(request)
            span.update(output=result)
            return result
    # Skip tracing for sampled-out requests
    return run_model(request)
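
The snippet above leaves `should_trace` to the application. One possible implementation, shown here as a sketch, is deterministic hash-based sampling. It assumes each request is a dict with an `"id"` field, and the 10% `SAMPLE_RATE` is a hypothetical value, not a platform default.

```python
import hashlib

SAMPLE_RATE = 0.1  # hypothetical: trace roughly 10% of requests


def should_trace(request: dict, sample_rate: float = SAMPLE_RATE) -> bool:
    """Decide deterministically whether to trace a request.

    Assumes `request` is a dict with an "id" field; adapt to your request type.
    """
    digest = hashlib.sha256(str(request["id"]).encode("utf-8")).digest()
    # Interpret the first 8 bytes as an integer bucket in [0, 2**64).
    bucket = int.from_bytes(digest[:8], "big")
    # Trace when the bucket falls in the lowest `sample_rate` fraction of the range.
    return bucket < int(sample_rate * 2**64)
```

Hashing the request ID, rather than drawing a random number, means a given request gets the same decision on every service that sees it, so sampled traces stay complete.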

Configure the client explicitly when not relying on env vars: