The platform uses a layered architecture to separate concerns and keep the system evolvable as model providers change.
Logical Architecture
- Ingress: the API Gateway authenticates requests, rate-limits them, and validates them against a schema.
- Safety: prompt templates, redaction, safety classifiers, and policy enforcement are applied uniformly to every request.
- Runtime: orchestrates model calls, tool use, and retrieval; logs structured traces with costs and latencies.
- Registry: canonical store for prompts, datasets, eval results, and approvals.
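As an illustration of the Registry layer, the following is a minimal in-memory sketch, assuming append-only prompt versions and a per-version approval flag. The names `PromptVersion`, `Registry`, `publish`, `approve`, and `get_approved` are hypothetical and not part of the platform's actual API; a real store would be durable and would also cover datasets and eval results.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class PromptVersion:
    """One immutable version of a prompt template (hypothetical record shape)."""
    name: str
    version: int
    template: str
    approved_by: Optional[str] = None  # None means not yet approved


class Registry:
    """In-memory stand-in for the canonical store (assumption: a real one is durable)."""

    def __init__(self) -> None:
        self._prompts: Dict[Tuple[str, int], PromptVersion] = {}

    def publish(self, name: str, template: str) -> PromptVersion:
        # Versions are append-only: publishing never overwrites an earlier version.
        next_version = 1 + max(
            (v for (n, v) in self._prompts if n == name), default=0
        )
        entry = PromptVersion(name, next_version, template)
        self._prompts[(name, next_version)] = entry
        return entry

    def approve(self, name: str, version: int, reviewer: str) -> PromptVersion:
        # Record who approved this exact version; the template itself is unchanged.
        old = self._prompts[(name, version)]
        approved = PromptVersion(old.name, old.version, old.template, reviewer)
        self._prompts[(name, version)] = approved
        return approved

    def get_approved(self, name: str, version: int) -> PromptVersion:
        # Callers ask for an explicit version and only receive approved entries.
        entry = self._prompts[(name, version)]
        if entry.approved_by is None:
            raise PermissionError(f"{name} v{version} has not been approved")
        return entry
```

Requiring callers to name an explicit version keeps behavior reproducible: a newly published template has no effect until it is approved and a caller opts into it.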
Request Flow
- Client authenticates and submits a typed request to the Gateway.
- The Safety layer applies policies and guardrails, which may block or modify the request.
- Runtime executes the plan (RAG, tools, model inference), producing a result and a trace.
- Observability pipeline ships metrics, traces, and feedback for analysis.
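The flow above can be sketched as a single handler function. This is a simplified illustration under stated assumptions, not the platform's implementation: `Request`, `Trace`, `Blocked`, `handle`, and the two toy policies are hypothetical names, authentication is reduced to a token lookup, and the model call is an injected function.

```python
import time
from dataclasses import dataclass, field
from typing import Callable, List, Set


@dataclass
class Request:
    user_token: str
    prompt: str


@dataclass
class Trace:
    steps: List[str] = field(default_factory=list)
    latency_ms: float = 0.0


class Blocked(Exception):
    """Raised when a policy rejects the request outright."""


# A policy returns the (possibly modified) prompt, or raises Blocked.
Policy = Callable[[str], str]


def redact_emails(prompt: str) -> str:
    # Toy redaction: mask any whitespace-delimited token containing "@".
    return " ".join("[REDACTED]" if "@" in w else w for w in prompt.split())


def deny_secrets(prompt: str) -> str:
    # Toy blocking rule, for illustration only.
    if "password" in prompt.lower():
        raise Blocked("prompt mentions a password")
    return prompt


def handle(
    request: Request,
    valid_tokens: Set[str],
    policies: List[Policy],
    run_model: Callable[[str], str],
    emit: Callable[[Trace], None],
) -> str:
    trace = Trace()
    start = time.perf_counter()
    try:
        # 1. Gateway: authenticate the caller.
        if request.user_token not in valid_tokens:
            raise PermissionError("unknown token")
        trace.steps.append("authenticated")
        # 2. Safety: each policy may modify the prompt or block the request.
        prompt = request.prompt
        for policy in policies:
            prompt = policy(prompt)
        trace.steps.append("policies_applied")
        # 3. Runtime: execute the plan (here, a single model call).
        result = run_model(prompt)
        trace.steps.append("executed")
        return result
    except Blocked:
        trace.steps.append("blocked")
        raise
    finally:
        # 4. Observability: ship the trace whether the request succeeded or not.
        trace.latency_ms = (time.perf_counter() - start) * 1000
        emit(trace)
```

Emitting the trace in `finally` means blocked and failed requests are still visible to the observability pipeline, which is usually where the most interesting cases are.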
Design Principles
- Prefer explicit configuration and versioning over behind-the-scenes magic.
- Minimize coupling to specific model providers.
- Keep safety features on by default; disabling one requires an approved review.
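One way these principles could show up in code is sketched below. It assumes a structural `ModelProvider` interface, an explicit versioned `RouteConfig`, and a review identifier that must accompany any safety opt-out. All of these names, including `EchoProvider`, `resolve`, and `safety_waiver_review_id`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Protocol


class ModelProvider(Protocol):
    """Minimal provider-neutral interface; product code depends only on this."""

    def complete(self, prompt: str) -> str: ...


class EchoProvider:
    """Stand-in provider used for the example; a real adapter would wrap a vendor SDK."""

    def complete(self, prompt: str) -> str:
        return f"echo: {prompt}"


@dataclass(frozen=True)
class RouteConfig:
    """Explicit, versioned routing configuration (hypothetical shape)."""
    provider: str
    model_version: str
    safety_enabled: bool = True                    # on by default
    safety_waiver_review_id: Optional[str] = None  # required to opt out

    def __post_init__(self) -> None:
        # Opting out of safety is only valid when a review is referenced.
        if not self.safety_enabled and not self.safety_waiver_review_id:
            raise ValueError("disabling safety requires an approved review id")


def resolve(config: RouteConfig, providers: Dict[str, ModelProvider]) -> ModelProvider:
    # Swapping providers is a configuration change, not a code change.
    return providers[config.provider]
```

For example, `resolve(RouteConfig("echo", "2024-01"), {"echo": EchoProvider()})` returns the echo adapter, while `RouteConfig("echo", "2024-01", safety_enabled=False)` raises `ValueError` because no review is referenced.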
Caution
Do not call providers directly from product apps. Route every call through the Gateway so that policies, logging, and quotas are applied consistently.
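To make the quota point concrete: this document does not say how the Gateway enforces quotas, so the sketch below assumes a token bucket, which is one common choice. The class name `TokenBucket` and its parameters are hypothetical. A limiter like this only works if every call passes through one place, which is why direct provider calls undermine it.

```python
import time
from typing import Callable


class TokenBucket:
    """Per-client quota sketch: up to `capacity` calls in a burst, refilled over time."""

    def __init__(
        self,
        capacity: int,
        refill_per_second: float,
        clock: Callable[[], float] = time.monotonic,  # injectable for testing
    ) -> None:
        self.capacity = capacity
        self.refill_per_second = refill_per_second
        self._clock = clock
        self._tokens = float(capacity)  # start with a full bucket
        self._last = clock()

    def allow(self, cost: float = 1.0) -> bool:
        # Refill based on elapsed time, never exceeding capacity.
        now = self._clock()
        elapsed = now - self._last
        self._tokens = min(self.capacity, self._tokens + elapsed * self.refill_per_second)
        self._last = now
        # Spend tokens if the request fits within the remaining quota.
        if self._tokens >= cost:
            self._tokens -= cost
            return True
        return False
```

A Gateway following this approach would keep one bucket per client or API key and reject a request when `allow()` returns `False`.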