LLM Invoke Chain Pattern — K9-AIF Patterns

LLM Invoke Chain Pattern diagram — LLM Invoke Chain Pattern — structural diagram · click to enlarge

Motivation

Every agent that calls an LLM faces the same concerns: which model, what cost, what latency, is governance satisfied, has the decision been recorded? Solving this inline in every agent creates duplication, inconsistency, and ungoverned inference.

llm_invoke() is the single composition point. It routes through K9ModelRouter, resolves the model via LLMFactory, invokes the LLM, and persists the routing decision to the audit store — in one call. Agents are decoupled from all of it.

Structure

Agent calls llm_invoke(config, InferenceRequest)
ModelRouterFactory.get_router(config) returns cached K9ModelRouter
K9ModelRouter.route(request) scores catalog models and returns RouteDecision
LLMFactory.get(model_alias) returns cached OllamaLLM instance
OllamaLLM.invoke(prompt) calls the model and returns raw response
RoutingStateStore persists RouteDecision with latency and scores
InferenceResponse returned to agent

Key Concepts

llm_invoke() InferenceRequest InferenceResponse ModelRouterFactory K9ModelRouter LLMFactory BaseLLM RouteDecision RoutingStateStore

Used in K9-AIF

Every BaseAgent.execute() in K9-AIF All EOC agents All DoW agents K9ValidationLoopAgent.run_validation() K9CriticActorAgent.generate() and refine()

View implementation on GitHub ↗