Custom Middleware
This document introduces the middleware capabilities of the Xpert Agent and their plugin-based implementation, which is comparable to LangChain Middleware (see the official LangChain documentation: overview, built-in). With middleware, you can insert cross-cutting logic (logging, caching, rate limiting, security, prompt governance, and so on) at key points in the Agent lifecycle without modifying the core orchestration logic.
Core Concepts
- Strategy / Provider: Each middleware implements an IAgentMiddlewareStrategy, marked with @AgentMiddlewareStrategy('<provider>') and registered to AgentMiddlewareRegistry (plugin-sdk).
- Middleware Instance: The AgentMiddleware object returned by createMiddleware, containing state Schema, context Schema, optional tools, and lifecycle hooks.
- Node-based Integration: Connect WorkflowNodeTypeEnum.MIDDLEWARE nodes to Agents in the workflow graph. At runtime, they are loaded and executed in connection order via getAgentMiddlewares.
- Configuration & Metadata: TAgentMiddlewareMeta describes name, i18n labels, icon, description, and configuration Schema. The UI renders configuration panels accordingly.
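To make "executed in connection order" concrete, the sketch below runs the beforeModel hook of each middleware in list order and merges each returned patch into the state. The types SketchState and SketchMiddleware and the function runBeforeModel are hypothetical simplifications for illustration; they are not the plugin SDK's actual types or the implementation behind getAgentMiddlewares.

```typescript
// Hypothetical, simplified shapes; the real AgentMiddleware type is richer.
type SketchState = Record<string, unknown>;

interface SketchMiddleware {
  name: string;
  beforeModel?: (state: SketchState) => Promise<Partial<SketchState> | undefined>;
}

// Run every beforeModel hook in list order, merging each returned patch into the state.
async function runBeforeModel(
  middlewares: SketchMiddleware[],
  initial: SketchState,
): Promise<SketchState> {
  let state: SketchState = { ...initial };
  for (const mw of middlewares) {
    if (!mw.beforeModel) continue;
    const patch = await mw.beforeModel(state);
    if (patch) {
      state = { ...state, ...patch };
    }
  }
  return state;
}

// Two toy middlewares that record the order in which they ran.
const tagger: SketchMiddleware = {
  name: 'tagger',
  beforeModel: async (state) => ({
    trace: [...((state.trace as string[] | undefined) ?? []), 'tagger'],
  }),
};

const counter: SketchMiddleware = {
  name: 'counter',
  beforeModel: async (state) => ({
    trace: [...((state.trace as string[] | undefined) ?? []), 'counter'],
    calls: ((state.calls as number | undefined) ?? 0) + 1,
  }),
};
```

Calling runBeforeModel([tagger, counter], {}) resolves to a state whose trace lists tagger before counter, so swapping the list order changes what later hooks observe.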
Lifecycle Hooks & Capabilities
| Hook | Trigger Timing | Typical Use Cases |
|---|---|---|
| beforeAgent | Before Agent starts, triggered once | Initialize state, inject system prompts, fetch external context |
| beforeModel | Before each model call | Dynamically assemble messages/tools, truncate or compress context |
| afterModel | After model returns, before tool execution | Adjust tool call parameters, record logs/metrics |
| afterAgent | After Agent completes | Persist results, cleanup resources |
| wrapModelCall | Wraps model invocation | Custom retry, caching, prompt protection, model switching |
| wrapToolCall | Wraps tool invocation | Authentication, rate limiting, result post-processing, return Command for flow control |
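The wrap-style hooks follow the familiar "onion" pattern: each wrapper receives the request plus a handler for the rest of the chain. The sketch below composes a cache wrapper and a retry wrapper around a stand-in model. All names here (ModelRequest, ModelHandler, WrapModelCall, composeWrappers, withRetry, withCache) are hypothetical, and the choice that the first wrapper in the list is outermost is an assumption of this sketch, not a documented Xpert guarantee.

```typescript
// Hypothetical, simplified types; real request/handler types carry more fields.
type ModelRequest = { prompt: string };
type ModelHandler = (req: ModelRequest) => Promise<string>;
type WrapModelCall = (req: ModelRequest, handler: ModelHandler) => Promise<string>;

// Compose wrappers so that the first wrapper in the list is the outermost layer.
function composeWrappers(wrappers: WrapModelCall[], base: ModelHandler): ModelHandler {
  return wrappers.reduceRight<ModelHandler>(
    (next, wrap) => (req) => wrap(req, next),
    base,
  );
}

// Retry wrapper: re-invokes the inner handler until it succeeds or attempts run out.
function withRetry(maxAttempts: number): WrapModelCall {
  return async (req, handler) => {
    let lastError: unknown;
    for (let attempt = 1; attempt <= maxAttempts; attempt++) {
      try {
        return await handler(req);
      } catch (err) {
        lastError = err;
      }
    }
    throw lastError;
  };
}

// Cache wrapper: returns a stored result for a prompt it has already seen.
function withCache(cache: Map<string, string>): WrapModelCall {
  return async (req, handler) => {
    const hit = cache.get(req.prompt);
    if (hit !== undefined) return hit;
    const result = await handler(req);
    cache.set(req.prompt, result);
    return result;
  };
}

// Stand-in model that fails on its first invocation, then succeeds.
let modelCalls = 0;
const flakyModel: ModelHandler = async (req) => {
  modelCalls++;
  if (modelCalls === 1) throw new Error('transient failure');
  return `echo:${req.prompt}`;
};

const callModel = composeWrappers([withCache(new Map()), withRetry(3)], flakyModel);
```

With this ordering the cache sits outside the retry, so a repeated prompt is answered from the cache without reaching the retry logic or the model at all.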
Additionally, you can declare:
- stateSchema: Persistable middleware state (Zod object/optional/with defaults).
- contextSchema: Runtime-only readable context, not persisted.
- tools: Dynamically injected DynamicStructuredTool list.
- JumpToTarget: Return jumpTo in hooks to control jumps (e.g., model, tools, end).
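As an illustration of how a jumpTo value in a hook result might be interpreted, the sketch below separates the jump target from the state patch and falls back to a default next step when no jump is requested. The names JumpTarget, AgentState, HookResult, and applyHookResult are hypothetical; the actual runtime handling in Xpert may differ.

```typescript
// Hypothetical types: the jump targets mirror the examples named above.
type JumpTarget = 'model' | 'tools' | 'end';
type AgentState = Record<string, unknown>;
type HookResult = { jumpTo?: JumpTarget } & AgentState;

// Split a hook result into a state patch and an optional jump target.
function applyHookResult(
  state: AgentState,
  result: HookResult | undefined,
  defaultNext: JumpTarget,
): { state: AgentState; next: JumpTarget } {
  if (!result) {
    return { state, next: defaultNext };
  }
  const { jumpTo, ...patch } = result;
  return { state: { ...state, ...patch }, next: jumpTo ?? defaultNext };
}

// A hook that only updates state: execution continues to the default step.
const continued = applyHookResult({ used: 0 }, { used: 1 }, 'model');

// A hook that asks to stop: execution jumps to the end.
const stopped = applyHookResult({ used: 3 }, { jumpTo: 'end' }, 'model');
```

Here continued.next is 'model' with used updated to 1, while stopped.next is 'end' and the state is left unchanged.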
Writing a Middleware
1) Define Strategy & Metadata
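// Injectable comes from NestJS; the Xpert decorator and types come from the plugin SDK.
import { Injectable } from '@nestjs/common';

@Injectable()
@AgentMiddlewareStrategy('rateLimitMiddleware')
export class RateLimitMiddleware implements IAgentMiddlewareStrategy<RateLimitOptions> {
  // Metadata the UI uses to label the middleware and render its configuration panel
  readonly meta: TAgentMiddlewareMeta = {
    name: 'rateLimitMiddleware',
    label: { en_US: 'Rate Limit Middleware', zh_Hans: '限流中间件' },
    description: { en_US: 'Protect LLM calls with quotas', zh_Hans: '为模型调用增加配额保护' },
    configSchema: { /* JSON Schema for frontend form rendering */ }
  };
  // The class body continues in step 2.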
2) Implement createMiddleware
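  createMiddleware(options: RateLimitOptions, ctx: IAgentMiddlewareContext): AgentMiddleware {
    // Fall back to 100 model calls when no quota is configured
    const quota = options.quota ?? 100;
    return {
      name: 'rateLimitMiddleware',
      // Persisted counter of model calls made so far
      stateSchema: z.object({ used: z.number().default(0) }),
      beforeModel: async (state, runtime) => {
        // Quota exhausted: skip the model call and jump to the end of the run
        if ((state.used ?? 0) >= quota) {
          return { jumpTo: 'end', messages: state.messages };
        }
        // Otherwise count this call
        return { used: (state.used ?? 0) + 1 };
      },
      // Pass-through: tool calls are not rate limited here
      wrapToolCall: async (request, handler) => handler(request),
    };
  }
}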
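The quota check in beforeModel can be exercised without the Xpert runtime. The sketch below restates that logic as a plain function and drives it in a loop. QuotaState, QuotaResult, makeQuotaHook, and simulate are hypothetical helpers written for this illustration, and the state handling is a simplification of what the real runtime does with returned patches.

```typescript
// Hypothetical, simplified state and result shapes for a standalone check.
interface QuotaState {
  used?: number;
  messages: string[];
}
type QuotaResult = { used: number } | { jumpTo: 'end'; messages: string[] };

// Same decision as the beforeModel hook above: block at the quota, otherwise count the call.
function makeQuotaHook(quota = 100) {
  return async (state: QuotaState): Promise<QuotaResult> => {
    const used = state.used ?? 0;
    if (used >= quota) {
      return { jumpTo: 'end', messages: state.messages };
    }
    return { used: used + 1 };
  };
}

// Drive the hook several times, applying each returned patch to the state.
async function simulate(quota: number, attempts: number): Promise<string[]> {
  const hook = makeQuotaHook(quota);
  let state: QuotaState = { messages: [] };
  const outcomes: string[] = [];
  for (let i = 0; i < attempts; i++) {
    const result = await hook(state);
    if ('jumpTo' in result) {
      outcomes.push('blocked');
    } else {
      state = { ...state, used: result.used };
      outcomes.push('allowed');
    }
  }
  return outcomes;
}
```

With a quota of 2 and three attempts, simulate(2, 3) resolves to ['allowed', 'allowed', 'blocked'].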
3) Register as Plugin Module
@XpertServerPlugin({
  imports: [CqrsModule],
  providers: [RateLimitMiddleware],
})
export class MyAgentMiddlewareModule {}
4) Runtime Integration
- During development or deployment, add the plugin to the plugin environment variables.
- In the workflow editor, add middleware nodes to the Agent and set their order; execution follows the connection order, which you can rearrange.
Built-in Examples
- SummarizationMiddleware: Detects conversation length in beforeModel, triggers compression, and replaces historical messages. It supports triggering by token/message count or context ratio, and records the compression in the execution trace via WrapWorkflowNodeExecutionCommand.
- todoListMiddleware: Injects the write_todos tool and appends system prompts in wrapModelCall to guide the LLM in planning complex tasks. The tool returns a Command to update Agent state.
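The message-count trigger described for summarization can be sketched as follows. This is not the SummarizationMiddleware implementation: ChatMessage, Summarize, and compressHistory are hypothetical names, and the summarizer is passed in as a plain function where a real middleware would presumably call an LLM.

```typescript
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string;
}

// Hypothetical summarizer signature; a stand-in for an LLM call.
type Summarize = (messages: ChatMessage[]) => string;

// When the history exceeds maxMessages, replace everything except the most
// recent keepRecent messages with a single summary message.
function compressHistory(
  messages: ChatMessage[],
  maxMessages: number,
  keepRecent: number,
  summarize: Summarize,
): ChatMessage[] {
  if (messages.length <= maxMessages) {
    return messages;
  }
  const cut = Math.max(messages.length - keepRecent, 0);
  const older = messages.slice(0, cut);
  const recent = messages.slice(cut);
  const summary: ChatMessage = {
    role: 'system',
    content: `Summary of earlier conversation: ${summarize(older)}`,
  };
  return [summary, ...recent];
}

// Toy summarizer that only reports how many messages were folded away.
const countingSummarizer: Summarize = (older) => `${older.length} earlier messages`;
```

A history of five messages with maxMessages set to 4 and keepRecent set to 2 becomes three messages: one summary followed by the last two originals.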
Best Practices
- Implement only the hooks you need, keep them idempotent, and avoid heavy blocking operations inside hooks.
- Use stateSchema to strictly declare persistent data, preventing state interference between different middleware/Agents.
- In wrapModelCall/wrapToolCall, call the passed handler first so that the default chain keeps working, then add custom logic.
- For ratio-triggered logic, rely on the model's profile.maxInputTokens; fall back to absolute token limits when it is unavailable (see the Summarization example).
- Follow the LangChain Middleware approach: decompose cross-cutting concerns such as logging, auditing, caching, rate limiting, and gradual model switching into independent middleware, and compose them by connection order.
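The ratio-with-fallback rule from the list above can be written as a small predicate. ModelProfile, TriggerConfig, and shouldSummarize are hypothetical names for this sketch; only the maxInputTokens field is taken from the text, and the threshold values in the usage note are arbitrary.

```typescript
// Hypothetical shapes; only maxInputTokens is named in the text above.
interface ModelProfile {
  maxInputTokens?: number;
}

interface TriggerConfig {
  ratio: number;              // fraction of the context window, e.g. 0.8
  fallbackTokenLimit: number; // absolute limit used when the window size is unknown
}

// Prefer the ratio of the model's input window; otherwise use the absolute limit.
function shouldSummarize(
  tokenCount: number,
  profile: ModelProfile | undefined,
  config: TriggerConfig,
): boolean {
  const max = profile?.maxInputTokens;
  if (typeof max === 'number' && max > 0) {
    return tokenCount >= max * config.ratio;
  }
  return tokenCount >= config.fallbackTokenLimit;
}

const triggerConfig: TriggerConfig = { ratio: 0.8, fallbackTokenLimit: 4000 };
```

With this configuration, 9000 tokens against a 10000-token window triggers summarization (the threshold is 8000), while a model with no known window size triggers only once the count reaches 4000.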
Through this approach, you can seamlessly migrate LangChain's Middleware model into Xpert's plugin system, reusing existing orchestration, workflow, and UI configuration capabilities.