OpenEnv Environment
IT Helpdesk Ticket Routing OpenEnv
it_helpdesk_ticket_routing
Queue decisions that actually carry forward.
A benchmark for sequential helpdesk routing with hidden context, cluster-aware follow-up tickets, incident handling, deferrals, and a terminal rubric that rewards queue strategy rather than isolated classification alone.
Task Ladder
One benchmark family, not three disconnected demos
The difficulty ladder keeps the same full-routing output while progressively changing observability, queue dependencies, and operational pressure.
Guided Full Routing
Perform full helpdesk routing by selecting issue type, priority, assignment group, and resolution action. Easy-task episodes keep the ticket text mostly visible and focus on grounded single-ticket routing.
Contextual Full Routing
Perform full helpdesk routing with partial observability and moderate queue carry-over. Some tickets hide related-case, requester-history, or cluster-coordination details until you investigate or request more information. Medium episodes can also require deferral or coherent handling across linked tickets in the same queue.
Adaptive Queue Routing
Perform full helpdesk routing by selecting the best issue type, priority, assignment group, and resolution action for the ticket. Use any ambiguity notes, related-ticket previews, queue-capacity forecasts, and planning state when present. Some hard tickets intentionally hide decisive routing context until you investigate with the available tools, and some hard episodes also require queue-level capacity planning, deferrals, incident management, and recovery from downstream follow-up tickets.
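All three tasks ask for the same four decisions per ticket. As a rough sketch only, that output could be modeled as a small record like the one below. The class name, field names, and example values are hypothetical; the page names the four decisions but does not show the server's actual schema, which is listed under GET /docs.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class RoutingAction:
    """One full-routing decision for a single ticket.

    Field names are assumptions chosen to mirror the four decisions
    named in the task descriptions; the real request model may differ.
    """
    issue_type: str
    priority: str
    assignment_group: str
    resolution_action: str

    def to_payload(self) -> dict:
        # Plain dict form, suitable for JSON serialization.
        return asdict(self)


# Illustrative placeholder values, not labels taken from the environment.
action = RoutingAction(
    issue_type="access_request",
    priority="high",
    assignment_group="identity_team",
    resolution_action="escalate",
)
payload = action.to_payload()
```

Keeping the record identical across the easy, medium, and hard tasks matches the ladder's design: only observability and queue pressure change, not the shape of the answer.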
Environment Signals
What the agent is balancing
The benchmark is designed so that policy choices affect later tickets, incident coverage, and terminal queue quality, rather than only nudging the shaped reward.
Hidden context retrieval
Related-ticket previews, requester history, internal routing notes, queue cluster summaries, and capacity forecasts are revealed through explicit tool use.
Operational actions with consequences
Deferrals can raise later urgency, incident handling can reduce downstream debt, and weak handling can spawn or worsen follow-up work.
Queue-level terminal rubric
Final scoring blends routing trajectory quality with queue management quality so agents are rewarded for coherent episode strategy, not just isolated ticket matches.
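To make the idea of a blended terminal score concrete, here is a minimal sketch that assumes a simple linear blend of two scores in the range 0 to 1. The function name, the 0.5 default weight, and the linear form are all assumptions for illustration; the page does not state the actual formula or weights.

```python
def terminal_score(routing_quality: float,
                   queue_quality: float,
                   routing_weight: float = 0.5) -> float:
    """Blend per-ticket routing quality with queue-management quality.

    Hypothetical linear blend; the environment's real rubric is not
    specified on this page and may combine the terms differently.
    """
    for name, value in (("routing_quality", routing_quality),
                        ("queue_quality", queue_quality),
                        ("routing_weight", routing_weight)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be in [0, 1], got {value}")
    return routing_weight * routing_quality + (1.0 - routing_weight) * queue_quality


# Perfect per-ticket matches but mediocre queue handling.
score = terminal_score(routing_quality=1.0, queue_quality=0.5)
print(score)  # 0.75
```

Under any blend of this kind, an agent that matches every ticket label but mishandles deferrals or incidents still falls short of the maximum, which is the behavior the rubric description points to.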
Quick Routes
Fast ways to demo the environment
Useful entry points for judges, reviewers, or anyone trying to get signal from the project quickly.
Interactive API docs
Browse the full OpenEnv-compatible surface, request models, and built-in helper endpoints.
GET /docs
Task manifest
Inspect the easy, medium, and hard task definitions exactly as exposed by the server.
GET /tasks
Hard-task baseline rollout
Run a deterministic baseline episode on the hardest queue using the current environment logic.
GET /baseline?task_id=3&seed=42
Health and deployment status
Quick check that the service is alive and ready for OpenEnv-style evaluation requests.
GET /health
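The four routes above can also be reached from a script. The sketch below builds their URLs with the Python standard library and includes an optional helper to fetch one. The base URL `http://localhost:8000` and the helper names are assumptions; substitute the address where the server is actually deployed. Response formats are not described on this page, so the helper returns the raw body as text.

```python
from urllib.parse import urlencode, urljoin
from urllib.request import urlopen

# Assumption: a locally running server; replace with the real deployment URL.
BASE_URL = "http://localhost:8000"


def endpoint_url(path: str, base_url: str = BASE_URL, **params) -> str:
    """Build a full URL for one of the routes listed above."""
    url = urljoin(base_url.rstrip("/") + "/", path.lstrip("/"))
    if params:
        url += "?" + urlencode(params)
    return url


def fetch_text(url: str, timeout: float = 10.0) -> str:
    """GET a URL and return the response body as text (needs a live server)."""
    with urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


health_url = endpoint_url("/health")
tasks_url = endpoint_url("/tasks")
# Same query as the hard-task baseline route: task_id=3, seed=42.
baseline_url = endpoint_url("/baseline", task_id=3, seed=42)

print(baseline_url)  # http://localhost:8000/baseline?task_id=3&seed=42
```

With a server running, `fetch_text(health_url)` is a quick liveness check before requesting the baseline rollout.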