CoSignal Agentic Risk Detection Case Study
| 3 Agents in pipeline | 5 Live API calls | 4 Risk rules triggered | 7 AI-generated interventions |
Executive Summary
CoSignal is an agentic capability risk detection system built to surface workforce performance gaps before they become operational failures. This case study documents the design, build, and live pipeline run of CoSignal V1 — a portfolio project demonstrating agentic pipeline architecture, provider-agnostic AI design, and measurement-first L&D systems thinking.
The scenario: a Microsoft Dynamics 365 warehouse rollout, three weeks post go-live, 30 employees across three workflow areas. CoSignal ingested simulated signal data, classified risk across all three areas, and generated seven AI-written interventions via live Anthropic API calls — all in a single pipeline run.
| CoSignal V1 is not a concept or a mockup. Every number in this case study came from a live pipeline run using real Anthropic API calls. The intervention text was generated by claude-sonnet-4-6 reading actual signal data. |
Key Results
| Outcome | Result |
| Pipeline execution | 3 agents ran in sequence, all stages validated |
| API calls | 5 live calls to claude-sonnet-4-6 via streaming |
| Risk classification | Inventory Adjustments = HIGH (4 rules), Receiving = MODERATE, Scanner = WATCH |
| Interventions generated | 7 total — 5 AI-generated by Claude, 2 static fallbacks for WATCH area |
| Provider portability | AI_PROVIDER=azure activates Azure OpenAI — zero code changes required |
The Problem CoSignal Solves
Large technology rollouts fail when capability gaps surface too late. The typical pattern in a D365 implementation looks like this:
- Week 1-2: Training delivered, completion rates high, leadership satisfied
- Week 3-4: Support ticket volume climbs, employees call the help desk for basic tasks
- Week 5-6: Supervisors report frustration, employees revert to spreadsheets and workarounds
- Week 8+: Leadership asks why the rollout is underperforming — but the damage is done
Traditional reporting identifies these problems after operational damage has occurred. There is no mechanism to detect the capability gap while it is still emerging — before it becomes a business problem.
| CoSignal answers one question: where are capability risks emerging right now, before they show up as operational failures? It does not replace training. It detects when training is not transferring and prescribes targeted responses before the window closes. |
The D365 Warehouse Context
The V1 scenario is grounded in the real dynamics of a D365 warehouse implementation. Three workflow areas are particularly vulnerable post-go-live:
| Workflow Area | Typical Post-Go-Live Risk |
| Inventory Adjustments | Complex multi-step process with high exception volume — most common source of errors |
| Receiving | High transaction frequency with tight timing requirements — confidence gaps surface quickly |
| Scanner Operations | Device-dependent workflow — near-threshold risk accumulates before rules formally trigger |
System Architecture
CoSignal is built on the CoBuild pipeline formula: sequential Python agent scripts, a run_pipeline.py orchestrator, file-based intermediate outputs, and a provider-agnostic AI layer. Each agent has one job. Every intermediate state is inspectable.
The Three-Agent Pipeline
| seed_data.json ↓ signal_aggregator.py → aggregated_signals.json [pure Python, no AI] ↓ risk_detection.py → risk_scores.json [pure Python, no AI] ↓ prescription_engine.py → prescriptions.json [claude-sonnet-4-6, streaming] ↓ llm_client.py [Anthropic OR Azure OpenAI] ↓ React Dashboard ← FastAPI /results endpoint |
Agent 1: Signal Aggregator — Pure Python
Reads seed_data.json and normalizes three signal types per workflow area into a unified signal object. No AI call — deterministic aggregation.
| Signal Type | Fields Captured |
| Support Tickets | ticket_volume_7d, unresolved_tickets_48h, unresolved_rate |
| LMS Records | avg_quiz_score, high_retake_count, completion_pct |
| Pulse Survey | avg_d365_confidence, avg_process_clarity, change_readiness |
Agent 2: Risk Detection — Pure Python
Applies four threshold-based detection rules. Classification is deterministic — same inputs always produce the same output. No AI call.
| Detection Rule | Threshold |
| Process Confusion | ticket_volume_7d > 8 |
| Knowledge Decay | avg_quiz_score > 80 AND high_retake_count > 2 |
| Confidence Gap | avg_d365_confidence < 3.0 OR avg_process_clarity < 3.0 |
| SME Bottleneck | unresolved_tickets_48h / ticket_volume_7d > 0.60 |
| Risk Level | Classification Logic |
| HIGH | 2 or more detection rules triggered for the same workflow area |
| MODERATE | Exactly 1 detection rule triggered |
| WATCH | No rules triggered, but any signal within 20% of a threshold |
| OK | No rules triggered, no signals near threshold |
Agent 3: Prescription Engine — Calls Claude
For each HIGH and MODERATE risk area, sends the signal data to claude-sonnet-4-6 via the Anthropic streaming API and generates a specific, grounded intervention. The system prompt instructs Claude to reference the actual numbers. WATCH areas receive static fallback interventions with no API call.
llm_client.py — Provider-Agnostic Layer
The single most important architectural decision in CoSignal V1. All provider-specific code lives in one file. The Prescription Engine imports only from llm_client — it never imports anthropic or openai directly. Switching providers is a .env change, not a code change.
| # .env AI_PROVIDER=anthropic # or: AI_PROVIDER=azure # That is the entire switch. # Pipeline, detection logic, dashboard — zero changes. |
Live Pipeline Run — Results
The following results are from a live pipeline run on May 24, 2026, using the Anthropic API with claude-sonnet-4-6. All intervention text was generated in real time.
Risk Classification Output
| Workflow Area | Risk Level |
| Inventory Adjustments | HIGH (Priority 1) |
| Receiving | MODERATE (Priority 2) |
| Scanner Operations | WATCH (Priority 3) |
| The Inventory Adjustments finding is notable: quiz score 85.5 suggests employees know the material, but 64% of tickets remain unresolved past 48 hours. This is the Knowledge Decay pattern — surface-level recall that does not transfer to operational performance under pressure. Claude’s prescriptions directly address this gap. |
AI-Generated Prescriptions — Inventory Adjustments (HIGH)
Four interventions were generated — one per triggered rule. Reproduced as generated by claude-sonnet-4-6:
1. Inventory Adjustment Process Clarity Sprint
Triggered by: Process Confusion
| Immediately convene a 90-minute hands-on session with all 10 inventory team members where they walk through the exact adjustment scenarios generating the 14 support tickets, using live D365 sandbox data. Create a one-page quick-reference card mapping each adjustment type (write-offs, transfers, corrections) to its specific D365 transaction path, and post it at each workstation. Assign a designated “process buddy” from the 2-3 highest-confidence team members to field same-day questions for the next two weeks, targeting resolution of the 9 tickets stalled beyond 48 hours. Rationale: With process clarity averaging 2.8 and confidence at 2.6 across a small 10-person team, the volume and age of unresolved tickets indicates workers lack a reliable mental model of which D365 path to follow for which scenario — making structured, scenario-based process reinforcement more urgent than additional system training. |
2. Inventory Adjustments Hands-On Knowledge Reinforcement Sprint
Triggered by: Knowledge Decay
| Schedule two 90-minute in-person lab sessions this week where all 10 employees work through the five most common inventory adjustment scenarios that generated the 14 support tickets, using live D365 sandbox data. Pull the 9 unresolved tickets and convert them into the session’s practice cases, so staff resolve real problems with a trainer present. Identify the 4 employees with 2+ quiz retakes and pair each with a high-performing peer for a dedicated 30-minute walkthrough of their specific failure points before the group sessions begin. Rationale: Quiz scores sitting just above the 80-point threshold, combined with 64% of tickets unresolved past 48 hours, signal that staff have surface-level recall but cannot apply knowledge under operational pressure — requiring practice-based reinforcement against real failure cases rather than additional instructional content. |
3. Redistribute Inventory Adjustment Support Load Now
Triggered by: SME Bottleneck
| Identify the one or two employees currently handling all escalated tickets and immediately pair them with the 4 employees who have retaken quizzes twice or more, running two 45-minute live walkthroughs this week using the actual unresolved tickets as practice cases. Simultaneously, create a one-page quick-reference card covering the top three recurring ticket issues and post it in the team’s shared channel so all 10 employees can self-serve before escalating. Rationale: With 64% of tickets stalled beyond 48 hours and only a 2.6 confidence score across the team, a single point of SME dependency is creating a resolution backlog that will worsen as go-live volume increases, making immediate load redistribution more urgent than any training or system fix. |
AI-Generated Prescription — Receiving (MODERATE)
Receiving Team Hands-On Confidence Building Sessions
Triggered by: Confidence Gap
| Schedule two 90-minute supervised practice sessions this week where Receiving staff complete live D365 receiving transactions alongside a system-proficient coach, using actual purchase orders from the past 7 days. Pair the three employees with unresolved tickets directly with the coach first, walking through their specific stuck points in the system rather than generic training. End each session with a 5-minute “I can do this alone” checklist that staff self-sign to mark which tasks they now feel ready to perform independently. Rationale: With average confidence at 2.85 and half of open tickets unresolved past 48 hours, staff are hesitant to act without reassurance, meaning guided repetition on real tasks will close the confidence gap faster than additional instructional content. |
Key Architectural Decisions
1. Pure Python for Detection, Claude for Language
Signal aggregation and risk classification are deterministic processes. There is no reason to involve an AI model in threshold comparison. Using pure Python for Agents 1 and 2 means the detection logic is auditable, fast, free to run, and always produces the same output from the same inputs.
Claude is invoked only in Agent 3, where the task is language generation — taking structured risk data and producing specific, readable intervention text. This is the right use of a language model. Classification is not.
2. Structured Markdown Output, Not JSON
The Prescription Engine instructs Claude to return output using labeled plain-text sections rather than JSON. This is a lesson learned from the CoBuild pipeline: JSON responses truncate silently when output is large, producing parse errors. Structured markdown fails gracefully — a partial response produces a partial file, not a crash.
3. Provider Isolation in One File
llm_client.py is the only file in the codebase that imports anthropic or openai. Every other file — including the Prescription Engine — imports only from llm_client. This means the provider can change without touching any business logic, and the codebase can be audited for external API usage by checking one file.
Azure Deployment Guide
CoSignal V1 is designed to run inside enterprise Azure environments. The switch from Anthropic to Azure OpenAI requires a .env change and nothing else.
Prerequisites
- Active Azure subscription (the same subscription used for D365 if applicable)
- Azure OpenAI resource provisioned: Azure Portal → Create Resource → Azure OpenAI
- A deployed model — gpt-4o recommended: Azure OpenAI Studio → Deployments → Create
- CoSignal V1 running on Anthropic first — verify the Anthropic run before switching
The Switch
| # Step 1: In backend/.env, change one line: AI_PROVIDER=azure # Step 2: Fill in Azure credentials: AZURE_OPENAI_ENDPOINT=https://your-resource.openai.azure.com/ AZURE_OPENAI_API_KEY=your-key-from-azure-portal AZURE_OPENAI_DEPLOYMENT=gpt-4o AZURE_OPENAI_API_VERSION=2024-02-01 # Step 3: Run normally. Nothing else changes. python backend/run_pipeline.py # Expected output: CoSignal Pipeline Orchestrator AI Provider: azure Calling gpt-4o (streaming) via Azure OpenAI… [ok] Agent 3 — Prescription Engine — prescriptions.json verified |
What Does Not Change
| Component | Changes for Azure? |
| signal_aggregator.py | No — pure Python, no API call |
| risk_detection.py | No — pure Python, no API call |
| prescription_engine.py | No — imports only from llm_client |
| run_pipeline.py | No — orchestrator is provider-agnostic |
| main.py (FastAPI) | No — serves results regardless of provider |
| React dashboard | No — reads ai_provider from prescriptions.json |
| Detection thresholds | No — entirely independent of AI layer |
| Output JSON schemas | No — identical from both providers |
| For Greentree and government D365 implementations: Azure OpenAI Service runs on the same Azure subscription used for D365. Adding CoSignal requires one resource deployment and one .env change. No new vendor contracts. No data leaving the Azure boundary. No security review for external API calls. |
V2 Roadmap
CoSignal V1 demonstrates the architecture. V2 connects it to real data sources.
| V1 (Current) | V2 (Planned) |
| Simulated seed data | Real signals from D365, ServiceNow, LMS APIs |
| Single warehouse location | Multi-location risk aggregation and comparison |
| Static detection thresholds | Adaptive thresholds based on historical baseline |
| Manual pipeline execution | Scheduled execution via Azure Functions or Logic Apps |
| Local JSON file storage | Azure Blob Storage or Azure SQL |
| React dashboard on localhost | Azure Static Web Apps or Power BI embedded |
| No alert delivery | Email/Teams alerts for HIGH risk classifications |
Portfolio Summary
CoSignal V1 demonstrates four capabilities relevant to AI L&D systems roles and enterprise technology consulting:
- Agentic pipeline design — CoBuild formula: sequential agents, run_pipeline.py orchestrator with –from N restart, per-stage validation, inspectable intermediate outputs
- Provider-agnostic AI architecture — llm_client.py isolation pattern, Anthropic to Azure OpenAI with one environment variable, zero code changes
- Selective AI use — deterministic Python for classification, Claude only for language generation where it adds value
- Measurement-first L&D thinking — performance signals drive intervention recommendations, not training completion rates
| CoSignal positions at the intersection of enterprise L&D, AI systems design, and workforce performance analytics — exactly where D365 implementation firms need capability leadership. |
| Jason Bouchard Blue EdgeWater • AI L&D Systems Design • May 2026 blueedgewater.com |