Sample Deliverable
A 2-page excerpt from a real Architecture Sprint deliverable. The names are changed. The thinking is real.
This excerpt is from a completed Architecture Sprint for a mid-market insurance company processing 200+ broker submissions per month. The full deliverable runs 47 pages. What you see below covers the five sections that matter most: executive summary, system architecture, cost model, risk assessment, and transition plan. Read it before you book a call — this is the quality and depth you’re paying for.
Architecture Sprints run $5–8K over 1–2 weeks. This is what you own at the end.
Meridian Underwriting processes over 200 insurance submissions per month. Each submission arrives as an unstructured PDF package — applications, loss runs, financial statements, broker notes — and an underwriter manually extracts the relevant data to populate rating systems. This takes 3 to 4 hours per submission. At current volume, that is 700+ hours of manual data entry per month. The 8% field-level error rate generates downstream rework: incorrect quotes sent to brokers, re-keying after QA catches discrepancies, and occasional post-bind corrections that damage broker relationships. The problem is not that underwriters are slow. The problem is that the work is mechanical extraction, not underwriting judgment, and it consumes the majority of their day.
We recommend a hybrid extraction pipeline with confidence-based routing. The system classifies each page of a submission package by document type, extracts structured fields using Azure Document Intelligence, and scores extraction confidence at the field level. Pages with high confidence (85%+) flow directly to validation and rating system integration. Pages below the confidence threshold — primarily those with handwritten broker notes — route to a human review queue where an underwriter corrects or confirms extracted values. Phase 1 delivers this pipeline. Phase 2 — automated underwriting decisioning based on extracted patterns — is explicitly deferred. Decisioning rules require 6 months of clean extraction data to validate. Building them now would mean encoding assumptions we cannot yet test.
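To make the routing concrete, here is a minimal sketch of the decision described above. The 85% threshold and document types come from this deliverable; the class and function names are illustrative assumptions, not production code.

```python
from dataclasses import dataclass

CONFIDENCE_THRESHOLD = 0.85  # pages at or above this flow straight to validation

@dataclass
class ExtractedPage:
    doc_type: str           # application, loss_run, financial, supplement, broker_notes
    fields: dict[str, str]  # extracted field name -> value
    confidence: float       # lowest field-level confidence on the page

def route(page: ExtractedPage) -> str:
    """Send high-confidence pages to rating integration, the rest to review."""
    if page.confidence >= CONFIDENCE_THRESHOLD:
        return "rating_integration"  # schema-validated, then pushed to Guidewire
    return "human_review_queue"      # an underwriter confirms or corrects values

# Handwritten broker notes typically score 0.72-0.81, so they route to review
# instead of producing a plausible but incorrect quote.
assert route(ExtractedPage("broker_notes", {"policy_limit": "1M"}, 0.78)) == "human_review_queue"
assert route(ExtractedPage("application", {"named_insured": "Acme Co"}, 0.93)) == "rating_integration"
```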
We recommend proceeding with Phase 1. We recommend against building Phase 2 until you have 6 months of extraction data to validate patterns. Attempting full automation now would produce confident errors on the 40% of submissions with handwritten components — creating more downstream rework than the current manual process.
Forty percent of Meridian’s submissions include handwritten broker notes where OCR accuracy falls between 72% and 81%. A system that extracts these fields and passes them to the rating engine with high confidence scores would generate plausible but incorrect quotes. Those errors would be discovered post-bind or not at all. The current manual process, while slow, does not have this failure mode. Phase 1 routes low-confidence pages to human review — preserving accuracy while cutting processing time from 3.5 hours to 45 minutes. Phase 2 becomes viable once you have baseline accuracy data across document types and can define decisioning rules from observed patterns rather than assumptions.
Phase 1 build: $24,000. Monthly run cost: ~$2,400. 12-month TCO: $52,800. Projected annual savings from reduced processing time: $363,000.
Submission intake pipeline — Meridian Underwriting
- Ingestion: broker portal, email attachments, S3 landing zone
- Classification: document type detection (application, loss run, financial, supplement, broker notes)
- Extraction & validation: field-level confidence, schema validation, missing field detection
- Human review queue: low-confidence extractions; corrections feed the training signal
- Rating integration: API push to Guidewire, audit trail maintained
- Monitoring: accuracy tracking per doc type, confidence threshold tuning, processing dashboards
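As an illustration of the extraction and validation stage above, a sketch assuming the azure-ai-formrecognizer v3.x SDK (the newer azure-ai-documentintelligence package has a slightly different surface). The endpoint, key, and custom model ID "meridian-submission-v1" are placeholders, not values from the engagement.

```python
from azure.ai.formrecognizer import DocumentAnalysisClient
from azure.core.credentials import AzureKeyCredential

client = DocumentAnalysisClient(
    endpoint="https://<resource>.cognitiveservices.azure.com/",
    credential=AzureKeyCredential("<key>"),
)

def extract_fields(pdf_path: str, model_id: str = "meridian-submission-v1") -> dict:
    """Run a custom extraction model and keep the per-field confidence scores."""
    with open(pdf_path, "rb") as f:
        poller = client.begin_analyze_document(model_id, document=f)
    result = poller.result()

    extracted = {}
    for doc in result.documents:
        for name, field in doc.fields.items():
            # Confidence arrives per field, so one weak value (a handwritten
            # deductible, say) can trigger review without holding up the rest.
            extracted[name] = {"value": field.value, "confidence": field.confidence}
    return extracted
```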
Why Azure Document Intelligence: commercial document specialization beats general-purpose OCR on insurance forms. Table extraction accuracy is 94% versus Textract's 87% on insurance applications, and Meridian's existing Azure footprint means no new cloud vendor.
Why human review for the handwritten subset: rating data requires 99%+ accuracy, while vision models achieve ~92% on handwritten insurance forms. The 7% gap means 14 incorrect submissions per month at Meridian's volume. Human review is the correct architecture for this subset, not a temporary workaround.
Why Phase 2 is deferred: automated underwriting decisioning needs baseline extraction patterns that don't exist yet. Building decisioning rules on zero days of clean data guarantees overfitting. Six months of Phase 1 data creates the foundation Phase 2 requires.
| Work item | Hours | Rate ($/hr) | Subtotal |
|---|---|---|---|
| Discovery & constraint mapping | 16 | $200 | $3,200 |
| Document classification model | 24 | $200 | $4,800 |
| Extraction pipeline (digital path) | 32 | $200 | $6,400 |
| Extraction pipeline (handwritten routing) | 20 | $200 | $4,000 |
| Rating system API integration | 16 | $200 | $3,200 |
| Testing, deployment & documentation | 12 | $200 | $2,400 |
| Total | 120 | | $24,000 |
| Component | Volume / Notes | Cost/Month |
|---|---|---|
| Azure Document Intelligence | 3,000 pages (custom model) | $600 |
| Azure Container Apps (2 vCPU, 4GB) | Pipeline + review queue | $500 |
| PostgreSQL Flexible Server | 100GB, General Purpose | $350 |
| Blob Storage (document archive) | ~50GB/month growth | $50 |
| Application Insights + Log Analytics | Monitoring & alerting | $150 |
| Claude 3.5 Haiku (classification) | ~100K tokens/month | $100 |
| Azure Service Bus + Container Registry | Queue orchestration, CI/CD | $330 |
| Operational buffer (15%) | | $320 |
| Total | | ~$2,400 |
Note: Assumes Meridian’s existing Azure AD tenant. Greenfield Azure adds $300–400/mo.
12-month TCO: $24,000 build + $28,800 run (12 × $2,400) = $52,800
200 submissions/month × 3.5 hrs × $55/hr fully loaded underwriter cost = $462K/year in processing labor. The system reduces processing to 0.75 hrs/submission, or $99K/year. Gross labor savings: $363K/year, or roughly $310K in year one after the $52,800 system cost.
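The arithmetic behind those figures, restated as a runnable check; every input comes from this deliverable.

```python
SUBMISSIONS_PER_MONTH = 200
HOURLY_COST = 55  # fully loaded underwriter cost, $/hr

manual = SUBMISSIONS_PER_MONTH * 3.50 * HOURLY_COST * 12  # $462,000/yr
system = SUBMISSIONS_PER_MONTH * 0.75 * HOURLY_COST * 12  # $99,000/yr
gross  = manual - system                                  # $363,000/yr
tco    = 24_000 + 2_400 * 12                              # $52,800 first year
net    = gross - tco                                      # $310,200 in year one

print(f"gross savings ${gross:,.0f}/yr; net ${net:,.0f} in year one")
```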
Risk: 40% of submissions include handwritten broker notes. Azure Document Intelligence returns 72–81% confidence on these pages. The rating system requires 99%+ field-level accuracy.
Impact: Plausible but incorrect field values (policy limits, deductible amounts, named insureds). Errors are discovered post-bind by claims adjusters, or not until a claim is filed against incorrect terms.
Mitigation: Route all pages below 85% extraction confidence to the human review queue. Do not use a vision model fallback. Vision models achieve approximately 92% accuracy on handwritten insurance forms — that is meaningfully better than Document Intelligence alone, but the gap between 92% and the required 99% is too large for rating data. Human review is the correct architecture for this subset, not a stopgap.
Cost: ~$1,200/month in human review labor. Built into the operational model above.
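One way the "corrections feed the training signal" item in the pipeline could look in practice, sketched with sqlite3 standing in for the PostgreSQL instance in the cost model. The table schema is an illustrative assumption.

```python
import sqlite3

db = sqlite3.connect("review_corrections.db")
db.execute("""CREATE TABLE IF NOT EXISTS corrections (
    doc_type TEXT, field TEXT, extracted TEXT, corrected TEXT, confidence REAL)""")

def record_correction(doc_type: str, field: str, extracted: str,
                      corrected: str, confidence: float) -> None:
    """Log every human correction; per-doc-type accuracy tracking and
    confidence-threshold tuning both read from this table."""
    db.execute(
        "INSERT INTO corrections VALUES (?, ?, ?, ?, ?)",
        (doc_type, field, extracted, corrected, confidence),
    )
    db.commit()

# An underwriter fixes a misread policy limit from a handwritten broker note:
record_correction("broker_notes", "policy_limit", "100000", "1000000", 0.76)
```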
Risk: Meridian’s Guidewire rating platform has a documented API, but the field mapping between extracted data and rating inputs is maintained in spreadsheets by the actuarial team. There is no single source of truth. Three different mapping conventions exist across commercial lines.
Impact: Extraction works correctly but populates the wrong rating fields. These errors are silent — they produce valid but incorrect quotes. A broker receives a quote that looks normal but is priced against the wrong classification or limits.
Mitigation: Dedicate 16 hours of discovery specifically to field mapping across all three commercial line conventions. Build the mapping as an editable configuration layer, not hardcoded logic. Validate with the actuarial team before production. Run 2 weeks of parallel processing — system output alongside manual processing — before cutover.
Cost: Discovery time included in the build estimate. Parallel processing period adds ~$8K in temporary dual-processing labor.
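A minimal sketch of the editable mapping layer proposed in the mitigation above: one config file holds a mapping per commercial-line convention, and unmapped fields fail loudly instead of silently populating the wrong rating input. The file name, convention keys, and field names are illustrative assumptions.

```python
import json

def load_mapping(convention: str, path: str = "field_mappings.json") -> dict:
    """Load the extracted-field -> Guidewire rating-input map for one convention."""
    with open(path) as f:
        # e.g. {"commercial_property": {"policy_limit": "RatingLimit"}, ...}
        return json.load(f)[convention]

def map_to_rating_inputs(extracted: dict, convention: str) -> dict:
    mapping = load_mapping(convention)
    unmapped = set(extracted) - set(mapping)
    if unmapped:
        # Fail loudly: a silent mis-mapping is exactly the failure mode above.
        raise ValueError(f"no rating mapping for fields: {sorted(unmapped)}")
    return {mapping[name]: value for name, value in extracted.items()}
```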
Risk: Two of six underwriters expressed skepticism about AI accuracy during stakeholder interviews. Both have 10+ years of tenure and have developed personal review workflows. If they bypass the system and continue processing manually, the ROI projections above do not hold.
Impact: The system is technically successful but underutilized. Processing time savings materialize for four of the six underwriters, reducing realized savings by approximately one-third.
Mitigation: Start with the four underwriters who expressed interest. Measure processing time and error rate for 60 days. Use that data — not mandates — to demonstrate value to the skeptical team members. The review UI is designed as an assistance tool: underwriters approve and correct extracted values rather than passively receiving system output. The system augments their judgment rather than replacing it.
Cost: No incremental cost. Full adoption may take 90 days rather than 30.
Weeks 1–2: Infrastructure provisioning, Azure Document Intelligence configuration, document classification model training against Meridian’s submission corpus. Establish CI/CD pipeline and monitoring baseline.
Weeks 3–4: Extraction pipeline build for digital and handwritten document paths. Field-level confidence scoring implementation. Human review queue UI — designed with underwriter input, not delivered as a finished product they have never seen.
Week 5: Rating system API integration. Field mapping validation with the actuarial team across all three commercial line conventions. End-to-end testing against historical submissions.
Week 6: Parallel processing validation — system runs alongside manual processing for the final week. Deployment to production. Runbook and operational documentation delivered. Handoff to Meridian’s ops team or ongoing engineering engagement.
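The week-6 parallel run reduces to a field-by-field diff between system output and the manually keyed values. A sketch, with record shapes as illustrative assumptions; the exact cutover criterion is set with Meridian, but a clean run on rating-critical fields is one reasonable bar.

```python
def compare_parallel_run(system: dict, manual: dict) -> list:
    """Return every field where the system and manual values disagree."""
    mismatches = []
    for field in sorted(set(system) | set(manual)):
        s, m = system.get(field), manual.get(field)
        if s != m:
            mismatches.append(f"{field}: system={s!r} manual={m!r}")
    return mismatches

print(compare_parallel_run(
    {"policy_limit": "1000000", "deductible": "25000"},
    {"policy_limit": "1000000", "deductible": "2500"},
))  # -> ["deductible: system='25000' manual='2500'"]
```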
Requires: 6 months of extraction data, established confidence thresholds per document type, measured accuracy baselines across all three commercial line conventions.
Scope: Automated decisioning rules for high-confidence submissions, pattern-based submission routing, broker-facing status portal showing extraction progress and review queue position.
Estimated cost: $35K–$45K build. This estimate will be refined based on Phase 1 learnings — specifically, what percentage of submissions achieve full-confidence extraction without human review after 6 months of threshold tuning.
Every Architecture Sprint starts with your problem, your data, your constraints. The methodology is the same. The recommendations are unique to you.
No pitch deck. No pressure. We’ll tell you if a sprint makes sense for your situation — and if it doesn’t.