Medicare Advantage chart analysis with ICD-10 detection, CMS-HCC mapping, and MEAT evidence extraction—all running on your infrastructure. No PHI sent to third parties by default. Ambiguous findings are flagged for review, not silently resolved.
Medicare Advantage risk adjustment requires accurate ICD-10 coding that is backed by evidence in the clinical chart. Those charts are messy: scanned pages, inconsistent formatting, OCR confusables, negated findings, and buried diagnoses.
This system processes chart PDFs through a governed pipeline that extracts, detects, maps, and flags—without guessing. Every finding carries a chain of custody back to the source page and character offset.
Input: PDF · ZIP batch
Extraction: Native + OCR
Detection: ICD-10 codes
Mapping: CMS-HCC
Evidence: MEAT + provenance
Deployment: Docker / on-prem
Output: Report + CSV
PHI transit: None by default
Upload one chart or a ZIP of many. The system queues them for processing via Celery workers. Charts are stored locally; no data leaves the deployment perimeter by default.
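As a rough illustration of the upload step, the sketch below expands a single upload (one PDF or a ZIP of PDFs) into per-chart jobs using only the Python standard library. The helper name `expand_upload` and the job dictionary shape are assumptions made for this example, not the product's schema, and handing the jobs to Celery is deliberately left out.

```python
import io
import zipfile


def expand_upload(filename: str, data: bytes) -> list[dict]:
    """Turn one upload (a PDF or a ZIP of PDFs) into a list of per-chart jobs.

    Hypothetical helper: the job dict shape is an assumption for illustration.
    """
    if filename.lower().endswith(".pdf"):
        return [{"chart_name": filename, "size_bytes": len(data)}]

    jobs = []
    # Anything that is not a PDF is treated as a ZIP; zipfile raises
    # BadZipFile if the bytes are not a valid archive.
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for info in archive.infolist():
            # Skip directories and non-PDF members instead of guessing.
            if info.is_dir() or not info.filename.lower().endswith(".pdf"):
                continue
            jobs.append({"chart_name": info.filename, "size_bytes": info.file_size})
    return jobs
```

Each returned job would then be handed to a worker queue; everything here operates on bytes already inside the deployment perimeter.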
Native PDF text is extracted first. For scanned pages, OCR is applied. OCR quality is assessed per-page; low-confidence pages are flagged with a "needs review" indicator rather than silently passed through.
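The native-first, OCR-fallback rule with a per-page review flag could look roughly like this. It is a minimal sketch: the `PageText` record, the `assess_page` function, and the 0.80 default cutoff are all assumptions for illustration, and the OCR engine that would supply `ocr_text` and `ocr_confidence` is not shown.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PageText:
    page_number: int
    text: str
    source: str                      # "native" or "ocr"
    ocr_confidence: Optional[float]  # None when native text was used
    needs_review: bool


def assess_page(page_number: int, native_text: str, ocr_text: str,
                ocr_confidence: Optional[float],
                min_confidence: float = 0.80) -> PageText:
    """Prefer native text; fall back to OCR and flag low-confidence pages.

    min_confidence is an assumed, operator-tunable value.
    """
    if native_text.strip():
        return PageText(page_number, native_text, "native", None, False)
    # A missing confidence score is treated as low confidence.
    low = ocr_confidence is None or ocr_confidence < min_confidence
    return PageText(page_number, ocr_text, "ocr", ocr_confidence, low)
```

A page that falls below the cutoff still carries its OCR text forward, but with `needs_review` set so a reviewer sees it.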
The detection layer finds ICD-10 code mentions and clinical descriptions in real-world, messy formatting. It handles OCR character confusables (e.g., "I" vs "1") and explicitly models negation — "ruled out" conditions are not reported as active diagnoses.
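One way to handle confusables is a position-aware repair: an ICD-10 code starts with a letter followed by a digit, so a "1" in the first position is likely a misread "I". The sketch below is an assumption about how such a step might work, not the product's detector; the confusable tables and the function name `repair_icd10_candidate` are illustrative, and the regular expression checks only the general shape of a code, not whether it exists in the code set.

```python
import re

# General shape only: letter, digit, alphanumeric, optional dotted suffix.
ICD10_SHAPE = re.compile(r"^[A-Z][0-9][0-9A-Z](\.[0-9A-Z]{1,4})?$")

# Illustrative confusable tables (assumed, not exhaustive).
TO_LETTER = {"1": "I", "0": "O", "5": "S", "8": "B"}
TO_DIGIT = {"I": "1", "L": "1", "O": "0", "S": "5", "B": "8"}


def repair_icd10_candidate(token: str):
    """Return (code, was_repaired), or (None, False) if the token has no ICD-10 shape."""
    raw = token.strip().upper()
    if len(raw) < 3:
        return None, False
    # Position 1 must be a letter and position 2 a digit, so repair only those.
    fixed = TO_LETTER.get(raw[0], raw[0]) + TO_DIGIT.get(raw[1], raw[1]) + raw[2:]
    if not ICD10_SHAPE.match(fixed):
        return None, False
    return fixed, fixed != raw
```

Because a token such as "110" might also be an ordinary number, a repaired candidate is better treated as a flagged finding than as a confirmed code.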
Detected ICD-10 codes are mapped to CMS-HCC categories using imported CMS mapping tables. The payment year model is configurable. Per-condition risk summaries are generated using the selected model's coefficients.
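The mapping step can be pictured as a table lookup keyed by the configured model. In the sketch below, every code, category name, and coefficient is a made-up placeholder (real values would come from the imported CMS tables), and the names `EXAMPLE_TABLES` and `summarize_risk` are hypothetical. Hierarchies and interactions are not modeled here.

```python
# Placeholder data only: not real ICD-10 codes, HCC categories, or coefficients.
EXAMPLE_TABLES = {
    "PY-EXAMPLE": {
        "mappings": {"X00.0": "HCC-A", "X00.1": "HCC-A", "X01.0": "HCC-B"},
        "coefficients": {"HCC-A": 0.1, "HCC-B": 0.2},
    }
}


def summarize_risk(codes, model_key, tables=EXAMPLE_TABLES):
    """Group detected codes by category for the selected model.

    Returns (summary, unmapped). Codes with no mapping are reported, not dropped.
    """
    model = tables[model_key]
    by_category = {}
    unmapped = []
    for code in codes:
        category = model["mappings"].get(code)
        if category is None:
            unmapped.append(code)
        else:
            by_category.setdefault(category, []).append(code)
    # Each category appears once, however many codes map to it.
    summary = [
        {"category": cat, "codes": cat_codes,
         "coefficient": model["coefficients"][cat]}
        for cat, cat_codes in sorted(by_category.items())
    ]
    return summary, unmapped
```

Switching the payment year then amounts to selecting a different key in the table set, leaving the detection output unchanged.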
For each mapped condition, the system extracts evidence supporting Monitoring, Evaluation, Assessment, and Treatment criteria—with the page number and character offset where the evidence was found. Conservative binding means uncertain evidence surfaces as a flag, not a confident assertion.
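To make the provenance idea concrete, here is a deliberately simple keyword-cue sketch that records the page number and character offset of each piece of evidence and reports which MEAT elements have none. The cue lists, the `Evidence` record, and `find_meat_evidence` are assumptions for illustration; a production extractor would need far more careful matching than prefix keywords.

```python
import re
from dataclasses import dataclass

# Illustrative cue words per MEAT element (assumed, far from complete).
MEAT_CUES = {
    "monitoring": ["monitor", "follow up", "recheck"],
    "evaluation": ["lab", "a1c", "exam"],
    "assessment": ["assessment", "stable", "uncontrolled"],
    "treatment": ["continue", "prescribed", "start"],
}


@dataclass
class Evidence:
    element: str
    page: int
    offset: int   # character offset within the page text
    snippet: str


def find_meat_evidence(pages: dict, condition_term: str):
    """pages maps page number -> page text. Returns (evidence, missing_elements)."""
    evidence = []
    for page_number, text in pages.items():
        if condition_term.lower() not in text.lower():
            continue  # only look at pages that mention the condition
        for element, cues in MEAT_CUES.items():
            for cue in cues:
                # Search the original text so offsets point into the source page.
                match = re.search(r"\b" + re.escape(cue), text, re.IGNORECASE)
                if match:
                    start = match.start()
                    evidence.append(
                        Evidence(element, page_number, start, text[start:start + 40]))
                    break
    found = {ev.element for ev in evidence}
    missing = [element for element in MEAT_CUES if element not in found]
    return evidence, missing
```

Elements listed in `missing` would surface as "no evidence found" flags instead of being assumed present.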
Output is a structured report and a CSV with per-condition findings, flags (ambiguous binding, no evidence found, OCR issues, negation detected), and provenance references. Every row is independently reviewable.
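A per-condition CSV of this kind can be produced with the standard `csv` module. The column names and flag labels below are hypothetical, chosen to mirror the flags listed above; the actual export schema may differ.

```python
import csv
import io

# Hypothetical column set for illustration.
FIELDS = ["chart", "icd10", "hcc", "flags", "page", "offset"]


def findings_to_csv(findings: list[dict]) -> str:
    """Serialize findings to CSV text, one independently reviewable row each."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for finding in findings:
        row = dict(finding)
        # Flags are joined into one cell so every row stays self-contained.
        row["flags"] = ";".join(sorted(finding.get("flags", [])))
        writer.writerow(row)
    return buffer.getvalue()
```

Keeping page and offset on every row lets a reviewer check a single line without consulting the rest of the file.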
Conservative by design. When evidence is insufficient, ambiguous, or OCR-degraded, the system flags it. It does not fill gaps with inferences. A "needs review" flag is a feature, not a failure mode.
PHI-safe synthetic charts are included for deterministic regression testing. New releases are validated against these packs before deployment, so a code change cannot silently degrade detection behavior.
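A regression check of this sort reduces to comparing what the pipeline detects on each synthetic chart with a stored expected result. The sketch below assumes both sides are available as a mapping from chart name to a collection of codes; the function name `regression_diff` and that data shape are illustrative.

```python
def regression_diff(expected: dict, actual: dict) -> dict:
    """Compare expected and actual detections per chart.

    Both arguments map chart name -> iterable of detected codes.
    Returns only the charts whose results differ.
    """
    diffs = {}
    for chart in sorted(set(expected) | set(actual)):
        exp = set(expected.get(chart, ()))
        act = set(actual.get(chart, ()))
        if exp != act:
            diffs[chart] = {
                "missing": sorted(exp - act),     # expected but no longer detected
                "unexpected": sorted(act - exp),  # newly detected, not in baseline
            }
    return diffs
```

An empty result means detection behavior is unchanged; a non-empty one has to be either fixed or accepted deliberately by updating the expected results.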
Healthcare is a domain where a confident wrong answer is especially costly. Every design decision in this system reflects that asymmetry.
Detection confidence thresholds are tunable. Below the threshold, a flag is raised—not a binding. Operators can configure threshold levels based on their review capacity and risk tolerance.
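The effect of the threshold on reviewer workload can be seen in a small triage sketch. The `triage` function, the `confidence` field, and the `below_threshold` flag label are assumptions for this example.

```python
def triage(findings: list[dict], bind_threshold: float):
    """Split findings into (bound, flagged) using a configurable threshold."""
    if not 0.0 <= bind_threshold <= 1.0:
        raise ValueError("bind_threshold must be between 0 and 1")
    bound, flagged = [], []
    for finding in findings:
        if finding["confidence"] >= bind_threshold:
            bound.append(finding)
        else:
            # Below the threshold the finding is kept, but only as a flag.
            flagged.append({**finding, "flag": "below_threshold"})
    return bound, flagged
```

Raising `bind_threshold` moves more findings into the flagged list, trading reviewer time for fewer automatic bindings.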
Negation patterns ("ruled out," "no evidence of," "history of, resolved") are modeled explicitly. The system does not report negated conditions as present. Negation detections are logged separately.
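A sentence-level version of this check, covering only the three cue patterns named above, might look like the following. It is a simplified sketch under stated assumptions: the pattern list and the `check_negation` function are illustrative, and treating any cue in the sentence as applying to the condition is cruder than a scoped negation model.

```python
import re

# Only the three cues named in the text; a real cue list would be longer.
NEGATION_PATTERNS = [
    ("ruled_out", re.compile(r"\bruled out\b", re.IGNORECASE)),
    ("no_evidence_of", re.compile(r"\bno evidence of\b", re.IGNORECASE)),
    ("history_resolved",
     re.compile(r"\bhistory of\b[^.]*\bresolved\b", re.IGNORECASE)),
]


def check_negation(sentence: str, condition: str):
    """Classify a condition mention within one sentence.

    Returns None if the condition is not mentioned, otherwise a record
    with status "negated" or "affirmed" and the cue that matched.
    """
    if condition.lower() not in sentence.lower():
        return None
    for name, pattern in NEGATION_PATTERNS:
        if pattern.search(sentence):
            return {"condition": condition, "status": "negated", "cue": name}
    return {"condition": condition, "status": "affirmed", "cue": None}
```

Returning the matched cue alongside the status is what allows negation detections to be logged separately from active findings.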
Every extracted finding includes the source page number and character offset. A reviewer can navigate directly to the supporting text in the original chart. Nothing is asserted without a citable source.
The full pipeline — ingestion, OCR, detection, mapping, reporting — runs inside your infrastructure. No chart data, extracted text, or intermediate results leave your perimeter by default.
PHI-safe synthetic chart packs ship with the system. Releases are validated against these before deployment. Regression-safe by design: detection behavior changes are intentional, not accidental.
CMS-HCC payment year models, detection thresholds, OCR quality flags, and review gate behavior are all configurable by the deploying organization. The system adapts to your policies, not the reverse.
Detection and extraction are AI-assisted. Binding, review, and any downstream use are operator-controlled. The system is designed to be a rigorous first pass, not a final authority. Read our "Governed intelligence, not guesswork" framework.
The system is packaged as a Docker Compose stack. It runs entirely within your infrastructure. All components are customer-managed.
API layer: FastAPI
Workers: Celery
Persistence: PostgreSQL
Queue: Redis
We'll walk through your chart volumes, OCR challenges, and review workflow—and show you what a governed, on-prem deployment looks like in practice.