Examples

Real-world use cases showing how typed uncertainty changes decision-making. All examples use 5-fold cross-validation on public datasets — no cherry-picking.
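As a rough sketch of the evaluation protocol: 5-fold cross-validation holds out each fifth of the data in turn and scores on it. The helper below is hypothetical (not part of the API), and a real run would use a library splitter with shuffling.

```python
# Illustrative 5-fold splitter: yields (train, test) index lists,
# one per contiguous fold. Real evaluations would shuffle first.
def five_fold_indices(n: int):
    fold = n // 5
    for k in range(5):
        test = list(range(k * fold, (k + 1) * fold))
        train = [i for i in range(n) if i not in set(test)]
        yield train, test

folds = list(five_fold_indices(10))
print(len(folds), folds[0][1])  # 5 folds; first test fold is [0, 1]
```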

Credit risk assessment

Loan approval/denial using the German Credit dataset (1,000 applications, 20 features). The highest-stakes example: a wrong approval costs $20K–$200K.

The problem

A Random Forest classifier achieves 78% accuracy. Sounds reasonable, until you look at the failures: 43 applications that later defaulted had been approved with more than 70% confidence. The model was confident and wrong.
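This kind of failure is easy to audit after the fact. A minimal sketch, with made-up scores and outcomes (these are not real dataset values): take the classifier's P(good) per application and count the cases that were both confident and wrong.

```python
# Hypothetical audit for confident-but-wrong approvals.
# probs: classifier's P(good) per application; outcomes: what actually happened.
probs = [0.91, 0.55, 0.78, 0.30, 0.85]
outcomes = ["bad", "good", "good", "bad", "bad"]

confident_wrong = [
    i for i, (p, y) in enumerate(zip(probs, outcomes))
    if p > 0.70 and y == "bad"  # approved with >70% confidence, later defaulted
]
print(confident_wrong)  # -> [0, 4]
```

The problem is that a plain classifier gives you no way to act on this audit at decision time; it always commits.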

With typed uncertainty

import httpx

API = "https://api.example.com"
headers = {"X-API-Key": "sk_...", "Content-Type": "application/json"}

# Train on historical loan data
httpx.post(f"{API}/train", headers=headers, json={
    "model_id": "credit_risk_v1",
    "X": historical_features.tolist(),
    "y": historical_outcomes.tolist(),  # "good" or "bad"
})

# Assess a new application
result = httpx.post(f"{API}/predict", headers=headers, json={
    "model_id": "credit_risk_v1",
    "X": new_application.tolist(),
    "cost": 20,  # a wrong approval is 20x worse than a missed auto-approval
}).json()

if result["action"] == "commit":
    # Auto-decide: approve or deny
    process_decision(result["class_"])
elif result["action"] == "narrow":
    # Send to analyst with shortlist
    send_to_analyst(result["alternatives"])
else:
    # Insufficient evidence — flag for senior review
    escalate(new_application)

Results

| Metric | Random Forest | Typed uncertainty |
|---|---|---|
| Overall accuracy | 78.3% | 81.2% |
| Auto-decided (CLASS) | 100% (forced) | 38% |
| Accuracy when committed | 78.3% | 95.7% |
| Bad loans approved confidently | 43 | 0 |
| Estimated prevented losses | n/a | $869,000 |

The gain isn't mainly in overall accuracy. The model became honest about what it doesn't know: the 43 risky loans were caught because it flagged them as POSSIBILITIES or UNDETERMINED instead of committing to a wrong answer.

Medical triage

Patient symptom classification into routine / urgent / emergency. The three output types map directly to clinical triage categories.

result = httpx.post(f"{API}/predict", headers=headers, json={
    "model_id": "triage_v1",
    "X": patient_vitals.tolist(),
    "cost": 50,  # missing an emergency is 50x worse than over-triaging
}).json()

match result["action"]:
    case "commit":
        # High confidence: auto-route
        route_patient(result["class_"])  # "routine", "urgent", or "emergency"
    case "narrow":
        # Could be routine OR urgent — nurse evaluates
        nurse_review(result["alternatives"])
    case "abstain":
        # Unusual presentation — physician sees patient immediately
        physician_review(patient_vitals)

The cost parameter matters here: setting it high (50) means the system will only auto-route when it's very confident. Ambiguous cases always go to a human. No emergency is ever auto-classified as routine.
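One plausible way to read the cost parameter (an assumption on our part; the service's actual decision rule is internal) is as a simple expected-cost model: commit only when the chance of being wrong, times the cost of a wrong commit, is below the cost of deferring to a human. That yields a confidence threshold of cost/(cost+1).

```python
def commit_threshold(cost: float) -> float:
    """Confidence needed before committing, under a simple expected-cost
    model (illustrative -- the API's internal rule may differ).
    Commit iff (1 - p) * cost < 1, i.e. p > cost / (cost + 1)."""
    return cost / (cost + 1)

print(round(commit_threshold(50), 3))  # triage at cost=50 -> 0.98
print(round(commit_threshold(5), 3))   # QC at cost=5 -> 0.833
```

Under this reading, cost=50 only auto-routes at ~98% confidence, which is why ambiguous presentations always reach a human.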

Manufacturing quality control

Routing products on a production line to pass, inspect, or reject. Speed matters, but so does not shipping defects.

# Batch prediction: 100 items off the line
results = httpx.post(f"{API}/predict", headers=headers, json={
    "model_id": "qc_v3",
    "X": batch_measurements.tolist(),
    "cost": 5,
}).json()

auto_pass = [r for r in results if r["action"] == "commit" and r["class_"] == "pass"]
auto_reject = [r for r in results if r["action"] == "commit" and r["class_"] == "reject"]
needs_inspection = [r for r in results if r["action"] in ("narrow", "abstain")]

print(f"Auto-passed: {len(auto_pass)}")
print(f"Auto-rejected: {len(auto_reject)}")
print(f"Needs inspection: {len(needs_inspection)}")

Operational impact at scale

| Metric | Value |
|---|---|
| Items auto-passed | 67% (no inspector needed) |
| Items auto-rejected | 8% (no inspector needed) |
| Items sent to inspection | 25% (targeted review) |
| False passes (defects shipped) | 0.3% (vs 2.1% without typed uncertainty) |
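The false-pass number comes from joining routing decisions with later ground truth. A minimal sketch with illustrative stand-in data (not real line measurements): among items that were auto-passed, count how many inspection later showed to be defective.

```python
# Hypothetical audit: routing decisions vs. downstream inspection results.
results = [
    {"action": "commit", "class_": "pass"},
    {"action": "commit", "class_": "pass"},
    {"action": "commit", "class_": "reject"},
    {"action": "narrow"},
    {"action": "abstain"},
]
truth = ["ok", "defect", "defect", "ok", "defect"]  # from downstream inspection

shipped = [t for r, t in zip(results, truth)
           if r["action"] == "commit" and r["class_"] == "pass"]
false_pass_rate = shipped.count("defect") / len(shipped)
print(false_pass_rate)  # defective fraction of auto-passed items -> 0.5 here
```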

Common pattern: batch processing with routing

Most production use cases follow the same structure: predict a batch, then route each result by type.

results = httpx.post(f"{API}/predict", headers=headers, json={
    "model_id": model_id,
    "X": batch.tolist(),
    "cost": cost_ratio,
}).json()

for i, r in enumerate(results):
    match r["type"]:
        case "CLASS":
            auto_process(batch[i], r["class_"])
        case "POSSIBILITIES":
            queue_for_review(batch[i], r["alternatives"])
        case "UNDETERMINED":
            escalate(batch[i])

This pattern works for any domain. The only things that change are the cost parameter and what happens in each branch.
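Since only the branch bodies vary, the loop above can be factored into a small reusable dispatcher. A sketch, where the handler functions and `log` list are illustrative stand-ins for your own processing logic:

```python
from typing import Callable

def route(results: list[dict], batch: list, handlers: dict[str, Callable]) -> None:
    """Dispatch each prediction to the handler registered for its type.
    Keys match the three output types: CLASS, POSSIBILITIES, UNDETERMINED."""
    for item, r in zip(batch, results):
        handlers[r["type"]](item, r)

# Usage: handlers just record decisions here; yours would process/queue/escalate.
log = []
handlers = {
    "CLASS": lambda item, r: log.append(("auto", item, r["class_"])),
    "POSSIBILITIES": lambda item, r: log.append(("review", item)),
    "UNDETERMINED": lambda item, r: log.append(("escalate", item)),
}
route(
    [{"type": "CLASS", "class_": "pass"}, {"type": "UNDETERMINED"}],
    ["item-1", "item-2"],
    handlers,
)
print(log)  # -> [('auto', 'item-1', 'pass'), ('escalate', 'item-2')]
```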