## Quality & Safety Use Cases (QAS)

### UC-QAS-001: Record Model Failures
Purpose: Log AI failures (hallucinations, crashes) for review.
| Property | Value |
|---|---|
| Actor | QA Monitoring Service |
| Trigger | NLP/ASR error or doctor rejection |
| Priority | P0 |
Main Success Scenario:
1. Detect failure: crash, low confidence, doctor rejection
2. Capture failure context (input, output, model version)
3. Store in failures database
4. Emit alert if failure rate exceeds threshold
5. Create review task for QA team
Acceptance Criteria:
1. [ ] All model errors logged
2. [ ] Failure analysis dashboard available
3. [ ] Alerts trigger at 5% failure rate
### UC-QAS-002: Perform Clinical Safety Review
Purpose: Manual review of high-risk AI outputs.
| Property | Value |
|---|---|
| Actor | Clinical Safety Officer |
| Trigger | Random sampling or flagged case |
| Priority | P0 |
Main Success Scenario:
1. Reviewer accesses safety review queue
2. Compare AI output against source audio/transcript
3. Check for clinical accuracy
4. Flag clinical errors (e.g., medication errors, diagnosis errors)
5. Provide feedback to ML team
6. Approve or escalate case
Acceptance Criteria:
1. [ ] 5% random sampling of all encounters
2. [ ] Critical errors escalated within 24h
3. [ ] Review metrics tracked
### UC-QAS-003: Track Audit Violations
Purpose: Monitor compliance violations (access, consent).
| Property | Value |
|---|---|
| Actor | Audit Monitor Service |
| Trigger | Continuous audit log analysis |
| Priority | P1 |
Main Success Scenario:
1. Monitor audit logs for violations:
- Unauthorized access attempts
- Consent violations
- Data export anomalies
2. Flag violations in compliance dashboard
3. Alert security team for critical violations
4. Generate monthly compliance report
Acceptance Criteria:
1. [ ] Real-time violation detection
2. [ ] HIPAA/DPDP compliance checks
3. [ ] Audit-ready reporting
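Step 1's rule-based scan of the audit log might be structured as a table of named predicates, one per violation class. The event field names (`action`, `authorized`, `consent`, `record_count`) and the export-anomaly threshold are illustrative assumptions, not the real audit schema.

```python
# One predicate per violation class from the main success scenario.
VIOLATION_RULES = {
    "unauthorized_access": lambda e: e["action"] == "access" and not e.get("authorized", False),
    "consent_violation":   lambda e: e["action"] == "process" and not e.get("consent", False),
    "export_anomaly":      lambda e: e["action"] == "export" and e.get("record_count", 0) > 1000,
}

def scan_audit_log(events):
    """Return (event, rule_name) pairs for every violation found."""
    violations = []
    for event in events:
        for name, rule in VIOLATION_RULES.items():
            if rule(event):
                violations.append((event, name))
    return violations
```

Keeping rules in a dict makes it easy to add new compliance checks (e.g., DPDP-specific ones) without touching the scan loop; in production the scan would run continuously over the audit stream rather than over a batch.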
### UC-QAS-004: Detect Model Drift
Purpose: Detect degradation in ASR/NLP performance.
| Property | Value |
|---|---|
| Actor | ML Ops Service |
| Trigger | Weekly performance analysis |
| Priority | P1 |
Main Success Scenario:
1. Compare current week metrics vs baseline:
- ASR Word Error Rate (WER)
- NLP F1 scores
- Doctor edit frequency
2. If drift detected (>5% degradation):
- Alert ML team
- Trigger model retraining evaluation
3. Log drift metrics
Acceptance Criteria:
1. [ ] Drift detection within 1 week
2. [ ] Automated alerts configured
3. [ ] Historical metrics tracked
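The weekly baseline comparison can be sketched as below. Note the direction of "degradation" differs per metric: WER and doctor edit frequency get worse as they rise, while F1 gets worse as it falls. The metric keys and function name are assumptions; the >5% relative-degradation threshold comes from the scenario above.

```python
DRIFT_THRESHOLD = 0.05  # alert when a metric degrades by more than 5% vs baseline

# For WER and edit frequency, higher is worse; for F1, lower is worse.
HIGHER_IS_WORSE = {"asr_wer": True, "nlp_f1": False, "edit_frequency": True}

def detect_drift(baseline, current):
    """Compare this week's metrics to the baseline; return the metrics whose
    relative degradation exceeds the threshold, with their degradation values."""
    drifted = {}
    for metric, base in baseline.items():
        delta = (current[metric] - base) / base       # signed relative change
        degradation = delta if HIGHER_IS_WORSE[metric] else -delta
        if degradation > DRIFT_THRESHOLD:
            drifted[metric] = degradation
    return drifted
```

A non-empty return would trigger the ML-team alert and the retraining evaluation from step 2; the full `degradation` values would be logged either way to satisfy the historical-metrics criterion.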