ASR Pipeline

UC-PROC-005: Queue ASR Job

Purpose: Schedule audio transcription.

Property	Value
Actor	API Server
Trigger	`POST /api/upload/audio`
Priority	P1

Main Success Scenario:

1. Validate file (MP3/WAV, < 50MB)
2. Upload to S3 `audio/`
3. Push to `asr-queue`
4. Return HTTP 202 Accepted
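
The validation in step 1 could look roughly like the sketch below. It is illustrative only: `validate_upload`, `MAX_BYTES`, and `ALLOWED_EXTENSIONS` are hypothetical names, "50MB" is assumed to mean 50 MiB, and the check looks at the file extension only (a real service would probably also inspect the file contents).

```python
from pathlib import Path

MAX_BYTES = 50 * 1024 * 1024           # assumption: "50MB" means 50 MiB
ALLOWED_EXTENSIONS = {".mp3", ".wav"}  # formats named in step 1


def validate_upload(filename: str, size_bytes: int) -> tuple[bool, str]:
    """Return (ok, reason) for an uploaded audio file (hypothetical helper)."""
    ext = Path(filename).suffix.lower()
    if ext not in ALLOWED_EXTENSIONS:
        return False, f"unsupported format: {ext or 'none'}"
    if size_bytes <= 0:
        return False, "empty file"
    if size_bytes >= MAX_BYTES:  # step 1 requires strictly less than 50MB
        return False, "file must be smaller than 50MB"
    return True, "ok"
```

Only when the check passes would the server continue with the S3 upload and the push to `asr-queue`; a failed check would be answered with a client error instead of HTTP 202.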

Acceptance Criteria:

[ ] Accepts common audio formats (MP3, WAV)
[ ] Rejects files > 50MB

UC-PROC-006: Execute Whisper Engine

Purpose: Run AI speech-to-text inference.

Property	Value
Actor	GPU Worker
Trigger	Job in `asr-queue`
Priority	P1

Main Success Scenario:

1. Load Whisper model (Large-v3)
   - Keep model loaded in memory if possible (warm start)
2. Run inference: `model.transcribe(audio_path)`
3. Extract `text` and `segments` (timestamps)
4. Update Patient Bundle `transcripts` array
5. Update Job status
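
A minimal sketch of steps 1 to 3, assuming the worker uses the open-source `openai-whisper` Python package. `get_model` and `extract_transcript` are hypothetical helper names; caching the loaded model per process is one possible way to get a warm start, not necessarily how the worker does it.

```python
from functools import lru_cache


@lru_cache(maxsize=1)
def get_model(name: str = "large-v3"):
    """Load the model once per worker process and reuse it (warm start)."""
    import whisper  # openai-whisper, imported lazily so the module loads without a GPU
    return whisper.load_model(name)


def extract_transcript(result: dict) -> dict:
    """Keep the full text plus per-segment timestamps from a transcribe() result."""
    return {
        "text": result["text"].strip(),
        "segments": [
            {"start": seg["start"], "end": seg["end"], "text": seg["text"].strip()}
            for seg in result["segments"]
        ],
    }


def transcribe(audio_path: str) -> dict:
    """Run inference on one audio file and return the trimmed transcript."""
    return extract_transcript(get_model().transcribe(audio_path))
```

Updating the Patient Bundle and the job status (steps 4 and 5) depends on the storage layer and is left out here.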

Observability:

Metric: `asr_inference_time_seconds`
Log: `{"event": "asr_complete", "duration": 4.5, "audio_len": 30}`

Acceptance Criteria:

[ ] Inference time < Audio duration (Real-time factor < 1)
[ ] Preserves timestamps for word alignment
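
The real-time factor in the first criterion can be derived from the `asr_complete` log event. The sketch below assumes `duration` is the inference time and `audio_len` is the audio length, both in seconds; the spec does not state the units, and `real_time_factor` is a hypothetical helper.

```python
import json


def real_time_factor(log_line: str) -> float:
    """Inference time divided by audio length; values below 1 are faster than real time."""
    event = json.loads(log_line)
    return event["duration"] / event["audio_len"]


line = '{"event": "asr_complete", "duration": 4.5, "audio_len": 30}'
rtf = real_time_factor(line)
print(rtf < 1)  # True: 4.5 s of inference for 30 s of audio
```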

UC-PROC-011: Identify Speaker Turns

Purpose: Separate clinician vs patient speech for downstream analytics.

Property	Value
Actor	Diarization Worker
Trigger	Whisper segments available
Priority	P1

Main Success Scenario:

1. Convert audio to 16kHz mono if needed
2. Run pyannote diarization to assign speaker labels per time slice
3. Merge adjacent slices that share a label and are separated by a gap of < 500ms
4. Map speakers to roles (Clinician, Patient, Caregiver) using heuristic keyword detection
5. Update transcript segments with `speakerRole` and `confidence`
6. Emit `diarization_latency_seconds`
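
The merge in step 3 might be sketched as follows. `Slice` and `merge_slices` are hypothetical names, times are assumed to be in seconds, and the input is assumed to be the per-slice speaker labels already extracted from the diarization output.

```python
from dataclasses import dataclass


@dataclass
class Slice:
    start: float   # seconds
    end: float     # seconds
    speaker: str   # raw diarization label, e.g. "SPEAKER_00"


def merge_slices(slices: list[Slice], max_gap: float = 0.5) -> list[Slice]:
    """Merge consecutive slices with the same speaker when the gap is below max_gap."""
    merged: list[Slice] = []
    for s in sorted(slices, key=lambda x: x.start):
        prev = merged[-1] if merged else None
        if prev and prev.speaker == s.speaker and s.start - prev.end < max_gap:
            prev.end = max(prev.end, s.end)  # extend the previous turn
        else:
            merged.append(Slice(s.start, s.end, s.speaker))  # copy, keep input intact
    return merged
```

Sorting by start time first means the input does not have to arrive in order, and copying each slice keeps the caller's list unchanged.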

Acceptance Criteria:

[ ] Supports stereo and mono inputs
[ ] Speaker-role accuracy > 85% on benchmark call set
[ ] Provides override endpoint for manual relabeling

UC-PROC-012: Generate Encounter Note

Purpose: Produce a structured SOAP note draft from diarized transcripts.

Property	Value
Actor	Note Composer Service
Trigger	Diarization complete
Priority	P2

Main Success Scenario:

1. Split transcript into sections by speaker role
2. Prompt LLM with template (Subjective, Objective, Assessment, Plan)
3. Extract medications, vitals, and orders into structured JSON
4. Populate Note entity with draft text + structured payload
5. Send notification to clinician for review/attestation
6. Persist revision history for legal traceability
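
One way to assemble the prompt for step 2 while keeping the timestamps needed for citations is sketched below. The prompt wording and the helpers `format_timestamp` and `build_soap_prompt` are hypothetical; the segment keys `start` and `text` are assumed to carry over from the ASR output, and `speakerRole` comes from the diarization step. The LLM call itself is omitted.

```python
SOAP_SECTIONS = ("Subjective", "Objective", "Assessment", "Plan")


def format_timestamp(seconds: float) -> str:
    """Render a segment start time as mm:ss."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def build_soap_prompt(segments: list[dict]) -> str:
    """Build an LLM prompt from diarized segments, one timestamped line per segment."""
    lines = [
        f"[{format_timestamp(seg['start'])}] {seg['speakerRole']}: {seg['text']}"
        for seg in segments
    ]
    sections = ", ".join(SOAP_SECTIONS)
    return (
        f"Draft a SOAP note with the sections {sections}.\n"
        "Cite the [mm:ss] timestamp of each supporting transcript line.\n\n"
        "Transcript:\n" + "\n".join(lines)
    )
```

Prefixing every line with its timestamp gives the model something concrete to cite, which supports the acceptance criterion on citations back to the transcript.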

Acceptance Criteria:

[ ] Draft clearly labeled "Auto-generated"
[ ] Captures citations back to transcript timestamps
[ ] Provides API to reject or accept draft with comments