Skip to content

PACS Pipeline

UC-ING-006: Detect PACS File Drop

Purpose: Watch for new imaging metadata files (Trigger).

Property Value
Actor File Watcher Service
Trigger IN_CLOSE_WRITE event on src/data/pacs/
Priority P0

Main Success Scenario:

1. Watcher subscribes to filesystem events
2. Receives event for `study_123.json`
3. Waits 5 sec debounce (avoid reading during write)
4. Validates filename pattern (`study_<id>.json`)
5. Reads file content into memory
6. Moves file to `processing/` folder (atomic move)
7. Enqueues payload to Validation Queue (`pacs_metadata_validation`)

Alternative Flows:

Alt-1: Read Failure - IO Error during read - Log error with path + errno - Move file to `error/` folder - Emit `pacs_ingest_failed` event
Alt-2: Filename Pattern Invalid - File does not match regex - Move to `quarantine/` - Emit `pacs_filename_invalid` event

Acceptance Criteria: 1. [ ] Detects files within 1s of write completion 2. [ ] No partial/zero-byte files ever enqueued 3. [ ] Files are never processed twice (idempotent on filename)


UC-ING-007: Validate Imaging Metadata

Purpose: Ensure JSON payload is syntactically valid (Shape).

Property Value
Actor Validation Worker
Trigger Job in Validation Queue
Priority P0

Main Success Scenario:

1. Parse JSON payload
2. Validate against `pacs_study.schema.json`
   - Required: `studyId`, `modality`, `patientId`, `assetManifest[]`
3. On success, enqueue job to Resolution Queue (`pacs_asset_resolution`)

Alternative Flows:

Alt-1: Schema Violation - Attach validation error details (field, reason) - Enqueue to `pacs_metadata_dead_letter` queue - Emit `pacs_metadata_invalid` event for ops dashboard

Acceptance Criteria: 1. [ ] Strict JSON schema validation (AJV / similar) 2. [ ] Invalid payloads are never sent to Resolution Queue 3. [ ] All validation failures include human-readable error messages


UC-ING-008: Resolve Imaging Assets

Purpose: Verify file existence and compute completeness (File-Level Sanity).

Property Value
Actor Resolution Worker
Trigger Job in Resolution Queue
Priority P0

Main Success Scenario:

1. Iterate over `assetManifest[]`
2. For each asset:
   - Check file exists on disk / S3
   - Verify size > 0
   - (Optional) Basic DICOM sanity check (StudyInstanceUID)
3. Compute:
   - `foundCount`, `missingCount`
   - `completenessScore = foundCount / total`
4. Set status:
   - `Complete` if score = 1.0
   - `Partial` if 0 < score < 1.0
   - `Failed` if score = 0
5. Enqueue Study Resolution Result to `pacs_imaging_update` queue

Alternative Flows:

Alt-1: Storage Unreachable - S3 / NAS down - Do not mark study as Failed - Move job to retry with exponential backoff (max 5 retries) - Emit `pacs_storage_unavailable` event

Acceptance Criteria: 1. [ ] All referenced files are checked exactly once per run 2. [ ] Missing files do not crash the worker; they're counted and surfaced 3. [ ] Resolution is retried on transient storage errors