
High-Level Architecture

System Overview

Entheory.AI is built as a modular, event-driven architecture designed for:

  • Scalability: Handle 10,000+ oncology patients per instance
  • Interoperability: Ingest from multiple heterogeneous hospital systems
  • Resilience: Zero data loss, graceful degradation
  • Extensibility: Easy to add new data sources and processing pipelines


Architecture Diagram

┌────────────────────────────────────────────────────────────────────┐
│                         PHYSICIAN LAYER                             │
│                                                                      │
│   ┌─────────────────────────────────────────────────────────┐     │
│   │  React Web Application (TypeScript)                      │     │
│   │  • Patient List  • Timeline  • Labs  • Imaging View     │     │
│   │  • Responsive Design  • Real-time Updates               │     │
│   └─────────────────────────────────────────────────────────┘     │
└────────────────────────────────────────────────────────────────────┘
                              │  HTTPS/REST
                              ↓
┌────────────────────────────────────────────────────────────────────┐
│                         API GATEWAY LAYER                           │
│                                                                      │
│   ┌──────────────┐      ┌──────────────┐      ┌──────────────┐   │
│   │ REST APIs    │      │ Job Status   │      │ FHIR Export  │   │
│   │ /api/patients│      │ /api/jobs    │      │ ?format=fhir │   │
│   │ /api/upload  │      │              │      │              │   │
│   └──────────────┘      └──────────────┘      └──────────────┘   │
│                 Authentication & RBAC (JWT)                         │
└────────────────────────────────────────────────────────────────────┘
                              │
                              ↓
┌────────────────────────────────────────────────────────────────────┐
│                         APPLICATION LAYER                           │
│                                                                      │
│   ┌─────────────────────┐         ┌─────────────────────┐         │
│   │  Query Service      │         │  Command Service     │         │
│   │  • Read patient data│         │ • Ingest data        │         │
│   │  • Cache layer      │         │  • Update bundles    │         │
│   │  • FHIR generation  │         │  • Job management    │         │
│   └─────────────────────┘         └─────────────────────┘         │
└────────────────────────────────────────────────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        ↓                     ↓                      ↓
┌──────────────────┐   ┌──────────────────┐   ┌──────────────────┐
│ CANONICAL BUNDLES│   │  MESSAGE QUEUES  │   │  OBJECT STORAGE  │
│  (JSON Files)    │   │  (RabbitMQ/SQS)  │   │  (S3/MinIO)      │
│                  │   │                  │   │                  │
│ Per-patient JSON │   │ • Ingestion queue│   │ • PDFs           │
│ bundle.json      │   │ • OCR queue      │   │ • Audio files    │
│                  │   │ • ASR queue      │   │ • DICOM images   │
│ processed_       │   │ • DLQ (errors)   │   │                  │
│ patients.json    │   │                  │   │                  │
└──────────────────┘   └──────────────────┘   └──────────────────┘
        ↑                     ↑                      ↑
        └─────────────────────┴──────────────────────┘
                              │
┌────────────────────────────────────────────────────────────────────┐
│                     INGESTION & PROCESSING LAYER                    │
│                                                                      │
│  ┌──────────────┐  ┌────────────┐  ┌─────────────┐  ┌─────────────┐  │
│  │ HL7 Listener │  │ File       │  │ OCR         │  │ ASR Worker  │  │
│  │ (MLLP)       │  │ Watchers   │  │ Worker      │  │ (Whisper)   │  │
│  │ • ADT        │  │ • PACS     │  │ (Tesseract) │  │ • Audio     │  │
│  │ • ORU (Labs) │  │ • Genomics │  │ • Eng+Hindi │  │ • Eng+Hindi │  │
│  └──────────────┘  └────────────┘  └─────────────┘  └─────────────┘  │
└────────────────────────────────────────────────────────────────────┘
        ↑                ↑                ↑                ↑
        │                │                │                │
┌───────┴────────────────┴────────────────┴────────────────┴────────┐
│                     EXTERNAL HOSPITAL SYSTEMS                       │
│                                                                      │
│  ┌───────────┐  ┌───────────┐  ┌──────────┐  ┌──────────────┐        │
│  │ EMR/HIMS  │  │    LIS    │  │   PACS   │  │ Genomics Lab │        │
│  │ (HL7 ADT) │  │ (HL7 ORU) │  │ (JSON)   │  │ (JSON)       │        │
│  └───────────┘  └───────────┘  └──────────┘  └──────────────┘        │
└────────────────────────────────────────────────────────────────────┘

Component Details

1. Frontend (React Web App)

Tech Stack:

  • React 18 + TypeScript
  • React Router for navigation
  • Recharts for visualizations
  • Axios for API calls

Key Components:

  • PatientList.tsx - Search and select patients
  • PatientOverview.tsx - Summary view
  • Timeline.tsx - Longitudinal event timeline
  • LabsView.tsx - Lab results with trends
  • ImagingView.tsx - Imaging studies
  • NotesView.tsx - Multilingual clinical notes

State Management:

  • React Context for global state (current patient)
  • Local component state for UI interactions

Performance:

  • Code splitting per route
  • Lazy loading for large data tables
  • Caching API responses (stale-while-revalidate)


2. API Gateway

Framework: Express.js / FastAPI

Responsibilities:

  • Route HTTP requests to the appropriate services
  • JWT authentication and RBAC enforcement
  • Rate limiting and request validation
  • CORS handling

Key Endpoints:

GET  /api/patients                    # List patients
GET  /api/patients/:abhaId             # Get patient by ID
GET  /api/patients/:abhaId?format=fhir # Get FHIR bundle
POST /api/upload/document              # Upload for OCR
POST /api/upload/audio                 # Upload for ASR
POST /api/ingest/fhir                  # Ingest FHIR bundle
POST /api/ingest/hl7                   # Ingest HL7 message
GET  /api/jobs/:jobId                  # Job status
GET  /api/datasources                  # Data source health
GET  /health                           # System health check

3. Application Services

Query Service (Read Path)

Purpose: Fast reads for UI

Components:

  • Cache Layer: Redis or in-memory cache for processed_patients.json
  • Bundle Loader: Reads canonical patient bundles
  • FHIR Transformer: Converts bundles to FHIR R4 on demand

Optimization:

  • Target cache hit rate >90% for common queries
  • Precompute aggregations (test count, event count)
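The read path can be illustrated with a minimal sketch of a read-through, in-memory bundle cache. It assumes entries are invalidated by comparing file modification times; `BundleCache` and its methods are hypothetical names, not part of the documented codebase, and a Redis-backed variant would follow the same shape.

```python
import json
import os
from pathlib import Path


class BundleCache:
    """In-memory read-through cache keyed by ABHA ID (hypothetical helper)."""

    def __init__(self, root: Path):
        self.root = root
        self._entries = {}  # abha_id -> (mtime_ns, bundle)
        self.hits = 0
        self.misses = 0

    def get(self, abha_id: str) -> dict:
        path = self.root / abha_id / "bundle.json"
        mtime = os.stat(path).st_mtime_ns
        cached = self._entries.get(abha_id)
        if cached is not None and cached[0] == mtime:
            self.hits += 1
            return cached[1]
        # Miss or stale entry: reload from disk and remember the new mtime.
        self.misses += 1
        bundle = json.loads(path.read_text(encoding="utf-8"))
        self._entries[abha_id] = (mtime, bundle)
        return bundle

    def hit_rate(self) -> float:
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

The `hit_rate()` counter is one way the >90% target could be tracked; in practice it would more likely be exported as a metric.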


Command Service (Write Path)

Purpose: Data ingestion and bundle updates

Components:

  • HL7 Parser: Parse ADT, ORU messages
  • FHIR Parser: Validate and extract FHIR resources
  • JSON Ingester: Process file-based feeds
  • Bundle Updater: Atomically update patient bundles
  • Job Manager: Track async processing jobs

Guarantees:

  • Atomicity: Bundle updates are atomic (write temp, then rename)
  • Durability: All writes persisted to disk before ACK
  • Idempotency: Duplicate messages don't create duplicate data
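The three guarantees can be sketched as follows. This is a minimal illustration, not the actual Bundle Updater: the function names are hypothetical, and using a `messageId` field as the deduplication key is an assumption.

```python
import json
import os
import tempfile
from pathlib import Path


def write_bundle_atomically(path: Path, bundle: dict) -> None:
    """Write to a temp file in the target directory, fsync, then rename over the target."""
    path.parent.mkdir(parents=True, exist_ok=True)
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(bundle, tmp, ensure_ascii=False)
            tmp.flush()
            os.fsync(tmp.fileno())  # durability: data reaches disk before the rename
        os.replace(tmp_name, path)  # atomic when both paths are on the same filesystem
    except BaseException:
        if os.path.exists(tmp_name):
            os.unlink(tmp_name)
        raise


def append_lab_once(bundle: dict, lab: dict) -> bool:
    """Append a lab result unless one with the same messageId (assumed key) exists."""
    seen = {entry.get("messageId") for entry in bundle.setdefault("labs", [])}
    if lab["messageId"] in seen:
        return False  # duplicate delivery, nothing to do
    bundle["labs"].append(lab)
    return True
```

Creating the temp file in the same directory as the target matters: a rename across filesystems is not atomic, so readers could otherwise observe a partially written bundle.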


4. Data Storage

Canonical Patient Bundles

Format: One JSON file per patient

Path: src/data/patients/<abhaId>/bundle.json

Schema:

{
  "patientId": "case_001",
  "abhaId": "ABHA-12345678901",
  "demographics": { ... },
  "cancer": { "site": "Breast", "stage": "IIB", ... },
  "labs": [ ... ],
  "imaging": [ ... ],
  "pathology": [ ... ],
  "genomics": [ ... ],
  "therapy": [ ... ],
  "medications": [ ... ],
  "documents": [ ... ],  // OCR outputs
  "transcripts": [ ... ], // ASR outputs
  "provenance": { "lastUpdated": "...", "sources": [...] }
}

Why JSON files:

  • Easy to inspect and debug
  • Version control friendly
  • No database schema migrations
  • Simple backup (file copy)

Future: May migrate to PostgreSQL + JSONB for better query performance at scale.


Processed Cache

Format: Single aggregated JSON

Path: src/data/processed_patients.json

Purpose:

  • Fast patient list queries
  • Precomputed summaries (test count, latest vitals)
  • Reduce bundle I/O for list views

Regeneration:

  • Triggered after every bundle update
  • Incremental update (only changed patients)
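A minimal sketch of the incremental regeneration, assuming the cache is a mapping from ABHA ID to a per-patient summary. The summary field names (`cancerSite`, `testCount`) are illustrative, not the documented cache schema.

```python
def summarize(bundle: dict) -> dict:
    """Build the list-view summary for one patient (field names are illustrative)."""
    return {
        "abhaId": bundle["abhaId"],
        "cancerSite": bundle.get("cancer", {}).get("site"),
        "testCount": len(bundle.get("labs", [])),
        "lastUpdated": bundle.get("provenance", {}).get("lastUpdated"),
    }


def update_cache(cache: dict, changed_bundles: list) -> dict:
    """Return a new cache in which only the changed patients' summaries are replaced."""
    updated = dict(cache)
    for bundle in changed_bundles:
        updated[bundle["abhaId"]] = summarize(bundle)
    return updated
```

Summaries for unchanged patients are carried over as-is, so the cost of an update depends on the number of changed bundles rather than on the total patient count.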


Object Storage (S3/MinIO)

Stores:

  • Uploaded PDFs (OCR input)
  • Audio files (ASR input)
  • DICOM images (if downloaded from PACS)
  • Original HL7 messages (for audit/debugging)

Organization:

s3://entheory-hospital1/
  ├─ documents/
  │    └─ <patientId>/
  │         └─ doc_<timestamp>.pdf
  ├─ audio/
  │    └─ <patientId>/
  │         └─ audio_<timestamp>.mp3
  ├─ hl7/
  │    └─ <date>/
  │         └─ oru_<messageId>.txt
  └─ dicom/
       └─ <studyId>/
            └─ <seriesId>/
                 └─ <instanceId>.dcm
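The key layout above can be expressed as small builder functions. This is a sketch only: the function names are hypothetical, and the timestamp and date formats are assumptions, since the layout shows only `<timestamp>` and `<date>` placeholders.

```python
from datetime import datetime, timezone


def document_key(patient_id: str, uploaded_at: datetime) -> str:
    """Key for an uploaded PDF; compact UTC timestamp format is an assumption."""
    stamp = uploaded_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"documents/{patient_id}/doc_{stamp}.pdf"


def audio_key(patient_id: str, uploaded_at: datetime) -> str:
    """Key for an uploaded audio file."""
    stamp = uploaded_at.astimezone(timezone.utc).strftime("%Y%m%dT%H%M%SZ")
    return f"audio/{patient_id}/audio_{stamp}.mp3"


def hl7_key(message_id: str, received_at: datetime) -> str:
    """Key for an archived ORU message; ISO date folder is an assumption."""
    day = received_at.astimezone(timezone.utc).date().isoformat()
    return f"hl7/{day}/oru_{message_id}.txt"


def dicom_key(study_id: str, series_id: str, instance_id: str) -> str:
    """Key for a DICOM instance, nested by study and series."""
    return f"dicom/{study_id}/{series_id}/{instance_id}.dcm"
```

Keeping key construction in one place makes it harder for workers and the API to drift apart on naming.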


5. Message Queues

Technology: RabbitMQ (preferred) or AWS SQS

Queues:

Queue Name               Purpose                Consumer
lab-ingestion-queue      HL7 ORU lab messages   Lab ingestion worker
imaging-ingestion-queue  PACS JSON feeds        Imaging worker
ocr-processing-queue     Documents to OCR       OCR worker (Tesseract)
asr-processing-queue     Audio to transcribe    ASR worker (Whisper)
hl7-dlq                  Failed HL7 messages    Manual review
ocr-dlq                  Failed OCR jobs        Manual review
asr-dlq                  Failed ASR jobs        Manual review

Guarantees:

  • At-least-once delivery
  • Visibility timeout: 5 minutes (if a worker crashes mid-job, the message is requeued)
  • Dead letter queue for repeated failures (after 3 retries)
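The retry and dead-letter behavior can be simulated in memory. This sketch does not use RabbitMQ or SQS; it assumes "after 3 retries" means one initial attempt plus three retries, and the `attempts` and `error` fields are hypothetical.

```python
from collections import deque

MAX_RETRIES = 3  # retries after the initial attempt


def drain(queue: deque, dlq: list, handler) -> None:
    """Process messages until the queue is empty, dead-lettering repeated failures."""
    while queue:
        message = queue.popleft()
        message["attempts"] = message.get("attempts", 0) + 1
        try:
            handler(message["body"])
        except Exception as exc:
            if message["attempts"] > MAX_RETRIES:
                message["error"] = str(exc)
                dlq.append(message)    # give up: park for manual review
            else:
                queue.append(message)  # requeue for another attempt
```

Because delivery is at-least-once, a handler may see the same message more than once, so it needs to be idempotent.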


6. Processing Workers

OCR Worker

Technology: Tesseract 5.x

Process:

  1. Dequeue job from ocr-processing-queue
  2. Download PDF from S3
  3. Detect language (langdetect)
  4. Invoke Tesseract on each page image (Tesseract reads image files, so PDF pages are rasterized first): tesseract page.png output -l <eng|hin> --oem 3 --psm 1
  5. Extract text and confidence
  6. Update patient bundle with extracted text
  7. Validate bundle, regenerate cache
  8. Mark job completed or failed

Parallelism: 10 workers, each processing 1 document at a time
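The Tesseract invocation and confidence extraction can be sketched as below. It assumes PDF pages have already been rasterized to images, since Tesseract reads image files rather than PDFs, and that confidence is taken as the mean word confidence from Tesseract's TSV output. Both helper names are hypothetical, and how the real worker derives confidence is not stated in this document.

```python
import csv
import io

# Map bundle language codes to Tesseract language packs.
LANG_PACKS = {"en": "eng", "hi": "hin"}


def tesseract_command(image_path: str, output_base: str, language: str) -> list:
    """Build the argv for one page; the trailing 'tsv' config requests TSV output."""
    return ["tesseract", image_path, output_base,
            "-l", LANG_PACKS[language], "--oem", "3", "--psm", "1", "tsv"]


def mean_confidence(tsv_text: str) -> float:
    """Average the per-word 'conf' column (0-100) and scale it to 0-1."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t",
                            quoting=csv.QUOTE_NONE)
    scores = []
    for row in reader:
        text = (row.get("text") or "").strip()
        conf = float(row["conf"])
        if text and conf >= 0:  # layout rows carry conf = -1 and no text
            scores.append(conf)
    return round(sum(scores) / len(scores) / 100, 2) if scores else 0.0
```

The argv list can be passed to `subprocess.run(..., check=True)`; Tesseract then writes `<output_base>.tsv`, which is read and handed to `mean_confidence`.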


ASR Worker

Technology: OpenAI Whisper (large-v3)

Process:

  1. Dequeue job from asr-processing-queue
  2. Download audio from S3
  3. Invoke Whisper: whisper audio.mp3 --model large-v3 --language <en|hi>
  4. Extract transcript with timestamps
  5. Update patient bundle
  6. Validate and cache
  7. Mark job completed/failed

Parallelism: GPU-accelerated, 2-3 concurrent jobs (memory bound)


7. External Integrations

HL7 v2 Listener

Technology: MLLP (Minimal Lower Layer Protocol) TCP listener

Port: 2575 (configurable)

Message Types:

  • ADT^A01 - Admission
  • ADT^A03 - Discharge
  • ORU^R01 - Lab results

Flow:

Hospital LIS → TCP/MLLP → HL7 Listener → Parse → Enqueue → ACK

Error Handling:

  • Malformed messages: Return NACK, log full message
  • Patient not found: Return ACK, move to DLQ for manual resolution
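MLLP framing and the ACK/NACK reply can be sketched as follows. MLLP wraps each message in a start byte (0x0B) and an end sequence (0x1C 0x0D). The ACK builder is a minimal illustration with hypothetical function names: it swaps sender and receiver from the incoming MSH segment and echoes the control ID in MSA-2, and the `"ACK" + control ID` scheme for the reply's own control ID is an assumption.

```python
START_BLOCK = b"\x0b"
END_BLOCK = b"\x1c\x0d"


def mllp_wrap(message: bytes) -> bytes:
    """Frame an HL7 message for transmission over MLLP."""
    return START_BLOCK + message + END_BLOCK


def mllp_unwrap(frame: bytes) -> bytes:
    """Strip MLLP framing, rejecting incomplete frames."""
    if not (frame.startswith(START_BLOCK) and frame.endswith(END_BLOCK)):
        raise ValueError("not a complete MLLP frame")
    return frame[len(START_BLOCK):-len(END_BLOCK)]


def build_ack(incoming: str, code: str, timestamp: str) -> str:
    """Build an ACK (code 'AA') or NACK ('AE' error, 'AR' reject) for a message."""
    msh = incoming.split("\r")[0].split("|")
    # After splitting on '|', msh[1] is MSH-2, msh[2] is MSH-3, ..., msh[9] is MSH-10.
    control_id = msh[9]
    ack_msh = "|".join([
        "MSH", msh[1],
        msh[4], msh[5],   # our application/facility become the sender
        msh[2], msh[3],   # the original sender becomes the receiver
        timestamp, "", "ACK", "ACK" + control_id, msh[10], msh[11],
    ])
    return ack_msh + "\r" + f"MSA|{code}|{control_id}" + "\r"
```

A malformed message would be answered with `build_ack(..., "AE", ...)`, while the "patient not found" case still returns `"AA"` and routes the message to the DLQ.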


File Watchers

Technology: Chokidar (Node.js) or inotify (Linux)

Watched Directories:

/mnt/hospital-feeds/
  ├─ pacs/           # Imaging JSON files
  ├─ genomics/       # Genomics reports
  └─ pathology/      # Pathology PDFs

Debounce: 5 seconds (wait for file write completion)

Process:

  1. Detect new file
  2. Validate JSON schema (if JSON)
  3. Create ingestion job
  4. Enqueue
  5. Move processed file to /processed/<date>/
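The debounce and archive steps can be sketched with the standard library alone. This is not the Chokidar or inotify integration: it assumes a file counts as fully written once its modification time is at least 5 seconds old, that `<date>` is an ISO date, and the function names are hypothetical.

```python
import shutil
from datetime import date
from pathlib import Path

DEBOUNCE_SECONDS = 5


def is_settled(path: Path, now: float, debounce: float = DEBOUNCE_SECONDS) -> bool:
    """True once the file has not been modified for at least `debounce` seconds."""
    return now - path.stat().st_mtime >= debounce


def archive(path: Path, processed_root: Path, day: date) -> Path:
    """Move an ingested file to <processed_root>/<YYYY-MM-DD>/ and return the new path."""
    dest_dir = processed_root / day.isoformat()
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / path.name
    shutil.move(str(path), str(dest))
    return dest
```

A polling loop would call `is_settled(path, time.time())` for each candidate file and archive it only after the ingestion job has been enqueued.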


Data Flow Examples

Example 1: Lab Result Ingestion (HL7)

1. LIS sends HL7 ORU message via MLLP
   ↓
2. HL7 Listener receives, validates, sends ACK
   ↓
3. Parser extracts patient ID, test results
   ↓
4. Normalize to ABHA ID (query mapping table)
   ↓
5. Create job, enqueue to "lab-ingestion-queue"
   ↓
6. Worker dequeues, loads patient bundle
   ↓
7. Append lab results to bundle.labs[]
   ↓
8. Validate bundle (JSON schema)
   ↓
9. Atomically write updated bundle
   ↓
10. Regenerate processed_patients.json cache
   ↓
11. Generate FHIR Observation resources
   ↓
12. Job marked "completed"

Latency: 200-500ms end-to-end
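Steps 3 and 4 of this flow can be sketched as a small parser. It reads the patient identifier from PID-3 and one result per OBX segment; the output field names and the shape of the mapping table are assumptions, and a production parser would also handle escape sequences, repetitions, and short or missing segments.

```python
def parse_oru(message: str) -> dict:
    """Extract the local patient ID and OBX results from an ORU^R01 message."""
    local_id = None
    results = []
    for segment in filter(None, message.split("\r")):
        fields = segment.split("|")
        if fields[0] == "PID":
            local_id = fields[3].split("^")[0]   # PID-3, first component
        elif fields[0] == "OBX":
            identifier = fields[3].split("^")    # OBX-3: code^text^coding system
            results.append({
                "code": identifier[0],
                "name": identifier[1] if len(identifier) > 1 else "",
                "value": fields[5],              # OBX-5
                "unit": fields[6],               # OBX-6
                "referenceRange": fields[7],     # OBX-7
                "flag": fields[8],               # OBX-8
            })
    return {"localId": local_id, "results": results}


def to_abha_id(mapping: dict, local_id: str) -> str:
    """Normalize a hospital-local ID to an ABHA ID (raises KeyError if unmapped)."""
    return mapping[local_id]
```

An unmapped ID surfaces as a `KeyError`, which corresponds to the "patient not found" path that sends the message to the DLQ.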


Example 2: Document OCR (Hindi)

1. Physician uploads PDF via UI
   ↓
2. API validates file (size, type)
   ↓
3. API stores PDF in S3 with hash
   ↓
4. API creates OCR job, enqueues
   ↓
5. API returns 202 Accepted with jobId
   ↓
[Async Processing]
6. OCR worker dequeues job
   ↓
7. Worker downloads PDF from S3
   ↓
8. Worker detects language (Hindi)
   ↓
9. Worker runs Tesseract with "hin" pack
   ↓
10. Worker extracts text in Devanagari
   ↓
11. Worker calculates confidence (0.89)
   ↓
12. Worker updates bundle.documents[] with:
    - extractedText
    - language: "hi-IN"
    - confidence: 0.89
   ↓
13. Worker validates bundle
   ↓
14. Worker regenerates cache
   ↓
15. Job marked "completed"

Latency: 30-60 seconds for typical 2-3 page document


Scalability & Performance

Current Limits (Single Instance)

Resource          Capacity
Patients          10,000
Concurrent Users  100 clinicians
API Throughput    1,000 req/min
HL7 Messages      10,000/day
OCR Jobs          500/day

Scaling Strategies

Horizontal Scaling:

  • Deploy multiple API servers behind a load balancer
  • Add more queue workers (OCR, ASR)
  • Shard file storage by hospital/patient ID range

Vertical Scaling:

  • Increase server RAM for in-memory cache
  • Add GPUs for faster ASR processing

Caching:

  • Redis for processed patient cache
  • CDN for static assets (frontend)
  • HTTP caching headers for patient API


Security Architecture

Authentication

  • JWT tokens (RS256)
  • Issued by hospital SSO/LDAP
  • Expiration: 8 hours
  • Refresh token flow

Authorization (RBAC)

  • Roles: Oncologist, Nurse, Data Manager, Admin, Read-Only
  • Permissions mapped per API endpoint
  • Row-level security: Physicians see only their department's patients (configurable)
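A minimal sketch of the per-endpoint permission check with the department filter. The permission map, the choice of which roles are department-scoped, and the shape of the `user` dict are all assumptions for illustration; the document only states that permissions are mapped per endpoint and that row-level security is configurable.

```python
ALL_ROLES = {"Oncologist", "Nurse", "Data Manager", "Admin", "Read-Only"}

# Hypothetical permission map: (method, route) -> roles allowed to call it.
PERMISSIONS = {
    ("GET", "/api/patients/:abhaId"): ALL_ROLES,
    ("POST", "/api/upload/document"): {"Oncologist", "Nurse", "Data Manager", "Admin"},
    ("POST", "/api/ingest/hl7"): {"Data Manager", "Admin"},
}

# Roles subject to the department filter (assumed; configurable per deployment).
DEPARTMENT_SCOPED_ROLES = {"Oncologist"}


def is_allowed(user: dict, method: str, route: str, patient_department=None) -> bool:
    """Check endpoint permission first, then the row-level department rule."""
    allowed_roles = PERMISSIONS.get((method, route), set())  # unknown routes are denied
    if user["role"] not in allowed_roles:
        return False
    if patient_department is not None and user["role"] in DEPARTMENT_SCOPED_ROLES:
        return user["department"] == patient_department
    return True
```

Denying routes that are missing from the map keeps a newly added endpoint closed until someone assigns it roles explicitly.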

Encryption

  • At Rest: AES-256 for bundles, S3 server-side encryption
  • In Transit: TLS 1.3 for all connections
  • Backups: Encrypted with hospital-provided keys

Audit Logging

  • Every patient data access logged
  • Log fields: userId, patientId, action, timestamp, IP
  • Immutable logs (append-only)
  • Retention: 7 years (compliance requirement)
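An audit record with the listed fields could be written as one JSON object per line. This is only a sketch with a hypothetical function name: opening a file in append mode does not by itself make a log immutable, so the append-only guarantee would have to be enforced by the storage layer (for example, write-once storage or restricted file permissions).

```python
import json
from pathlib import Path


def append_audit_record(log_path: Path, user_id: str, patient_id: str,
                        action: str, ip: str, timestamp: str) -> dict:
    """Append one access record as a JSON line and return it."""
    record = {
        "userId": user_id,
        "patientId": patient_id,
        "action": action,
        "timestamp": timestamp,
        "ip": ip,
    }
    with open(log_path, "a", encoding="utf-8") as log:
        log.write(json.dumps(record) + "\n")
    return record
```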

Deployment Architecture

Option 1: On-Premises (Hospital Data Center)

┌─────────────────────────────────────────┐
│       Hospital Network (10.x.x.x)        │
│                                          │
│  ┌──────────────────────────────────┐  │
│  │  Entheory.AI VM/Container         │  │
│  │  • API Server                      │  │
│  │  • Workers (OCR/ASR)               │  │
│  │  • RabbitMQ                        │  │
│  │  • File Storage (NFS/local disk)   │  │
│  └──────────────────────────────────┘  │
│               ↕                        │
│  ┌──────────────────────────────────┐  │
│  │  Hospital Systems                  │  │
│  │  • EMR (HL7 sender)                │  │
│  │  • LIS (Labs)                      │  │
│  │  • PACS (file drop)                │  │
│  └──────────────────────────────────┘  │
└─────────────────────────────────────────┘

Pros: Data never leaves hospital network, meets security policies
Cons: Hospital IT must maintain VM/containers


Option 2: Cloud (AWS/Azure) with VPN

┌──────────────────┐          ┌──────────────────────┐
│ Hospital Network │          │   Cloud VPC          │
│                  │          │                      │
│  EMR, LIS, PACS  │◄─────────┤  Entheory.AI App     │
│                  │ VPN/VPC  │  • API Servers       │
│                  │ Peering  │  • Workers (GPU)     │
│                  │          │  • S3, RDS           │
└──────────────────┘          └──────────────────────┘

Pros: Managed services, GPU for ASR, easier scaling
Cons: Requires VPN setup, data governance approval


Monitoring & Observability

Metrics (Prometheus + Grafana)

System Metrics:

  • CPU, Memory, Disk usage per service
  • API latency (p50, p95, p99)
  • Queue depth and processing lag
  • Error rates per endpoint

Business Metrics:

  • Patients ingested per day
  • Data completeness per modality
  • OCR/ASR accuracy trends
  • Clinician active users

Logging (ELK/Loki)

Structured JSON logs:

{
  "timestamp": "2024-12-03T10:15:30Z",
  "level": "INFO",
  "service": "ocr-worker",
  "jobId": "ocr_job_789",
  "event": "ocr_completed",
  "language": "hi-IN",
  "confidence": 0.89,
  "duration_ms": 34500
}
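Log lines of this shape can be produced with Python's standard `logging` module and a custom formatter. This is a sketch under stated assumptions: `JsonFormatter` is a hypothetical class, and passing the event-specific fields through `extra={"fields": ...}` is one possible convention, not the project's documented one.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def __init__(self, service: str):
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            "timestamp": created.strftime("%Y-%m-%dT%H:%M:%SZ"),
            "level": record.levelname,
            "service": self.service,
            "event": record.getMessage(),
        }
        # Event-specific fields are passed via extra={"fields": {...}}.
        entry.update(getattr(record, "fields", {}))
        return json.dumps(entry, ensure_ascii=False)
```

A worker would attach the formatter to a handler and then log with `logger.info("ocr_completed", extra={"fields": {"jobId": "ocr_job_789", "confidence": 0.89}})`.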

Alerting (PagerDuty/Slack)

Critical Alerts:

  • API downtime >1 minute
  • DLQ depth >10 messages
  • Disk usage >90%
  • FHIR validation failure rate >10%

Warning Alerts:

  • OCR confidence <0.70 (manual review needed)
  • Queue lag >15 minutes
  • Cache hit rate <80%


Disaster Recovery

Backup Strategy

  • Bundles: Daily automated backup to S3/Azure Blob (encrypted)
  • Object Storage: Native S3 versioning enabled
  • Audit Logs: Replicated to separate region

Recovery Procedures

  • Data Loss: Restore from last night's backup (RPO: 24 hours)
  • Server Failure: Redeploy from Docker image, mount backup storage (RTO: 2 hours)

Testing

  • Quarterly disaster recovery drills
  • Automated restore tests monthly

Document Owner: Tech Lead / Architect
Last Updated: 2024-12-03
Related: Data Model | APIs & Interoperability