
High-Level Architecture

System Overview

Entheory.AI uses a modular, event-driven architecture designed for:

  • Scalability: Handle 10,000+ oncology patients per instance
  • Interoperability: Ingest from multiple heterogeneous hospital systems
  • Resilience: Zero data loss, graceful degradation
  • Extensibility: Easy to add new data sources and processing pipelines

Architecture Diagram

┌────────────────────────────────────────────────────────────────────┐
│                         PHYSICIAN LAYER                             │
│                                                                      │
│   ┌─────────────────────────────────────────────────────────┐     │
│   │  React Web Application (TypeScript)                      │     │
│   │  • Patient List  • Timeline  • Labs  • Imaging View     │     │
│   │  • Responsive Design  • Real-time Updates               │     │
│   └─────────────────────────────────────────────────────────┘     │
└────────────────────────────────────────────────────────────────────┘
                              │  HTTPS/REST
                              ↓
┌────────────────────────────────────────────────────────────────────┐
│                         API GATEWAY LAYER                           │
│                                                                      │
│   ┌──────────────┐      ┌──────────────┐      ┌──────────────┐   │
│   │ REST APIs    │      │ Job Status   │      │ FHIR Export  │   │
│   │ /api/patients│      │ /api/jobs    │      │ ?format=fhir │   │
│   │ /api/upload  │      │              │      │              │   │
│   └──────────────┘      └──────────────┘      └──────────────┘   │
│                 Authentication & RBAC (JWT)                         │
└────────────────────────────────────────────────────────────────────┘
                              │
                              ↓
┌────────────────────────────────────────────────────────────────────┐
│                      APPLICATION LAYER                              │
│                                                                      │
│   ┌─────────────────────┐         ┌─────────────────────┐         │
│   │  Query Service      │         │  Command Service     │         │
│   │  • Read patient data│         │ • Ingest data        │         │
│   │  • Cache layer      │         │  • Update bundles    │         │
│   │  • FHIR generation  │         │  • Job management    │         │
│   └─────────────────────┘         └─────────────────────┘         │
└────────────────────────────────────────────────────────────────────┘
                              │
        ┌─────────────────────┼─────────────────────┐
        ↓                     ↓                      ↓
┌──────────────────┐   ┌──────────────────┐   ┌──────────────────┐
│ CANONICAL BUNDLES│   │  MESSAGE QUEUES  │   │  OBJECT STORAGE  │
│  (JSON Files)    │   │  (Kafka/NATS/    │   │  (S3/MinIO)      │
│                  │   │   RabbitMQ)      │   │                  │
│ Per-patient JSON │   │ • Ingestion queue│   │ • PDFs           │
│ bundle.json      │   │ • OCR queue      │   │ • Audio files    │
│                  │   │ • ASR queue      │   │ • DICOM images   │
│ processed_       │   │ • DLQ (errors)   │   │                  │
│ patients.json    │   │                  │   │                  │
└──────────────────┘   └──────────────────┘   └──────────────────┘
        ↑                     ↑                      ↑
        └─────────────────────┴──────────────────────┘
                              │
┌────────────────────────────────────────────────────────────────────┐
│                     INGESTION & PROCESSING LAYER                    │
│                                                                      │
│  ┌─────────────┐  ┌────────────┐  ┌────────────┐  ┌─────────────┐   │
│  │ HL7 Listener│  │ File       │  │ OCR        │  │ ASR Worker  │   │
│  │ (MLLP)      │  │ Watchers   │  │ Worker     │  │ (Whisper)   │   │
│  │ • ADT       │  │ • PACS     │  │ (Tesseract)│  │ • Audio     │   │
│  │ • ORU (Labs)│  │ • Genomics │  │ • Eng+Hi   │  │ • Eng+Hindi │   │
│  └─────────────┘  └────────────┘  └────────────┘  └─────────────┘   │
└────────────────────────────────────────────────────────────────────┘
        ↑                ↑                ↑                ↑
        │                │                │                │
┌───────┴────────────────┴────────────────┴────────────────┴────────┐
│                     EXTERNAL HOSPITAL SYSTEMS                       │
│                                                                      │
│  ┌──────────┐  ┌─────────┐  ┌──────────┐  ┌──────────────┐       │
│  │ EMR/HIMS │  │   LIS   │  │   PACS   │  │ Genomics Lab │       │
│  │ (HL7 ADT)│  │ (HL7 ORU)│  │ (JSON)   │  │  (JSON)      │       │
│  └──────────┘  └─────────┘  └──────────┘  └──────────────┘       │
└────────────────────────────────────────────────────────────────────┘

Interactive Architecture Diagram

flowchart TB
    subgraph External["External Hospital Systems"]
        EMR["EMR/HIMS<br/>HL7 ADT"]
        LIS["LIS<br/>HL7 ORU"]
        PACS["PACS<br/>DICOM"]
        Genomics["Genomics Lab<br/>JSON"]
    end

    subgraph Ingestion["Ingestion & Processing Layer"]
        HL7["HL7 Listener<br/>MLLP:2575"]
        FileWatch["File Watchers"]
        OCR["OCR Worker<br/>Tesseract"]
        ASR["ASR Worker<br/>Whisper"]
    end

    subgraph Storage["Data Storage"]
        Bundles[("Canonical Bundles<br/>JSON per patient")]
        Queue[("Message Queues<br/>Kafka/NATS/RabbitMQ")]
        S3[("Object Storage<br/>S3/MinIO")]
    end

    subgraph App["Application Layer"]
        Query["Query Service<br/>Read Path"]
        Command["Command Service<br/>Write Path"]
    end

    subgraph API["API Gateway Layer"]
        REST["REST APIs<br/>/api/patients"]
        FHIR["FHIR Export"]
        Jobs["Job Status"]
    end

    subgraph UI["Physician Layer"]
        React["React Web App<br/>Patient Timeline | Labs | Imaging"]
    end

    EMR --> HL7
    LIS --> HL7
    PACS --> FileWatch
    Genomics --> FileWatch

    HL7 --> Queue
    FileWatch --> Queue

    Queue --> OCR
    Queue --> ASR

    OCR --> Bundles
    ASR --> Bundles
    S3 --> OCR
    S3 --> ASR

    Command --> Bundles
    Command --> Queue
    Query --> Bundles

    REST --> Query
    REST --> Command
    FHIR --> Query
    Jobs --> Command

    React --> REST
    React --> FHIR

    style External fill:#f9f,stroke:#333
    style Ingestion fill:#bbf,stroke:#333
    style Storage fill:#fbb,stroke:#333
    style App fill:#bfb,stroke:#333
    style API fill:#fbf,stroke:#333
    style UI fill:#ff9,stroke:#333

Component Details

1. Frontend (React Web App)

Tech Stack:

  • React 18 + TypeScript
  • React Router for navigation
  • Recharts for visualizations
  • Axios for API calls

Key Components:

  • PatientList.tsx - Search and select patients
  • PatientOverview.tsx - Summary view
  • Timeline.tsx - Longitudinal event timeline
  • LabsView.tsx - Lab results with trends
  • ImagingView.tsx - Imaging studies
  • NotesView.tsx - Multilingual clinical notes

State Management:

  • React Context for global state (current patient)
  • Local component state for UI interactions

Performance:

  • Code splitting per route
  • Lazy loading for large data tables
  • Caching API responses (stale-while-revalidate)

2. API Gateway

Framework: Express.js / FastAPI

Responsibilities:

  • Route HTTP requests to appropriate services
  • JWT authentication and RBAC enforcement
  • Rate limiting and request validation
  • CORS handling

Key Endpoints:

GET  /api/patients                    # List patients
GET  /api/patients/:abhaId             # Get patient by ID
GET  /api/patients/:abhaId?format=fhir # Get FHIR bundle
POST /api/upload/document              # Upload for OCR
POST /api/upload/audio                 # Upload for ASR
POST /api/ingest/fhir                  # Ingest FHIR bundle
POST /api/ingest/hl7                   # Ingest HL7 message
GET  /api/jobs/:jobId                  # Job status
GET  /api/datasources                  # Data source health
GET  /health                           # System health check

3. Application Services

Query Service (Read Path)

Purpose: Fast reads for UI

Components:

  • Cache Layer: Redis or in-memory cache for processed_patients.json
  • Bundle Loader: Reads canonical patient bundles
  • FHIR Transformer: Converts bundles to FHIR R4 on demand

Optimization:

  • Cache hit rate >90% for common queries
  • Precompute aggregations (test count, event count)

Command Service (Write Path)

Purpose: Data ingestion and bundle updates

Components:

  • HL7 Parser: Parse ADT, ORU messages
  • FHIR Parser: Validate and extract FHIR resources
  • JSON Ingester: Process file-based feeds
  • Bundle Updater: Atomically update patient bundles
  • Job Manager: Track async processing jobs

Guarantees:

  • Atomicity: Bundle updates are atomic (write temp, then rename)
  • Durability: All writes persisted to disk before ACK
  • Idempotency: Duplicate messages don't create duplicate data
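
The atomicity, durability, and idempotency guarantees above can be sketched as follows. This is a minimal sketch under stated assumptions, not the actual Bundle Updater: the function names and the `messageId` de-duplication key are hypothetical and do not come from the bundle schema.

```python
import json
import os
import tempfile


def write_bundle_atomically(path: str, bundle: dict) -> None:
    """Write to a temp file in the target directory, fsync, then rename over the target."""
    directory = os.path.dirname(path) or "."
    os.makedirs(directory, exist_ok=True)
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp:
            json.dump(bundle, tmp, ensure_ascii=False)
            tmp.flush()
            os.fsync(tmp.fileno())  # durability: bytes reach disk before the rename
        os.replace(tmp_path, path)  # readers see either the old or the new file
    except BaseException:
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


def append_lab_once(bundle: dict, lab: dict) -> bool:
    """Append a lab result unless one with the same (hypothetical) messageId exists."""
    labs = bundle.setdefault("labs", [])
    if any(existing.get("messageId") == lab["messageId"] for existing in labs):
        return False  # duplicate delivery: leave the bundle unchanged
    labs.append(lab)
    return True
```

The temp file is created in the same directory as the target so that the rename stays on one filesystem, which is what makes the replace step atomic.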

4. Data Storage

Canonical Patient Bundles

Format: One JSON file per patient

Path: src/data/patients/<abhaId>/bundle.json

Schema:

{
  "patientId": "case_001",
  "abhaId": "ABHA-12345678901",
  "demographics": { ... },
  "cancer": { "site": "Breast", "stage": "IIB", ... },
  "labs": [ ... ],
  "imaging": [ ... ],
  "pathology": [ ... ],
  "genomics": [ ... ],
  "therapy": [ ... ],
  "medications": [ ... ],
  "documents": [ ... ],  // OCR outputs
  "transcripts": [ ... ], // ASR outputs
  "provenance": { "lastUpdated": "...", "sources": [...] }
}

Why JSON files:

  • Easy to inspect and debug
  • Version control friendly
  • No database schema migrations
  • Simple backup (file copy)

Future: May migrate to PostgreSQL + JSONB for better query performance at scale.


Processed Cache

Format: Single aggregated JSON

Path: src/data/processed_patients.json

Purpose:

  • Fast patient list queries
  • Precomputed summaries (test count, latest vitals)
  • Reduce Bundle I/O for list views

Regeneration:

  • Triggered after every bundle update
  • Incremental update (only changed patients)
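
An incremental regeneration of this kind could look like the sketch below. It assumes the cache is a mapping keyed by `abhaId` and that a summary holds the cancer site and a test count; both the cache layout and the summary fields are illustrative assumptions, not the actual format of processed_patients.json.

```python
def summarize(bundle: dict) -> dict:
    """Build the list-view summary for one patient (fields are illustrative)."""
    return {
        "abhaId": bundle["abhaId"],
        "site": bundle.get("cancer", {}).get("site"),
        "testCount": len(bundle.get("labs", [])),
    }


def update_cache(cache: dict, changed_bundles: list) -> dict:
    """Return a new cache in which only the changed patients are recomputed."""
    updated = dict(cache)  # shallow copy: untouched summaries are reused as-is
    for bundle in changed_bundles:
        updated[bundle["abhaId"]] = summarize(bundle)
    return updated
```

Because only the changed bundles are read, the cost of a regeneration depends on the size of the update rather than on the total number of patients.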

Object Storage (S3/MinIO)

Stores:

  • Uploaded PDFs (OCR input)
  • Audio files (ASR input)
  • DICOM images (if downloaded from PACS)
  • Original HL7 messages (for audit/debugging)

Organization:

s3://entheory-hospital1/
  ├─ documents/
  │    └─ <patientId>/
  │         └─ doc_<timestamp>.pdf
  ├─ audio/
  │    └─ <patientId>/
  │         └─ audio_<timestamp>.mp3
  ├─ hl7/
  │    └─ <date>/
  │         └─ oru_<messageId>.txt
  └─ dicom/
       └─ <studyId>/
            └─ <seriesId>/
                 └─ <instanceId>.dcm


5. Message Queues

Technology: Kafka, NATS, or RabbitMQ (configurable per deployment)

Note: The messaging layer is abstracted to support multiple backends:

  • Kafka: Recommended for high-volume deployments (>10K messages/day)
  • NATS: Lightweight option for smaller installations
  • RabbitMQ: Traditional choice with rich routing features

Queues:

Queue Name                Purpose                  Consumer
lab-ingestion-queue       HL7 ORU lab messages     Lab ingestion worker
imaging-ingestion-queue   PACS JSON feeds          Imaging worker
ocr-processing-queue      Documents to OCR         OCR worker (Tesseract)
asr-processing-queue      Audio to transcribe      ASR worker (Whisper)
hl7-dlq                   Failed HL7 messages      Manual review
ocr-dlq                   Failed OCR jobs          Manual review
asr-dlq                   Failed ASR jobs          Manual review

Guarantees:

  • At-least-once delivery
  • Visibility timeout: 5 minutes (if a worker crashes mid-job, the message is requeued)
  • Dead letter queue for repeated failures (after 3 retries)
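
The retry and dead-letter behaviour above can be illustrated with an in-memory queue. This is a sketch only: the message fields (`id`, `body`, `failures`, `lastError`) and the function name are assumptions, and a real deployment would rely on the retry and DLQ features of the chosen broker instead.

```python
from collections import deque

MAX_RETRIES = 3  # "after 3 retries" from the guarantees above


def drain(queue: deque, dlq: list, handler) -> list:
    """Process every message; requeue failures and dead-letter after MAX_RETRIES retries."""
    completed = []
    while queue:
        message = queue.popleft()
        try:
            handler(message["body"])
        except Exception as exc:
            message["failures"] = message.get("failures", 0) + 1
            message["lastError"] = str(exc)
            if message["failures"] > MAX_RETRIES:
                dlq.append(message)    # initial attempt plus 3 retries all failed
            else:
                queue.append(message)  # at-least-once: try the message again later
        else:
            completed.append(message["id"])
    return completed
```

Since a message may be delivered more than once, the handler has to be idempotent for this scheme to be safe.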

6. Processing Workers

OCR Worker

Technology: Tesseract 5.x

Process:

  1. Dequeue job from ocr-processing-queue
  2. Download PDF from S3
  3. Detect language (langdetect)
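  4. Invoke Tesseract with the detected language (eng or hin). Stock Tesseract does not read PDF input, so each page must first be rendered to an image: tesseract page.png output -l <eng|hin> --oem 3 --psm 1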
  5. Extract text and confidence
  6. Update patient bundle with extracted text
  7. Validate bundle, regenerate cache
  8. Mark job completed or failed

Parallelism: 10 workers, each processing 1 document at a time
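
The language-selection and invocation steps can be sketched as a small command builder. The mapping from langdetect's ISO 639-1 codes to Tesseract language packs and the `eng+hin` fallback are assumptions made for illustration; the flags are the ones listed in the process above. Stock Tesseract reads image formats, so the input path is assumed to be a page image already rendered from the PDF.

```python
# langdetect reports ISO 639-1 codes; Tesseract expects its own pack names.
TESSERACT_LANGS = {"en": "eng", "hi": "hin"}


def build_tesseract_command(image_path: str, output_base: str, detected_lang: str) -> list:
    """Return the argv list for one page; unknown languages fall back to both packs."""
    lang = TESSERACT_LANGS.get(detected_lang, "eng+hin")
    return [
        "tesseract", image_path, output_base,
        "-l", lang,
        "--oem", "3",  # default OCR engine mode
        "--psm", "1",  # automatic page segmentation with orientation detection
    ]
```

The returned list can be passed to `subprocess.run(..., check=True)`; building it as a list avoids shell quoting problems with file names.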


ASR Worker

Technology: OpenAI Whisper (large-v3)

Process:

  1. Dequeue job from asr-processing-queue
  2. Download audio from S3
  3. Invoke Whisper with the audio language (en or hi): whisper audio.mp3 --model large-v3 --language <en|hi>
  4. Extract transcript with timestamps
  5. Update patient bundle
  6. Validate and cache
  7. Mark job completed/failed

Parallelism: GPU-accelerated, 2-3 concurrent jobs (memory bound)


7. External Integrations

HL7 v2 Listener

Technology: MLLP (Minimal Lower Layer Protocol) TCP listener

Port: 2575 (configurable)

Message Types:

  • ADT^A01 - Admission
  • ADT^A03 - Discharge
  • ORU^R01 - Lab results

Flow:

Hospital LIS → TCP/MLLP → HL7 Listener → Parse → Enqueue → ACK

Error Handling:

  • Malformed messages: Return NACK, log full message
  • Patient not found: Return ACK, move to DLQ for manual resolution
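
For readers unfamiliar with MLLP, the framing and acknowledgement steps can be sketched as below. MLLP wraps each HL7 message between a start byte (0x0B) and an end sequence (0x1C 0x0D). The function names are hypothetical, the ACK control ID scheme is a simplification, and the use of AE for the NACK case is an assumption (AR is the other negative code).

```python
START_BLOCK = b"\x0b"    # <VT>
END_BLOCK = b"\x1c\x0d"  # <FS><CR>


def wrap_mllp(message: str) -> bytes:
    return START_BLOCK + message.encode("utf-8") + END_BLOCK


def unwrap_mllp(frame: bytes) -> str:
    if not (frame.startswith(START_BLOCK) and frame.endswith(END_BLOCK)):
        raise ValueError("not a complete MLLP frame")
    return frame[len(START_BLOCK):-len(END_BLOCK)].decode("utf-8")


def build_ack(message: str, accepted: bool, timestamp: str) -> str:
    """Build a minimal acknowledgement that echoes the original control ID in MSA-2."""
    msh = message.split("\r")[0].split("|")
    if msh[0] != "MSH" or len(msh) < 12:
        raise ValueError("malformed MSH segment")
    control_id = msh[9]  # MSH-10 (index 1 holds the encoding characters, MSH-2)
    ack_msh = "|".join([
        "MSH", msh[1],
        msh[4], msh[5],      # original receiver becomes the sender
        msh[2], msh[3],      # original sender becomes the receiver
        timestamp, "", "ACK",
        "ACK" + control_id,  # simplification: derive the ACK's own control ID
        msh[10], msh[11],    # processing ID and version copied from the original
    ])
    msa = "|".join(["MSA", "AA" if accepted else "AE", control_id])
    return ack_msh + "\r" + msa + "\r"
```

A listener would call `unwrap_mllp` on each received frame, parse and enqueue the message, and send back `wrap_mllp(build_ack(...))` on the same connection.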

File Watchers

Technology: Chokidar (Node.js) or inotify (Linux)

Watched Directories:

/mnt/hospital-feeds/
  ├─ pacs/           # Imaging JSON files
  ├─ genomics/       # Genomics reports
  └─ pathology/      # Pathology PDFs

Debounce: 5 seconds (wait for file write completion)

Process:

  1. Detect new file
  2. Validate JSON schema (if JSON)
  3. Create ingestion job
  4. Enqueue
  5. Move processed file to /processed/<date>/
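
The debounce rule can be expressed as a simple quiet-period check. The sketch below is a polling alternative to the Chokidar or inotify watchers named above, assuming a file counts as complete once its modification time is at least 5 seconds old; the function names are hypothetical.

```python
import os
import time

DEBOUNCE_SECONDS = 5.0  # from the debounce setting above


def is_quiet(last_modified: float, now: float, debounce: float = DEBOUNCE_SECONDS) -> bool:
    """True when the file has not been modified for at least `debounce` seconds."""
    return (now - last_modified) >= debounce


def ready_files(directory: str, now=None) -> list:
    """List files in `directory` whose writes appear to have finished."""
    now = time.time() if now is None else now
    ready = []
    for entry in sorted(os.listdir(directory)):
        path = os.path.join(directory, entry)
        if os.path.isfile(path) and is_quiet(os.path.getmtime(path), now):
            ready.append(path)
    return ready
```

A scheduler could call `ready_files("/mnt/hospital-feeds/pacs")` on an interval and hand each returned path to the validation and enqueue steps.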

Data Flow Examples

Example 1: Lab Result Ingestion (HL7)

1. LIS sends HL7 ORU message via MLLP
   ↓

2. HL7 Listener receives, validates, sends ACK
   ↓

3. Parser extracts patient ID, test results
   ↓

4. Normalize to ABHA ID (query mapping table)
   ↓

5. Create job, enqueue to "lab-ingestion-queue"
   ↓

6. Worker dequeues, loads patient bundle
   ↓

7. Append lab results to bundle.labs[]
   ↓

8. Validate bundle (JSON schema)
   ↓

9. Atomically write updated bundle
   ↓

10. Regenerate processed_patients.json cache
   ↓

11. Generate FHIR Observation resources
   ↓

12. Job marked "completed"

Latency: 200-500ms end-to-end
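
Step 11 could be sketched as a mapping from one lab entry to a FHIR R4 Observation. The lab field names used here (`loincCode`, `testName`, `value`, `unit`, `collectedAt`) are hypothetical, since the bundle schema above does not spell out the shape of `labs[]`, and referencing the patient by ABHA identifier is likewise an assumption.

```python
def lab_to_fhir_observation(abha_id: str, lab: dict) -> dict:
    """Map one lab entry (hypothetical field names) to a minimal FHIR R4 Observation."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [{
                "system": "http://loinc.org",
                "code": lab["loincCode"],
                "display": lab["testName"],
            }]
        },
        # Logical reference by identifier instead of a server-assigned Patient id
        "subject": {"identifier": {"value": abha_id}},
        "effectiveDateTime": lab["collectedAt"],
        "valueQuantity": {"value": lab["value"], "unit": lab["unit"]},
    }
```

For example, a hemoglobin result of 11.2 g/dL would produce an Observation whose `code.coding[0].code` is the LOINC code supplied in the lab entry and whose `valueQuantity` carries the value and unit unchanged.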


Example 2: Document OCR (Hindi)

1. Physician uploads PDF via UI
   ↓

2. API validates file (size, type)
   ↓

3. API stores PDF in S3 with hash
   ↓

4. API creates OCR job, enqueues
   ↓

5. API returns 202 Accepted with jobId
   ↓
[Async Processing]

6. OCR worker dequeues job
   ↓

7. Worker downloads PDF from S3
   ↓

8. Worker detects language (Hindi)
   ↓

9. Worker runs Tesseract with "hin" pack
   ↓

10. Worker extracts text in Devanagari
   ↓

11. Worker calculates confidence (0.89)
   ↓

12. Worker updates bundle.documents[] with:
    - extractedText
    - language: "hi-IN"
    - confidence: 0.89
   ↓

13. Worker validates bundle
   ↓

14. Worker regenerates cache
   ↓

15. Job marked "completed"

Latency: 30-60 seconds for typical 2-3 page document
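
Steps 2 to 5 of the upload flow can be sketched as a single function. The size limit, the job record fields, and the function name are assumptions for illustration; the object key follows the documents/<patientId>/doc_<timestamp>.pdf layout described under Object Storage, and the job ID prefix mirrors the `ocr_job_789` style used in the logging example.

```python
import hashlib
import uuid

MAX_UPLOAD_BYTES = 20 * 1024 * 1024  # hypothetical limit, not from the source


def accept_document_upload(patient_id: str, data: bytes, timestamp: str) -> dict:
    """Validate an uploaded PDF and describe the OCR job to enqueue."""
    if len(data) > MAX_UPLOAD_BYTES:
        raise ValueError("file too large")
    if not data.startswith(b"%PDF-"):  # PDF files begin with this signature
        raise ValueError("not a PDF")
    return {
        "status": 202,  # accepted for asynchronous processing
        "jobId": "ocr_job_" + uuid.uuid4().hex,
        "objectKey": f"documents/{patient_id}/doc_{timestamp}.pdf",
        "sha256": hashlib.sha256(data).hexdigest(),
        "state": "queued",
    }
```

The client would then poll GET /api/jobs/:jobId with the returned `jobId` until the job reports completed or failed.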


Scalability & Performance

Current Limits (Single Instance)

Resource            Capacity
Patients            10,000
Concurrent Users    100 clinicians
API Throughput      1000 req/min
HL7 Messages        10,000/day
OCR Jobs            500/day

Scaling Strategies

Horizontal Scaling:

  • Deploy multiple API servers behind load balancer
  • Add more queue workers (OCR, ASR)
  • Shard file storage by hospital/patient ID range

Vertical Scaling:

  • Increase server RAM for in-memory cache
  • Add GPUs for faster ASR processing

Caching:

  • Redis for processed patient cache
  • CDN for static assets (frontend)
  • HTTP caching headers for patient API

Security Architecture

Authentication

  • JWT tokens (RS256)
  • Issued by hospital SSO/LDAP
  • Expiration: 8 hours
  • Refresh token flow

Authorization (RBAC)

  • Roles: Oncologist, Nurse, Data Manager, Admin, Read-Only
  • Permissions mapped per API endpoint
  • Row-level security: Physicians see only their department's patients (configurable)
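
A per-endpoint permission check of the kind described above might look like this. The role names come from the list above; the permission strings, the role-to-permission mapping, and the function names are assumptions chosen only to make the sketch concrete.

```python
# Hypothetical permission sets per role (role names are from the list above).
ROLE_PERMISSIONS = {
    "Oncologist": {"patients:read", "documents:upload"},
    "Nurse": {"patients:read", "documents:upload"},
    "Data Manager": {"patients:read", "ingest:write"},
    "Admin": {"patients:read", "documents:upload", "ingest:write", "datasources:read"},
    "Read-Only": {"patients:read"},
}

# Hypothetical permission required by each (method, route) pair.
ENDPOINT_PERMISSIONS = {
    ("GET", "/api/patients"): "patients:read",
    ("POST", "/api/upload/document"): "documents:upload",
    ("POST", "/api/ingest/hl7"): "ingest:write",
    ("GET", "/api/datasources"): "datasources:read",
}


def is_allowed(role: str, method: str, route: str) -> bool:
    """Deny by default: unknown roles and unmapped endpoints are rejected."""
    required = ENDPOINT_PERMISSIONS.get((method, route))
    if required is None:
        return False
    return required in ROLE_PERMISSIONS.get(role, set())


def can_view_patient(user_department: str, patient_department: str,
                     enforce_department: bool = True) -> bool:
    """Row-level rule: same department only, unless the rule is switched off."""
    return (not enforce_department) or user_department == patient_department
```

The `enforce_department` flag stands in for the "configurable" part of the row-level rule; how that setting is actually stored is not specified here.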

Encryption

  • At Rest: AES-256 for bundles, S3 server-side encryption
  • In Transit: TLS 1.3 for al

l connections

  • Backups: Encrypted with hospital-provided keys

Audit Logging

  • Every patient data access logged
  • Log fields: userId, patientId, action, timestamp, IP
  • Immutable logs (append-only)
  • Retention: 7 years (compliance requirement)

Deployment Architecture

Option 1: On-Premises (Hospital Data Center)

┌─────────────────────────────────────────┐
│       Hospital Network (10.x.x.x)        │
│                                          │
│  ┌──────────────────────────────────┐  │
│  │  Entheory.AI VM/Container         │  │
│  │  • API Server                      │  │
│  │  • Workers (OCR/ASR)               │  │
│  │  • Kafka/NATS/RabbitMQ             │  │
│  │  • File Storage (NFS/local disk)   │  │
│  └──────────────────────────────────┘  │
│               ↕                        │
│  ┌──────────────────────────────────┐  │
│  │  Hospital Systems                  │  │
│  │  • EMR (HL7 sender)                │  │
│  │  • LIS (Labs)                      │  │
│  │  • PACS (file drop)                │  │
│  └──────────────────────────────────┘  │
└─────────────────────────────────────────┘

Pros: Data never leaves hospital network, meets security policies
Cons: Hospital IT must maintain VM/containers


Option 2: Cloud (AWS/Azure) with VPN

┌──────────────────┐          ┌──────────────────────┐
│ Hospital Network │          │   Cloud VPC          │
│                  │          │                      │
│  EMR, LIS, PACS  │◄─────────┤  Entheory.AI App     │
│                  │ VPN/VPC  │  • API Servers       │
│                  │ Peering  │  • Workers (GPU)     │
│                  │          │  • S3, RDS           │
└──────────────────┘          └──────────────────────┘

Pros: Managed services, GPU for ASR, easier scaling
Cons: Requires VPN setup, data governance approval


Monitoring & Observability

Metrics (Prometheus + Grafana)

System Metrics:

  • CPU, Memory, Disk usage per service
  • API latency (p50, p95, p99)
  • Queue depth and processing lag
  • Error rates per endpoint

Business Metrics:

  • Patients ingested per day
  • Data completeness per modality
  • OCR/ASR accuracy trends
  • Clinician active users

Logging (ELK/Loki)

Structured JSON logs:

{
  "timestamp": "2024-12-03T10:15:30Z",
  "level": "INFO",
  "service": "ocr-worker",
  "jobId": "ocr_job_789",
  "event": "ocr_completed",
  "language": "hi-IN",
  "confidence": 0.89,
  "duration_ms": 34500
}

Alerting (PagerDuty/Slack)

Critical Alerts:

  • API downtime >1 minute
  • DLQ depth >10 messages
  • Disk usage >90%
  • FHIR validation failure rate >10%

Warning Alerts:

  • OCR confidence <0.70 (manual review needed)
  • Queue lag >15 minutes
  • Cache hit rate <80%
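
The thresholds above can be written down as data and evaluated in one place. The metric names below are hypothetical, and in practice these rules would more likely live in Prometheus alerting configuration; the sketch only shows the threshold logic.

```python
# Thresholds from the lists above; metric names are hypothetical.
CRITICAL_RULES = {
    "api_downtime_seconds": lambda v: v > 60,
    "dlq_depth": lambda v: v > 10,
    "disk_usage_pct": lambda v: v > 90,
    "fhir_validation_failure_rate": lambda v: v > 0.10,
}
WARNING_RULES = {
    "ocr_confidence": lambda v: v < 0.70,
    "queue_lag_minutes": lambda v: v > 15,
    "cache_hit_rate": lambda v: v < 0.80,
}


def evaluate_alerts(metrics: dict) -> list:
    """Return (severity, metric) pairs for every breached threshold."""
    alerts = []
    for severity, rules in (("critical", CRITICAL_RULES), ("warning", WARNING_RULES)):
        for name, breached in rules.items():
            if name in metrics and breached(metrics[name]):
                alerts.append((severity, name))
    return alerts
```

Metrics that are missing from the input are skipped instead of being treated as breaches, so a partial scrape does not page anyone by itself.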

Disaster Recovery

Backup Strategy

  • Bundles: Daily automated backup to S3/Azure Blob (encrypted)
  • Object Storage: Native S3 versioning enabled
  • Audit Logs: Replicated to separate region

Recovery Procedures

  • Data Loss: Restore from last night's backup (RPO: 24 hours)
  • Server Failure: Redeploy from Docker image, mount backup storage (RTO: 2 hours)

Testing

  • Quarterly disaster recovery drills
  • Automated restore tests monthly

Document Owner: Tech Lead / Architect
Last Updated: 2024-12-09
Related: Data Model | APIs & Interoperability | Security & Privacy | DevOps & SRE