
Cloud Infrastructure Use Cases (CLD)

See Also: DevOps & SRE for CI/CD pipelines and monitoring | Security & Privacy for compliance controls

This section covers cloud infrastructure provisioning, hardening, and operations for AWS, GCP, and Azure deployments.


Infrastructure Provisioning (CLD-001 to CLD-010)

UC-INF-001: Provision VPC with Subnet Segmentation

Purpose: Create isolated network infrastructure with proper subnet segmentation for healthcare workloads.

Property Value
Actor DevOps Engineer
Trigger New hospital onboarding / Environment setup
Priority P0

Main Success Scenario:

1. Define VPC CIDR blocks (e.g., 10.0.0.0/16)
2. Create public subnet (NAT Gateway, bastion)
3. Create private subnets (app servers, workers)
4. Create data subnet (RDS, ElastiCache)
5. Configure route tables and NAT gateways
6. Apply network ACLs for inter-subnet traffic
7. Enable VPC Flow Logs to CloudWatch/Cloud Logging (formerly Stackdriver)

Acceptance Criteria:

  1. [ ] No direct internet access from private/data subnets
  2. [ ] Flow logs capture all traffic for security audit
  3. [ ] Subnets tagged with environment and purpose
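
The CIDR planning in steps 1 to 4 can be sketched with Python's standard `ipaddress` module. This is a minimal illustration, not a prescribed layout: the /20 subnet size, the tier names, and the order in which blocks are assigned are all assumptions.

```python
import ipaddress

# Example VPC range from step 1, carved into /20 blocks (assumed size).
vpc = ipaddress.ip_network("10.0.0.0/16")
blocks = list(vpc.subnets(new_prefix=20))  # 16 non-overlapping /20 blocks

# Tier names follow the scenario; the assignment order is hypothetical.
plan = {
    "public": blocks[0],
    "private-a": blocks[1],
    "private-b": blocks[2],
    "data": blocks[3],
}

def overlaps(nets):
    """Return True if any two networks in the iterable overlap."""
    nets = list(nets)
    return any(a.overlaps(b) for i, a in enumerate(nets) for b in nets[i + 1:])

for name, net in plan.items():
    print(f"{name:10s} {net}  ({net.num_addresses} addresses)")
```

Running an overlap check like this before provisioning catches CIDR collisions early, which matters because subnet ranges cannot be resized in place once workloads are attached.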

UC-INF-002: Provision Managed Kubernetes Cluster

Purpose: Deploy managed Kubernetes cluster with security hardening.

Property Value
Actor DevOps Engineer
Trigger Platform deployment
Priority P0

Main Success Scenario:

1. Create cluster with private endpoint only
2. Configure node pools (app, GPU workers, system)
3. Enable control plane logging
4. Install CNI with NetworkPolicy support (Calico/Cilium)
5. Deploy cert-manager for TLS
6. Configure cluster autoscaler
7. Apply Pod Security Standards (Restricted)

Acceptance Criteria:

  1. [ ] Cluster endpoint not publicly accessible
  2. [ ] Node pools use encrypted EBS/Persistent Disks
  3. [ ] NetworkPolicies enforced between namespaces
  4. [ ] Pod Security Standards prevent privileged containers

UC-INF-003: Provision Managed PostgreSQL Database

Purpose: Deploy managed PostgreSQL with encryption, backups, and high availability.

Property Value
Actor DevOps Engineer
Trigger Environment provisioning
Priority P0

Main Success Scenario:

1. Create RDS/Cloud SQL instance in private subnet
2. Enable encryption at rest (KMS/Cloud KMS)
3. Configure automated backups (7-day retention)
4. Enable Point-in-Time Recovery
5. Enable a Multi-AZ/regional standby for HA and add a read replica for read scaling (Production only)
6. Configure parameter groups (enforce SSL, e.g., rds.force_ssl=1 on RDS; enable audit logging)
7. Create IAM authentication roles

Acceptance Criteria:

  1. [ ] Database not publicly accessible
  2. [ ] All connections require SSL
  3. [ ] Automated backups verified restorable
  4. [ ] Audit logging enabled for security events

UC-INF-004: Provision Object Storage Buckets

Purpose: Create S3/GCS buckets for document, audio, and DICOM storage with proper security.

Property Value
Actor DevOps Engineer
Trigger Environment provisioning
Priority P0

Main Success Scenario:

1. Create bucket with unique name per environment
2. Enable default encryption (SSE-S3 or SSE-KMS)
3. Block all public access
4. Enable versioning
5. Configure lifecycle rules (archive to Glacier/Coldline after 1 year)
6. Enable access logging to audit bucket
7. Apply bucket policy restricting to VPC endpoint only

Acceptance Criteria:

  1. [ ] Public access blocked at bucket level
  2. [ ] All objects encrypted at rest
  3. [ ] Cross-region replication enabled for DR
  4. [ ] Lifecycle transitions verified working
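
Step 7 (restricting the bucket to a VPC endpoint) is commonly expressed as an explicit deny keyed on the `aws:SourceVpce` condition. The sketch below builds such a policy document as a Python dict. The bucket name and endpoint ID are placeholders, and the helper name `vpce_only_policy` is hypothetical.

```python
import json

def vpce_only_policy(bucket: str, vpce_id: str) -> dict:
    """Deny every S3 action on the bucket unless it arrives via the given VPC endpoint."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "DenyOutsideVpcEndpoint",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [f"arn:aws:s3:::{bucket}", f"arn:aws:s3:::{bucket}/*"],
            # Requests that did not traverse this endpoint fail the match and are denied.
            "Condition": {"StringNotEquals": {"aws:SourceVpce": vpce_id}},
        }],
    }

# Placeholder values, not real resources.
policy = vpce_only_policy("example-dicom-prod", "vpce-0123456789abcdef0")
print(json.dumps(policy, indent=2))
```

A deny this broad also blocks console and pipeline access that does not pass through the endpoint, so it is worth testing on a non-production bucket first.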

UC-INF-005: Provision Message Broker Cluster

Purpose: Deploy managed Kafka, NATS, or RabbitMQ cluster for event streaming.

Property Value
Actor DevOps Engineer
Trigger Environment provisioning
Priority P1

Main Success Scenario:

1. Create cluster in private subnets
2. Enable encryption in transit (TLS)
3. Configure authentication (SASL/IAM)
4. Set retention policies per topic
5. Configure replication factor (3 for production)
6. Enable monitoring and alerting
7. Create dead letter topics for failed messages

Acceptance Criteria:

  1. [ ] No plaintext connections allowed
  2. [ ] Authentication required for all producers/consumers
  3. [ ] Metrics exported to Prometheus/CloudWatch

UC-INF-006: Provision Managed Redis Cache

Purpose: Deploy managed Redis for session cache and API caching.

Property Value
Actor DevOps Engineer
Trigger Environment provisioning
Priority P1

Main Success Scenario:

1. Create cluster in private subnet
2. Enable encryption at rest and in transit
3. Configure AUTH password
4. Set maxmemory-policy to allkeys-lru
5. Configure backup retention
6. Enable slow log for performance debugging

Acceptance Criteria:

  1. [ ] Redis not accessible from public internet
  2. [ ] All connections encrypted (TLS)
  3. [ ] Authentication required

UC-INF-007: Configure Load Balancer with WAF

Purpose: Deploy Application Load Balancer with Web Application Firewall protection.

Property Value
Actor DevOps Engineer
Trigger Environment provisioning
Priority P0

Main Success Scenario:

1. Create ALB/Cloud Load Balancer
2. Configure HTTPS listener with TLS 1.3
3. Attach WAF with managed rule groups:
   - AWS Managed Common Rules
   - SQL Injection protection
   - Known Bad Inputs
4. Enable access logging
5. Configure health checks
6. Set up rate limiting (1000 req/IP/min)

Acceptance Criteria:

  1. [ ] HTTP redirected to HTTPS
  2. [ ] WAF blocks OWASP Top 10 attacks
  3. [ ] Rate limiting throttles clients exceeding 1000 req/IP/min
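
The per-IP limit in step 6 would normally be configured as a managed WAF rate rule rather than written by hand. The sketch below only illustrates the counting logic behind such a rule, using a sliding window; the class name and the small demo limit are assumptions.

```python
from collections import defaultdict, deque

class SlidingWindowLimiter:
    """Allow at most `limit` requests per client IP within any `window_seconds` span."""

    def __init__(self, limit=1000, window_seconds=60.0):
        self.limit = limit
        self.window = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of accepted requests

    def allow(self, ip, now):
        hits = self._hits[ip]
        # Forget requests that have aged out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True

# Demo with a limit of 3 so the behavior is easy to follow.
limiter = SlidingWindowLimiter(limit=3, window_seconds=60.0)
results = [limiter.allow("203.0.113.7", t) for t in (0.0, 1.0, 2.0, 3.0, 61.0)]
print(results)
```

The fourth request is rejected because three are already inside the window; by t=61 the two oldest have expired, so the client is allowed again.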

UC-INF-008: Provision Secrets Manager/Vault

Purpose: Deploy centralized secrets management for credentials and API keys.

Property Value
Actor DevOps Engineer
Trigger Environment provisioning
Priority P0

Main Success Scenario:

1. Create Secrets Manager/Vault instance
2. Configure auto-rotation for database credentials
3. Create secrets for:
   - Database passwords
   - API keys (external services)
   - JWT signing keys
   - Encryption keys
4. Configure IAM policies for secret access
5. Enable audit logging for all access

Acceptance Criteria:

  1. [ ] No secrets in environment variables or code
  2. [ ] Auto-rotation working for DB credentials
  3. [ ] Audit trail for all secret access

UC-INF-009: Configure Monitoring and Alerting Stack

Purpose: Deploy comprehensive monitoring with Prometheus, Grafana, and alerting.

Property Value
Actor DevOps Engineer
Trigger Environment provisioning
Priority P1

Main Success Scenario:

1. Deploy Prometheus server with persistent storage
2. Configure service discovery for Kubernetes
3. Deploy Grafana with SSO integration
4. Import dashboards (node, pod, application)
5. Configure AlertManager with PagerDuty/Slack
6. Set up alerting rules per SLO
7. Enable log aggregation (Loki/CloudWatch)

Acceptance Criteria:

  1. [ ] Metrics retained for 30 days
  2. [ ] Critical alerts reach on-call within 5 minutes
  3. [ ] Dashboards cover all SLO metrics

UC-INF-010: Provision GPU Nodes for ML Workloads

Purpose: Configure GPU-enabled nodes for OCR, ASR, and NLP processing.

Property Value
Actor DevOps Engineer
Trigger ML workload deployment
Priority P1

Main Success Scenario:

1. Create GPU node pool (T4/A10G/V100)
2. Install NVIDIA device plugin
3. Configure resource limits per GPU
4. Set up node taints for GPU workloads
5. Configure spot/preemptible instances for cost optimization
6. Set up autoscaling based on queue depth

Acceptance Criteria:

  1. [ ] GPU resources visible to Kubernetes scheduler
  2. [ ] Autoscaling triggers when ASR queue > 50 jobs
  3. [ ] Cost optimization with spot instances (non-critical workloads)
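
Queue-depth autoscaling (step 6) can be reduced to a target-jobs-per-worker calculation. This is a sketch under assumed values: 50 jobs per worker matches the acceptance threshold above, while the worker bounds and the function name are hypothetical.

```python
import math

def desired_gpu_workers(queue_depth, jobs_per_worker=50, min_workers=1, max_workers=8):
    """Return the worker count needed so each worker has at most `jobs_per_worker` queued jobs."""
    wanted = math.ceil(queue_depth / jobs_per_worker)
    return max(min_workers, min(max_workers, wanted))

for depth in (0, 50, 51, 10_000):
    print(depth, "->", desired_gpu_workers(depth))
```

With these defaults a queue of exactly 50 jobs stays on one worker and the 51st job triggers a scale-out, while the upper bound caps spend when the queue spikes.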

Security Hardening (CLD-020 to CLD-030)

UC-INF-020: Enable Cloud Audit Logging

Purpose: Configure comprehensive audit logging for compliance and security monitoring.

Property Value
Actor Security Engineer
Trigger Environment setup
Priority P0

Main Success Scenario:

1. Enable CloudTrail/Cloud Audit Logs for all regions
2. Configure organization-level trail
3. Send logs to centralized S3 bucket/GCS
4. Enable log file integrity validation
5. Configure log retention (7 years for DPDP compliance)
6. Set up alerts for high-risk events

Acceptance Criteria:

  1. [ ] All API calls logged
  2. [ ] Logs tamper-evident (hash validation)
  3. [ ] Retention meets compliance requirements
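
The tamper-evidence idea behind log file integrity validation (step 4) can be shown with a simple hash chain. This is a conceptual sketch only, not the digest format that CloudTrail or Cloud Audit Logs actually use; the function names and sample events are made up.

```python
import hashlib
import json

def chain_digests(records, seed="genesis"):
    """Hash each record together with the previous digest, so any edit changes every later digest."""
    digests = []
    prev = hashlib.sha256(seed.encode()).hexdigest()
    for record in records:
        payload = json.dumps(record, sort_keys=True).encode()
        prev = hashlib.sha256(prev.encode() + payload).hexdigest()
        digests.append(prev)
    return digests

def verify(records, digests, seed="genesis"):
    """Recompute the chain and compare it with the stored digests."""
    return chain_digests(records, seed) == digests

events = [
    {"actor": "alice", "action": "CreateBucket"},
    {"actor": "bob", "action": "DeleteTrail"},
]
digests = chain_digests(events)

# Changing one field in the first event invalidates the stored chain.
tampered = [dict(events[0], actor="mallory"), events[1]]
print(verify(events, digests), verify(tampered, digests))
```

Storing the digests separately from the logs, ideally in a different account, is what makes the check meaningful, since an attacker who can rewrite both defeats it.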

UC-INF-021: Configure IAM with Least Privilege

Purpose: Implement IAM policies following least privilege principle.

Property Value
Actor Security Engineer
Trigger User/service onboarding
Priority P0

Main Success Scenario:

1. Create IAM groups per role (Admin, Developer, ReadOnly)
2. Define service-specific policies (S3, RDS, EKS)
3. Use IAM Roles for Services (no long-lived credentials)
4. Enable MFA for all human users
5. Configure password policy (14 chars, complexity)
6. Set up access analyzer for unused permissions
7. Review and remediate quarterly

Acceptance Criteria:

  1. [ ] No root account usage (except break-glass)
  2. [ ] All service accounts use IRSA/Workload Identity
  3. [ ] IAM Access Analyzer shows no public resources
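
A quarterly review (step 7) can be partly automated by flagging overly broad Allow statements. The heuristic below is a sketch, not a substitute for IAM Access Analyzer: it only catches literal wildcards, and the function names and sample policy are hypothetical.

```python
def is_broad(field, value):
    """Flag a bare '*' anywhere, and service-wide actions such as 's3:*'."""
    if value == "*":
        return True
    return field == "Action" and value.endswith(":*")

def find_wildcards(policy):
    """Return (statement_index, field, value) for each broad grant in Allow statements."""
    findings = []
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement may be given as an object
        statements = [statements]
    for index, stmt in enumerate(statements):
        if stmt.get("Effect") != "Allow":
            continue
        for field in ("Action", "Resource"):
            values = stmt.get(field, [])
            if isinstance(values, str):
                values = [values]
            findings.extend((index, field, v) for v in values if is_broad(field, v))
    return findings

sample = {
    "Version": "2012-10-17",
    "Statement": [
        {"Effect": "Allow", "Action": ["s3:GetObject"], "Resource": "arn:aws:s3:::example-bucket/*"},
        {"Effect": "Allow", "Action": "s3:*", "Resource": "*"},
    ],
}
print(find_wildcards(sample))
```

Only the second statement is reported; the first is scoped to one action on one bucket's objects.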

UC-INF-022: Enable Security Hub/Security Command Center

Purpose: Centralized security posture monitoring and compliance checks.

Property Value
Actor Security Engineer
Trigger Environment provisioning
Priority P1

Main Success Scenario:

1. Enable Security Hub/SCC
2. Enable AWS Foundational Security Best Practices
3. Enable CIS Benchmarks
4. Configure automatic remediation for critical findings
5. Integrate with Slack/PagerDuty for alerts
6. Schedule weekly compliance reports

Acceptance Criteria:

  1. [ ] Critical findings addressed within 24 hours
  2. [ ] Compliance score > 90%
  3. [ ] Weekly reports sent to security team

UC-INF-023: Configure VPC Endpoints for Private Access

Purpose: Eliminate internet exposure for AWS service access.

Property Value
Actor DevOps Engineer
Trigger Environment provisioning
Priority P0

Main Success Scenario:

1. Create VPC endpoints for:
   - S3 (Gateway endpoint)
   - RDS/Secrets Manager (Interface endpoints)
   - ECR (Interface endpoint)
   - SQS (Interface endpoint)
2. Configure endpoint policies to restrict access
3. Update route tables and security groups
4. Verify no traffic goes over public internet

Acceptance Criteria:

  1. [ ] All AWS API calls use VPC endpoints
  2. [ ] No NAT Gateway charges for AWS services
  3. [ ] Endpoint policies restrict to necessary actions

UC-INF-024: Enable GuardDuty/Security Scanner

Purpose: Threat detection for malicious activity and compromised instances.

Property Value
Actor Security Engineer
Trigger Environment setup
Priority P0

Main Success Scenario:

1. Enable GuardDuty/SCC threat detection
2. Configure S3 and Kubernetes protection
3. Set up automated response (Lambda/Cloud Functions):
   - Block compromised IPs
   - Isolate compromised instances
4. Integrate with SIEM
5. Configure severity-based alerting

Acceptance Criteria:

  1. [ ] High severity findings trigger PagerDuty
  2. [ ] Automated containment for critical threats
  3. [ ] Monthly threat report reviewed

UC-INF-025: Configure Encryption Key Management

Purpose: Centralized KMS configuration for all encryption operations.

Property Value
Actor Security Engineer
Trigger Environment provisioning
Priority P0

Main Success Scenario:

1. Create customer-managed KMS keys:
   - rds-encryption-key
   - s3-encryption-key
   - backup-encryption-key
2. Configure key rotation (annual)
3. Set key policies with least privilege
4. Enable key usage logging
5. Document key backup/recovery procedure

Acceptance Criteria:

  1. [ ] All sensitive data uses customer-managed keys
  2. [ ] Key rotation enabled and tested
  3. [ ] Key deletion requires multi-party approval

UC-INF-026: Implement Network Firewall Rules

Purpose: Define and enforce network-level access controls.

Property Value
Actor Security Engineer
Trigger Environment provisioning
Priority P0

Main Success Scenario:

1. Configure Security Groups:
   - App: Allow 443 from ALB only
   - Workers: Allow internal cluster only
   - Database: Allow 5432 from app subnet only
2. Configure Network ACLs as a second layer of defense
3. Enable AWS Network Firewall/Cloud Armor
4. Block known-malicious IPs
5. Enable DDoS protection (Shield/Cloud Armor)

Acceptance Criteria:

  1. [ ] Database not accessible from internet
  2. [ ] Every inbound rule names an explicit source (no 0.0.0.0/0)
  3. [ ] DDoS mitigation enabled
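
The "no 0.0.0.0/0" criterion can be checked mechanically against an exported rule list. The sketch below assumes a simplified rule shape (group, port, source); real security group exports have more fields, and the `sg-alb` reference and CIDR values are placeholders.

```python
import ipaddress

# Hypothetical inbound rules mirroring the security groups in step 1.
rules = [
    {"group": "app", "port": 443, "source": "sg-alb"},
    {"group": "database", "port": 5432, "source": "10.0.16.0/20"},
    {"group": "database", "port": 5432, "source": "0.0.0.0/0"},
]

def open_to_world(rule):
    """Return True when the rule's source is a CIDR covering every address (IPv4 or IPv6)."""
    try:
        net = ipaddress.ip_network(rule["source"])
    except ValueError:
        return False  # not a CIDR, e.g. a reference to another security group
    return net.prefixlen == 0

violations = [r for r in rules if open_to_world(r)]
for rule in violations:
    print(f"open to the internet: {rule['group']} port {rule['port']}")
```

A check like this fits naturally into a CI step so that a world-open rule fails review before it reaches production.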

UC-INF-027: Configure Container Image Security

Purpose: Ensure only trusted, scanned images run in production.

Property Value
Actor DevOps Engineer
Trigger CI/CD pipeline
Priority P0

Main Success Scenario:

1. Configure ECR/GCR with image scanning
2. Enable scan-on-push
3. Block images with critical CVEs from deployment
4. Sign images with Cosign/Notation
5. Configure admission controller to verify signatures
6. Use distroless/minimal base images

Acceptance Criteria:

  1. [ ] No critical CVE images in production
  2. [ ] All images signed and verified
  3. [ ] Base images updated monthly

UC-INF-028: Enable Data Loss Prevention (DLP)

Purpose: Prevent unauthorized data exfiltration of PHI.

Property Value
Actor Security Engineer
Trigger Environment setup
Priority P1

Main Success Scenario:

1. Configure Macie/DLP API for S3 scanning
2. Define PHI data identifiers (ABHA, phone, addresses)
3. Set up alerts for PHI in unexpected locations
4. Block PHI in logs via log scrubbing
5. Configure egress filtering for data exfiltration

Acceptance Criteria:

  1. [ ] PHI scans run weekly
  2. [ ] Alerts for PHI in non-PHI buckets
  3. [ ] No PHI in application logs
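
Log scrubbing (step 4) is often implemented as a logging filter that masks identifiers before a record is emitted. The sketch below makes two assumptions that should be checked against real data: ABHA numbers written as NN-NNNN-NNNN-NNNN, and 10-digit mobile numbers with an optional 91 country prefix. Addresses are not covered, since they cannot be matched reliably with a regex.

```python
import logging
import re

# Assumed identifier formats; the ABHA pattern runs first so its leading digits
# are not mistaken for a phone country code.
PATTERNS = [
    (re.compile(r"\b\d{2}-\d{4}-\d{4}-\d{4}\b"), "[ABHA]"),
    (re.compile(r"(?<!\d)(?:\+?91[\s-]?)?[6-9]\d{9}(?!\d)"), "[PHONE]"),
]

def scrub(text):
    """Replace every assumed PHI pattern in the text with a fixed token."""
    for pattern, token in PATTERNS:
        text = pattern.sub(token, text)
    return text

class ScrubFilter(logging.Filter):
    """Logging filter that rewrites the formatted message with PHI masked."""

    def filter(self, record):
        record.msg = scrub(record.getMessage())
        record.args = None  # message is already formatted
        return True

logger = logging.getLogger("app")
logger.addFilter(ScrubFilter())
print(scrub("patient 91-1234-5678-9012 called from +91 9876543210"))
```

Pattern-based scrubbing is a safety net rather than a guarantee, so it works best alongside the periodic PHI scans described in this use case.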

UC-INF-029: Configure Backup and Disaster Recovery

Purpose: Ensure data recoverability with tested backup procedures.

Property Value
Actor DevOps Engineer
Trigger Environment provisioning
Priority P0

Main Success Scenario:

1. Configure automated backups:
   - RDS: Daily snapshots, 30-day retention
   - S3: Cross-region replication
   - EBS: Daily snapshots
2. Enable AWS Backup/Cloud Backup
3. Configure backup vault with immutable retention
4. Document and test restore procedures
5. Schedule quarterly DR drills

Acceptance Criteria:

  1. [ ] RPO < 24 hours for all data
  2. [ ] Restore tested successfully quarterly
  3. [ ] Backups encrypted and immutable

UC-INF-030: Configure Cost Monitoring and Alerts

Purpose: Track cloud spending and prevent cost overruns.

Property Value
Actor FinOps Engineer
Trigger Environment provisioning
Priority P2

Main Success Scenario:

1. Enable Cost Explorer/Billing Reports
2. Set up budget alerts (80%, 100%, 120%)
3. Tag all resources by environment, service, owner
4. Configure rightsizing recommendations
5. Enable spot instance usage reports
6. Schedule monthly cost reviews

Acceptance Criteria:

  1. [ ] All resources tagged consistently
  2. [ ] Alerts trigger before budget exceeded
  3. [ ] Monthly cost trend visible in dashboard
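
The budget alert tiers in step 2 amount to a simple threshold comparison. The sketch below uses integer percentages to avoid floating-point edge cases; the function name and sample figures are illustrative only.

```python
def crossed_thresholds(spend, budget, thresholds=(80, 100, 120)):
    """Return the percentage thresholds that current spend has reached or passed."""
    return [t for t in thresholds if spend * 100 >= t * budget]

for spend in (7_900, 8_000, 12_500):
    print(spend, "->", crossed_thresholds(spend, budget=10_000))
```

The 80% tier is the one that satisfies the "before budget exceeded" criterion; the 100% and 120% tiers signal an overrun that has already happened.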

Operations & Compliance (CLD-040 to CLD-045)

UC-INF-040: Implement Infrastructure as Code (Terraform)

Purpose: Manage all infrastructure via versioned, auditable code.

Property Value
Actor DevOps Engineer
Trigger Environment provisioning
Priority P0

Main Success Scenario:

1. Define Terraform modules per component
2. Use remote state with locking (S3+DynamoDB)
3. Implement CI/CD for infra changes (plan → review → apply)
4. Enable drift detection
5. Document module usage in README
6. Use Terraform Cloud/Atlantis for collaboration

Acceptance Criteria:

  1. [ ] No manual console changes in production
  2. [ ] All changes peer-reviewed
  3. [ ] State file encrypted and backed up

UC-INF-041: Configure Compliance Scanning

Purpose: Continuous compliance monitoring against CIS, HIPAA, DPDP.

Property Value
Actor Security Engineer
Trigger Continuous
Priority P1

Main Success Scenario:

1. Enable Prowler/ScoutSuite scans
2. Configure custom rules for DPDP compliance
3. Schedule daily scans
4. Integrate findings with Jira/ServiceNow
5. Generate monthly compliance reports
6. Track remediation SLAs

Acceptance Criteria:

  1. [ ] Critical findings remediated within 48 hours
  2. [ ] Compliance score visible in dashboard
  3. [ ] Monthly reports for leadership

UC-INF-042: Configure Log Retention for Compliance

Purpose: Ensure logs retained per regulatory requirements.

Property Value
Actor Security Engineer
Trigger Environment setup
Priority P0

Main Success Scenario:

1. Configure CloudWatch/Cloud Logging (formerly Stackdriver) log groups
2. Set retention:
   - Audit logs: 7 years (DPDP)
   - Application logs: 90 days
   - Debug logs: 14 days
3. Archive to S3 Glacier/Coldline
4. Enable log integrity verification
5. Document log access procedures

Acceptance Criteria:

  1. [ ] Audit logs retained 7 years
  2. [ ] Logs exportable within 24 hours for regulators
  3. [ ] Archived logs retrievable within 48 hours
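
The retention tiers in step 2 can be captured as a small lookup that a cleanup job consults before deleting anything. In this sketch, 7 years is rounded up to 7 × 366 days so leap years never shorten the window; that rounding and the names used are assumptions.

```python
from datetime import date, timedelta

# Retention per log class, in days. Audit retention is rounded up (assumption).
RETENTION_DAYS = {"audit": 7 * 366, "application": 90, "debug": 14}

def is_expired(log_class, written_on, today):
    """Return True once a log is older than the retention period for its class."""
    return today - written_on > timedelta(days=RETENTION_DAYS[log_class])

print(is_expired("debug", date(2024, 1, 1), date(2024, 1, 16)))
print(is_expired("audit", date(2020, 1, 1), date(2026, 12, 31)))
```

An unknown log class raises `KeyError` rather than defaulting to deletion, which keeps a misconfigured job from removing logs it should retain.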

UC-INF-043: Configure Multi-Region DR

Purpose: Enable failover to secondary region for disaster recovery.

Property Value
Actor DevOps Engineer
Trigger DR planning
Priority P1

Main Success Scenario:

1. Replicate S3 buckets cross-region
2. Configure RDS cross-region read replica
3. Deploy standby EKS cluster in DR region
4. Configure Route 53/Cloud DNS for failover
5. Document and test failover procedures
6. Conduct annual DR drill

Acceptance Criteria:

  1. [ ] RTO < 4 hours for full region failover
  2. [ ] RPO < 1 hour for critical data
  3. [ ] DR drill completed successfully annually

UC-INF-044: Enable Service Mesh (Istio/Linkerd)

Purpose: Implement zero-trust networking between microservices.

Property Value
Actor DevOps Engineer
Trigger Security hardening
Priority P2

Main Success Scenario:

1. Deploy Istio/Linkerd control plane
2. Enable automatic mTLS between services
3. Configure authorization policies
4. Enable distributed tracing
5. Configure traffic management (retries, timeouts)
6. Enable observability (Kiali, Jaeger)

Acceptance Criteria:

  1. [ ] All inter-service traffic encrypted (mTLS)
  2. [ ] Authorization policies enforced
  3. [ ] Service topology visible in dashboard

UC-INF-045: Configure Auto-Patching for Instances

Purpose: Ensure all instances receive security patches automatically.

Property Value
Actor DevOps Engineer
Trigger Continuous
Priority P0

Main Success Scenario:

1. Configure SSM Patch Manager/OS Config
2. Define patch baselines (critical within 48h)
3. Schedule maintenance windows (weekly, non-business hours)
4. Enable pre-patch snapshots
5. Monitor patch compliance
6. Remediate non-compliant instances

Acceptance Criteria:

  1. [ ] 100% patch compliance for critical CVEs
  2. [ ] Patching doesn't cause service disruption
  3. [ ] Rollback available via snapshots

Summary

Category | Use Cases | Count
Infrastructure Provisioning | CLD-001 to CLD-010 | 10
Security Hardening | CLD-020 to CLD-030 | 11
Operations & Compliance | CLD-040 to CLD-045 | 6
Total | | 27
