Infrastructure Overview
The DailyDesk infrastructure is defined using AWS CDK (TypeScript) and follows a nested stack architecture. The CDK deployment is orchestrated by AWS CloudFormation.
Stack Hierarchy
The entry point for the infrastructure is the DailyDeskStack. This "parent" stack acts as an orchestrator, defining shared resources (like SSM parameters) and instantiating the environment-specific nested stacks.
1. Parent Stack (DailyDeskStack)
- Role: Orchestrator.
- Key Responsibilities:
  - Configuration: Initializes the SSMParameters construct to standardize parameter lookup.
  - Structure: Instantiates the nested CdkBackendStack and CdkFrontendStack.
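The parent/nested relationship described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual source: the nested stack bodies are empty stubs, the construct IDs "Backend" and "Frontend" are assumptions, and the SSMParameters construct is only mentioned in a comment because its API is not shown here. It assumes aws-cdk-lib v2 and constructs are installed.

```typescript
import { App, NestedStack, NestedStackProps, Stack, StackProps } from "aws-cdk-lib";
import { Construct } from "constructs";

// Hypothetical stub: the real stack would define ECS Fargate, ALB, Redis,
// EventBridge rules, and the ecs-pipeline.
class CdkBackendStack extends NestedStack {
  constructor(scope: Construct, id: string, props?: NestedStackProps) {
    super(scope, id, props);
  }
}

// Hypothetical stub: the real stack would define the frontend CI pipeline.
class CdkFrontendStack extends NestedStack {
  constructor(scope: Construct, id: string, props?: NestedStackProps) {
    super(scope, id, props);
  }
}

// Parent stack: owns shared configuration and instantiates the nested stacks.
class DailyDeskStack extends Stack {
  constructor(scope: Construct, id: string, props?: StackProps) {
    super(scope, id, props);
    // The SSMParameters construct would be initialized here (API not shown in this document).
    new CdkBackendStack(this, "Backend"); // construct ID is an assumption
    new CdkFrontendStack(this, "Frontend"); // construct ID is an assumption
  }
}

const app = new App();
new DailyDeskStack(app, "DailyDeskStack", { env: { region: "us-east-1" } });
app.synth();
```

Because a NestedStack is created with the parent stack as its scope, CloudFormation deploys it as a resource of the parent, so a single deployment of DailyDeskStack covers both children. The stubs contain no resources, so this sketch is only meaningful for synthesis, not for an actual deployment.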
2. Backend Stack (CdkBackendStack)
- Role: Full-stack definition for the Backend API.
- Responsibilities:
  - Hosting: Provisions ECS Fargate (Compute), ALB (Networking), and Redis (Cache).
  - Scheduling: Provisions EventBridge Rules for cron jobs (SMS, Queue Reset).
  - Deployment: Defines the ecs-pipeline to build and deploy Docker images.
  - External Data: Configures connection to the external RDS instance.
3. Frontend Stack (CdkFrontendStack)
- Role: Deployment pipeline for the Frontend Web Apps.
- Responsibilities:
  - Deployment: Defines the daily-desk-frontend-ci-pipeline to build React apps and sync artifacts.
  - External Hosting: References (but does not provision) external S3 Buckets and CloudFront Distributions.
Infrastructure Architecture
graph TD
    subgraph AWS["AWS Cloud (us-east-1)"]
        subgraph Config["Configuration & Security"]
            SSM["SSM Parameter Store"]
            Secrets["Secrets Manager"]
            ACM["ACM Certificate (*.dailydesk.com)"]
        end
        subgraph Frontend["Frontend Layer"]
            CF["CloudFront Distributions"]
            S3["S3 Buckets (Web/Tablet/Booking)"]
            CF -->|Origin| S3
            CF -.->|HTTPS| ACM
        end
        subgraph Backend["Backend Layer"]
            ALB["Application Load Balancer"]
            ECS["Fargate Service (NestJS)"]
            Redis["ElastiCache Redis"]
            EB["EventBridge Scheduler"]
            ALB -->|Forward| ECS
            ECS -->|Cache/Queue| Redis
            EB -->|Trigger REST API| ECS
            ECS -.->|Fetch Config| SSM
            ECS -.->|Fetch Secrets| Secrets
        end
        subgraph Data["Data Persistence"]
            RDS[("RDS PostgreSQL")]
        end
    end
    subgraph External["External Integrations"]
        GSM["GSM Gateway (SMS)"]
    end
    subgraph CI_CD["CI/CD Pipelines"]
        BE_PL["Backend Pipeline"] -->|Deploy| ECS
        BE_PL -->|Run Migrations| RDS
        FE_PL["Frontend Pipeline"] -->|Sync Assets| S3
        FE_PL -->|Invalidate Cache| CF
    end
    ECS -->|Read/Write| RDS
    ECS -->|Send SMS| GSM
    Frontend -->|API Requests| ALB
Deployment & Execution
When you run cdk deploy, the following process occurs to provision resources on AWS:
- Synthesize (cdk synth):
  - The CDK app runs your TypeScript code and emits a CloudFormation Template (JSON/YAML).
  - It generates a template for the parent stack and a separate template for each nested stack.
- Upload Assets:
  - The templates and any local file assets (like Lambda code) are uploaded to the "CDK Toolkit" S3 bucket. Docker image assets are built from their build contexts and pushed to the bootstrap ECR repository instead.
- Create Change Set:
  - CloudFormation compares the uploaded template against the currently deployed stack.
  - It creates a Change Set listing what will be added, modified, or deleted.
- Execute:
  - CloudFormation applies the changes in the correct dependency order.
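To make the Change Set step concrete, here is a simplified model of the comparison. It is an illustration only: CloudFormation's real change set computation is far more detailed (for example, it distinguishes in-place modification from replacement). The `Template` shape, the `diffTemplates` function, and the sample resources are all hypothetical.

```typescript
// Simplified model of a template: logical resource ID -> resource definition.
type Resource = { Type: string; Properties?: Record<string, unknown> };
type Template = Record<string, Resource>;

type Change = { action: "Add" | "Modify" | "Remove"; logicalId: string };

// Compare the deployed template with the newly synthesized one.
// Note: JSON.stringify is key-order sensitive, which is acceptable for a sketch.
function diffTemplates(deployed: Template, next: Template): Change[] {
  const changes: Change[] = [];
  for (const logicalId of Object.keys(next)) {
    if (!(logicalId in deployed)) {
      changes.push({ action: "Add", logicalId });
    } else if (JSON.stringify(deployed[logicalId]) !== JSON.stringify(next[logicalId])) {
      changes.push({ action: "Modify", logicalId });
    }
  }
  for (const logicalId of Object.keys(deployed)) {
    if (!(logicalId in next)) {
      changes.push({ action: "Remove", logicalId });
    }
  }
  return changes;
}

// Made-up sample templates for illustration.
const deployedExample: Template = {
  Alb: { Type: "AWS::ElasticLoadBalancingV2::LoadBalancer" },
  Service: { Type: "AWS::ECS::Service", Properties: { DesiredCount: 1 } },
  OldRule: { Type: "AWS::Events::Rule" },
};
const nextExample: Template = {
  Alb: { Type: "AWS::ElasticLoadBalancingV2::LoadBalancer" },
  Service: { Type: "AWS::ECS::Service", Properties: { DesiredCount: 2 } },
  Cache: { Type: "AWS::ElastiCache::CacheCluster" },
};

// Yields: Modify Service, Add Cache, Remove OldRule.
const exampleChanges = diffTemplates(deployedExample, nextExample);
```

Unchanged resources such as `Alb` produce no entry, which is why a deployment that touches one property usually results in a short change set.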
Safety & Rollbacks
By default, CloudFormation treats a stack update as all-or-nothing: if any resource fails, it rolls the stack back to its last stable state.
Automatic Rollback
If any resource fails to provision (e.g., an IAM permission error, invalid configuration, or timeout) during a deployment:
- Stop: CloudFormation immediately stops further changes.
- Rollback: It enters the UPDATE_ROLLBACK_IN_PROGRESS state.
- Revert: It attempts to revert all resources in the stack to their previous working state.
- Result:
  - Success: The stack returns to UPDATE_ROLLBACK_COMPLETE. The environment is exactly as it was before the failed deployment.
  - Failure: If the rollback fails (rare, usually due to manual interference), the stack enters UPDATE_ROLLBACK_FAILED and requires manual intervention ("Continue Update Rollback").
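The states above map to different operator responses. The following sketch encodes that mapping as a small helper, for example for use in a deployment script. The status strings are real CloudFormation stack statuses, but the `nextStepFor` function and its return labels are hypothetical names chosen for this illustration.

```typescript
type NextStep = "wait" | "healthy" | "rolled-back" | "manual-intervention" | "unknown";

// Decide what an operator should do for a given stack status.
function nextStepFor(status: string): NextStep {
  // Rollback itself failed: someone has to run "Continue Update Rollback".
  if (status === "UPDATE_ROLLBACK_FAILED") return "manual-intervention";
  // Rollback finished: the stack is back on its previous configuration.
  if (status === "UPDATE_ROLLBACK_COMPLETE") return "rolled-back";
  // Any *_IN_PROGRESS status (including UPDATE_ROLLBACK_IN_PROGRESS) is transitional.
  if (/_IN_PROGRESS$/.test(status)) return "wait";
  if (status === "CREATE_COMPLETE" || status === "UPDATE_COMPLETE") return "healthy";
  // Statuses not covered by this sketch (e.g. deletion states).
  return "unknown";
}
```

For instance, `nextStepFor("UPDATE_ROLLBACK_IN_PROGRESS")` returns `"wait"`, while `nextStepFor("UPDATE_ROLLBACK_FAILED")` returns `"manual-intervention"`.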
Troubleshooting a Rollback
- Go to the CloudFormation Console.
- Select the failed stack.
- Click the Events tab.
- Look for the earliest event with status CREATE_FAILED or UPDATE_FAILED (the Events tab lists the newest events first, so the root cause is usually the lowest failed entry). The "Status reason" usually explains the exact error (e.g., "Resource handler returned message: 'User: ... is not authorized to perform: ...'").
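The same search can be expressed in code, which is handy when the event list is long. This is a minimal sketch over an in-memory list: the `StackEvent` shape, the `findRootCause` function, and the sample events (including their status reasons) are hypothetical and made up for illustration, not real API output.

```typescript
// Minimal shape of a stack event; fields mirror the columns in the Events tab.
interface StackEvent {
  timestamp: string; // ISO 8601
  logicalId: string;
  status: string;
  statusReason?: string;
}

// Return the chronologically earliest failed event, regardless of input order.
function findRootCause(events: StackEvent[]): StackEvent | undefined {
  let earliest: StackEvent | undefined;
  for (const event of events) {
    const failed = event.status === "CREATE_FAILED" || event.status === "UPDATE_FAILED";
    if (!failed) continue;
    if (earliest === undefined || Date.parse(event.timestamp) < Date.parse(earliest.timestamp)) {
      earliest = event;
    }
  }
  return earliest;
}

// Made-up sample data, newest first, as the console would list it.
const sampleEvents: StackEvent[] = [
  { timestamp: "2024-01-01T10:06:00Z", logicalId: "DailyDeskStack", status: "UPDATE_ROLLBACK_IN_PROGRESS" },
  { timestamp: "2024-01-01T10:05:00Z", logicalId: "Backend", status: "UPDATE_FAILED", statusReason: "sample: nested stack failed" },
  { timestamp: "2024-01-01T10:04:00Z", logicalId: "Service", status: "CREATE_FAILED", statusReason: "sample: not authorized" },
  { timestamp: "2024-01-01T10:00:00Z", logicalId: "DailyDeskStack", status: "UPDATE_IN_PROGRESS" },
];

// Picks the "Service" event: later failures are typically consequences of it.
const rootCause = findRootCause(sampleEvents);
```

With nested stacks, the earliest failure often sits in the nested stack's own event list rather than the parent's, so the same search may need to be repeated there.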
Hanging Deployments
Sometimes an infrastructure deployment appears to "hang" indefinitely (e.g., while waiting for a resource to stabilize, or because of a hidden dependency issue).
Solution:
1. Go to the CloudFormation Console.
2. Select the stack that is in UPDATE_IN_PROGRESS.
3. Click Stack Actions -> Cancel update.
4. Wait for the stack to roll back to its previous state (UPDATE_ROLLBACK_COMPLETE).
5. Retry the deployment after investigating the root cause.