Operations Guide

This guide explains every tool, service, and process involved in running Softly. It is written for someone with zero DevOps experience. If you can follow a recipe, you can follow this guide.

1. Local Development Tools

These are the tools installed on your Mac. Each one has a specific job.

Homebrew

What it is: A package manager for macOS. Think of it as an app store, but for developer tools. Instead of downloading installers from random websites, you type one command and Homebrew handles the rest.

Where it lives: /opt/homebrew/ (on Apple Silicon Macs)

Verify it works:

bash
brew --version

You should see something like Homebrew 4.x.x.


Ruby 3.4.2 (via rbenv)

What it is: The programming language the backend API is written in. Rails (the web framework) runs on Ruby.

Why rbenv: Different projects might need different Ruby versions. rbenv lets you install multiple versions and switch between them. Without it, you would have to uninstall and reinstall Ruby every time you switched projects.

Where it lives:

  • rbenv itself: ~/.rbenv/
  • Ruby versions: ~/.rbenv/versions/
  • The version file: backend/.ruby-version tells rbenv which version to use

Verify it works:

bash
ruby --version
# Should show: ruby 3.4.2

rbenv versions
# Should list 3.4.2 with an asterisk next to the active one
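
To confirm that rbenv picked up the pinned version, you can compare the version file with what ruby reports. This is a minimal sketch: it assumes backend/.ruby-version contains only the version number, and ruby_matches_pin is a hypothetical helper name, not part of rbenv.

```bash
# Compare the version pinned in a .ruby-version file with "ruby --version" output.
# ruby_matches_pin is an illustrative helper, not an rbenv command.
ruby_matches_pin() {
  local pin_file="$1" reported="$2" pinned
  # Read the pinned version and strip whitespace and newlines.
  pinned="$(tr -d '[:space:]' < "$pin_file")"
  case "$reported" in
    "ruby $pinned "*) echo "ok: ruby $pinned is active" ;;
    *) echo "mismatch: .ruby-version pins $pinned"; return 1 ;;
  esac
}

# Usage (run from the repository root):
#   ruby_matches_pin backend/.ruby-version "$(ruby --version)"
```

A mismatch usually means the shell was opened before rbenv was initialised, or the pinned version has not been installed yet.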

Node.js 20

What it is: A JavaScript runtime. The mobile app (Expo/React Native), the landing page (Astro), the docs site (VitePress), and the web app (Next.js) all need Node.js to build and run.

Where it lives: Installed via Homebrew, typically at /opt/homebrew/bin/node

Verify it works:

bash
node --version
# Should show: v20.x.x

npm --version
# Should show: 10.x.x

PostgreSQL

What it is: A relational database. In development, it runs on your Mac so you can test the backend locally without connecting to the production database on AWS.

Where it lives: Data stored at /opt/homebrew/var/postgresql@16/ (or similar)

Configuration: The backend uses two local databases defined in backend/config/database.yml:

  • backend_development for local development
  • backend_test for running automated tests

Verify it works:

bash
pg_isready
# Should show: accepting connections

psql -l
# Should list your local databases

Docker Desktop

What it is: Software that runs "containers" on your Mac. A container is a self-contained package that includes everything an application needs to run: the code, the language runtime, libraries, and system tools. When we deploy the backend to AWS, we package it as a Docker container so it runs identically everywhere.

Why we need it locally: To build the Docker image before pushing it to AWS. The image you build on your Mac is the exact same thing that runs in production.

Where it lives: /Applications/Docker.app. The Docker daemon runs in the background.

Verify it works:

bash
docker --version
# Should show: Docker version 2x.x.x

docker info
# Should show details about the Docker engine (not an error about daemon not running)

AWS CLI

What it is: A command-line tool that talks to Amazon Web Services. Instead of clicking around in the AWS web console, you type commands. This is how we push Docker images, update ECS services, check logs, and manage secrets.

Where it is configured: ~/.aws/credentials

This file contains your AWS access key and secret key. These are like a username and password for the AWS API. Our AWS account ID is 239732221658.

Verify it works:

bash
aws sts get-caller-identity
# Should show your account ID (239732221658), user ARN, and user ID

If this fails, run aws configure and enter the access key, secret key, region (eu-west-2), and output format (json).


Terraform

What it is: Infrastructure as code. Instead of logging into the AWS console and clicking "Create Database" or "Create Load Balancer," you write .tf files that describe what you want. Then you run terraform apply and Terraform creates everything for you.

Why this matters: If you click buttons in a web console, there is no record of what you did. If something breaks, you cannot easily recreate it. With Terraform, the entire infrastructure is described in files that live in version control. You can see exactly what changed and when.

Where it lives:

  • Terraform files: /Users/rajg/Codes/ai-softly/infrastructure/
  • State file: stored remotely in S3 bucket softly-terraform-state (so multiple people can work on infrastructure without conflicts)
  • Lock table: DynamoDB table softly-terraform-locks (prevents two people from making changes at the same time)

Verify it works:

bash
cd /Users/rajg/Codes/ai-softly/infrastructure
terraform version
# Should show: Terraform v1.x.x

terraform plan
# Should show what changes would be made (if any)

Wrangler

What it is: Cloudflare's CLI tool. It deploys static sites (HTML/CSS/JS) to Cloudflare Pages. We use it for the landing page and documentation site.

Where it is configured: ~/.wrangler/ contains OAuth tokens from when you logged in

Verify it works:

bash
wrangler whoami
# Should show your Cloudflare account name and ID

If it fails, run wrangler login and authenticate via the browser.


Vercel CLI

What it is: Vercel's command-line tool. Vercel hosts the Next.js web app (app.mysoftly.app). The CLI lets you deploy manually and manage domains.

Where it is configured: Authentication is handled via browser OAuth. No config files to manage manually.

Verify it works:

bash
vercel whoami
# Should show your Vercel username

If it fails, run vercel login and authenticate via the browser.


EAS CLI (Expo Application Services)

What it is: The tool that builds the React Native mobile app for iOS and Android. When you want to submit the app to the App Store or Google Play, EAS handles the build process in the cloud (so you do not need Xcode or Android Studio to produce release builds).

Verify it works:

bash
eas whoami
# Should show your Expo account username

Git + GitHub CLI (gh)

What it is: Git tracks every change to every file in the codebase. GitHub hosts the repository online. The gh CLI lets you create pull requests, view issues, and manage the repo from the terminal.

Where it is configured: Git credentials stored in the macOS keychain

Verify it works:

bash
git --version
gh auth status
# Should show: Logged in to github.com as rajgurung
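
Rather than running every verification command one by one, a single pass can report which tools are missing. The sketch below only checks that each command is on your PATH. It does not check versions or logins, and missing_tools is a hypothetical helper name.

```bash
# Print the name of every tool that is NOT found on PATH.
# Prints nothing when all tools are installed.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Usage:
#   missing_tools brew rbenv ruby node npm psql docker aws terraform wrangler vercel eas git gh
```

An empty result means every tool listed in this section is installed. The login checks (aws sts get-caller-identity, wrangler whoami, and so on) still need to be run separately.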

2. The Domain Journey

This section tells the full story of how typing mysoftly.app in a browser gets you to actual content.

Where the Domain Lives

The domain mysoftly.app is registered at GoDaddy. We own it. But GoDaddy only handles the registration (proving we own the name). We changed the nameservers from GoDaddy's defaults to Cloudflare's:

  • heather.ns.cloudflare.com
  • phil.ns.cloudflare.com

This means Cloudflare now controls ALL DNS for mysoftly.app. GoDaddy just points to Cloudflare and does nothing else.
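
You can confirm the nameserver change from a terminal. The sketch below assumes dig is available (it ships with macOS). all_cloudflare_ns is a hypothetical helper that checks each nameserver name ends in .ns.cloudflare.com.

```bash
# Succeeds when every nameserver read from stdin is a Cloudflare nameserver.
all_cloudflare_ns() {
  local ns seen=0
  while IFS= read -r ns; do
    [ -z "$ns" ] && continue
    seen=1
    ns="${ns%.}"   # dig prints a trailing dot; strip it
    case "$ns" in
      *.ns.cloudflare.com) ;;
      *) echo "not Cloudflare: $ns"; return 1 ;;
    esac
  done
  # Fail if no nameservers were read at all.
  [ "$seen" -eq 1 ]
}

# Usage:
#   dig +short NS mysoftly.app | all_cloudflare_ns && echo "Cloudflare controls DNS"
```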

What Cloudflare Does

Cloudflare sits between the internet and our servers. It provides:

  • DNS management: Controls which subdomain points where (like a phonebook that maps names to addresses)
  • Free SSL certificates: Encrypts traffic with HTTPS so nobody can snoop on data in transit
  • CDN (Content Delivery Network): Caches content on servers around the world so users get fast load times regardless of location
  • DDoS protection: Blocks floods of fake traffic designed to take a site offline
  • Cloudflare Pages: Hosts static sites directly on Cloudflare's network (no separate server needed)

The DNS Records

Here is every DNS record we have set up and why:

| Record | Type | Points To | Proxied? | Purpose |
| --- | --- | --- | --- | --- |
| mysoftly.app | CNAME | softly-landing.pages.dev | Yes (orange cloud) | Landing page hosted on Cloudflare Pages |
| app.mysoftly.app | A | 76.76.21.21 | No (grey cloud) | Web app hosted on Vercel. DNS only because Vercel handles its own SSL |
| api.mysoftly.app | CNAME | softly-production-1592142071.eu-west-2.elb.amazonaws.com | No (grey cloud) | Rails API on AWS. DNS only because the ALB does not have SSL yet |
| docs.mysoftly.app | CNAME | softly-docs.pages.dev | Yes (orange cloud) | Documentation hosted on Cloudflare Pages |
| www | CNAME | mysoftly.app | Yes (orange cloud) | Redirects www.mysoftly.app to mysoftly.app |
| _amazonses | TXT | verification token | DNS only | Proves to AWS that we own the domain (required for sending email via SES) |
| 3x _domainkey | CNAME | DKIM tokens | DNS only | Email authentication. Recipients can verify emails really came from us |
| _dmarc | TXT | DMARC policy | DNS only | Tells email providers what to do with emails that fail authentication checks |

Proxied vs DNS Only

This is one of the most confusing parts of Cloudflare, so here is a clear explanation:

Proxied (orange cloud icon): When a record is proxied, traffic goes through Cloudflare first. Cloudflare adds SSL encryption, caching, and DDoS protection, then forwards the request to the actual server. The real server's IP address is hidden from the public. This is the default and recommended setting.

DNS only (grey cloud icon): When a record is DNS only, Cloudflare simply tells the browser where to go. Traffic goes directly to the destination server. Cloudflare does not add any protection or caching. We use this when the destination already handles its own SSL (like Vercel) or when the destination does not support being behind a proxy.
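
One way to see which mode a hostname is in is to look at its response headers: responses that pass through Cloudflare's proxy carry a cf-ray header. The sketch below treats the absence of that header as "dns-only", which is an inference rather than a guarantee. proxy_mode is a hypothetical helper name.

```bash
# Read HTTP response headers on stdin and report whether Cloudflare proxied them.
proxy_mode() {
  if grep -qi '^cf-ray:'; then
    echo "proxied"
  else
    echo "dns-only"
  fi
}

# Usage:
#   curl -sI https://mysoftly.app | proxy_mode       # orange cloud record
#   curl -sI https://app.mysoftly.app | proxy_mode   # grey cloud record
```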


3. How Each Site Gets Deployed

Landing Page (mysoftly.app) -- Cloudflare Pages

The landing page is a static site built with Astro.

  1. Code lives in the rajgurung/softly-landing GitHub repository
  2. Run npm run build to generate static HTML/CSS/JS files in the dist/ folder
  3. Run wrangler pages deploy dist --project-name=softly-landing to upload those files to Cloudflare
  4. Cloudflare serves the site globally on softly-landing.pages.dev
  5. The custom domain mysoftly.app has a CNAME record pointing to softly-landing.pages.dev
  6. Cloudflare adds free SSL automatically since the record is proxied

Full deploy command:

bash
cd landing && npm run build && wrangler pages deploy dist --project-name=softly-landing --commit-dirty=true --branch=main

Documentation (docs.mysoftly.app) -- Cloudflare Pages

The docs site is built with VitePress (a static site generator for documentation).

  1. Code lives in the docs/ folder of the main ai-softly repository
  2. Run npm run build to generate static files in .vitepress/dist/
  3. Run wrangler pages deploy .vitepress/dist --project-name=softly-docs to upload to Cloudflare
  4. Cloudflare serves it on softly-docs.pages.dev
  5. Custom domain docs.mysoftly.app CNAME points to softly-docs.pages.dev

Full deploy command:

bash
cd docs && npm run build && wrangler pages deploy .vitepress/dist --project-name=softly-docs --commit-dirty=true --branch=main

Web App (app.mysoftly.app) -- Vercel

The web app is built with Next.js.

  1. Code lives in the rajgurung/softly-web GitHub repository
  2. Connected to GitHub for automatic deployment. Push to main and Vercel builds and deploys automatically
  3. Vercel builds the Next.js app and runs it with serverless functions
  4. Custom domain added via vercel domains add app.mysoftly.app
  5. DNS: an A record pointing to Vercel's IP 76.76.21.21 (DNS only, because Vercel handles its own SSL)

API (api.mysoftly.app) -- AWS ECS

The API is a Ruby on Rails application.

  1. Code lives in the rajgurung/ai-softly repository under backend/
  2. Build a Docker image: docker build --platform linux/amd64 -t ECR_URL:latest .
    • --platform linux/amd64 is important because AWS runs on Intel/AMD processors, but your Mac might use Apple Silicon (ARM). This flag ensures the image is built for the right architecture
  3. Push the image to ECR (Amazon's Docker registry): docker push ECR_URL:latest
  4. Tell ECS to deploy: aws ecs update-service --cluster softly-production --service softly-production --force-new-deployment --region eu-west-2
  5. ECS pulls the new image from ECR and starts a new container
  6. The ALB (load balancer) detects the new container is healthy and routes traffic to it
  7. DNS: CNAME record pointing to the ALB address
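
To see why the --platform flag in step 2 matters on your machine, compare your Mac's CPU architecture with the one the image was built for. This is a sketch: host_platform is a hypothetical helper, and ECR_URL stands in for the real repository URL as in the steps above.

```bash
# Map "uname -m" output to the Docker platform string for that CPU.
host_platform() {
  case "$1" in
    arm64|aarch64) echo "linux/arm64" ;;
    x86_64|amd64)  echo "linux/amd64" ;;
    *)             echo "unknown" ;;
  esac
}

# Usage:
#   host_platform "$(uname -m)"   # Apple Silicon Macs report arm64
#   docker image inspect --format '{{.Os}}/{{.Architecture}}' ECR_URL:latest
#   # The image should report linux/amd64 regardless of the host.
```

If the host reports linux/arm64 and the image reports linux/amd64, the flag did its job.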

Full deploy commands:

bash
cd backend
docker build --platform linux/amd64 -t 239732221658.dkr.ecr.eu-west-2.amazonaws.com/softly-app:latest .
aws ecr get-login-password --region eu-west-2 | docker login --username AWS --password-stdin 239732221658.dkr.ecr.eu-west-2.amazonaws.com
docker push 239732221658.dkr.ecr.eu-west-2.amazonaws.com/softly-app:latest
aws ecs update-service --cluster softly-production --service softly-production --force-new-deployment --region eu-west-2

Note: The API currently runs on HTTP only. HTTPS via ACM certificate is a pending task (ticket: mobile-1om).


4. AWS Architecture Deep Dive

All AWS resources live in the eu-west-2 region (London). Here is what each service does, what it costs, and how to check on it.

ECS (Elastic Container Service) -- Our App Server

What it is in plain English: ECS runs Docker containers in the cloud. Think of it as a hosting service that takes a packaged-up version of your app and runs it for you. We use "Fargate" mode, which means AWS manages the underlying servers. We never have to worry about installing operating system updates, replacing hard drives, or any hardware at all.

Our setup:

  • Cluster name: softly-production
  • Service name: softly-production
  • Each task gets: 512 CPU units (half a virtual CPU) and 1024 MiB memory (1 GB RAM)
  • Auto-scaling: minimum 1 container, maximum 3. Scales up when CPU exceeds 70%. Currently always runs 1
  • Container port: 3000 (Rails default)

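How to check its status:

  • AWS Console: search for "ECS" and click "Clusters"
  • CLI: aws ecs describe-services --cluster softly-production --services softly-production --region eu-west-2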

Monthly cost: ~$10-15


ECR (Elastic Container Registry) -- Docker Image Storage

What it is in plain English: A private storage locker for Docker images. Every time you build a new version of the backend, you push the image here. When ECS needs to start a container, it pulls the image from ECR.

Our setup:

  • Repository URL: 239732221658.dkr.ecr.eu-west-2.amazonaws.com/softly-app
  • Lifecycle policy: keeps the last 10 images, automatically deletes older ones to save storage costs

How to check its status:

  • AWS Console: search for "ECR" and click "Repositories"
  • CLI: aws ecr describe-images --repository-name softly-app --region eu-west-2

Monthly cost: Less than $1


RDS (Relational Database Service) -- PostgreSQL

What it is in plain English: A managed PostgreSQL database. "Managed" means AWS handles backups, security patches, and hardware failures. You just read and write data. This is where all application data lives: users, tasks, documents, sessions, everything.

Our setup:

  • Endpoint: softly-production.cxo0ygu6alg0.eu-west-2.rds.amazonaws.com
  • Instance type: db.t3.micro (the smallest/cheapest option, free tier eligible)
  • Database name: softly_production
  • PostgreSQL version: 16
  • Encrypted at rest (data on disk is encrypted)
  • Automated daily backups with 1-day retention
  • The same database is used for four purposes (defined in backend/config/database.yml):
    • primary -- the main application data
    • cache -- Solid Cache (Rails built-in caching backed by the database)
    • queue -- Solid Queue (background job processing backed by the database)
    • cable -- Solid Cable (WebSocket/ActionCable backed by the database)

How to check its status:

  • AWS Console: search for "RDS" and click "Databases"
  • CLI: aws rds describe-db-instances --db-instance-identifier softly-production --region eu-west-2

Monthly cost: ~$15


ElastiCache -- Redis

What it is in plain English: A fast, in-memory data store. "In-memory" means it stores data in RAM instead of on a hard drive, making reads and writes extremely fast. We use it for Rails caching. Solid Queue background jobs, by contrast, are stored in PostgreSQL (see the RDS section above), not in Redis.

Our setup:

  • Endpoint: softly-production.qlqnxr.0001.euw2.cache.amazonaws.com
  • Instance type: cache.t3.micro
  • Redis version: 7.0
  • Port: 6379

How to check its status:

  • AWS Console: search for "ElastiCache" and click "Redis caches"
  • CLI: aws elasticache describe-cache-clusters --cache-cluster-id softly-production --show-cache-node-info --region eu-west-2

Monthly cost: ~$13


S3 (Simple Storage Service) -- File Storage

What it is in plain English: Practically unlimited file storage in the cloud. Users upload documents through the app (passport photos, ID scans, etc.), and those files are stored in S3 via Rails Active Storage. S3 is designed for 99.999999999% (eleven nines) durability, which means stored files are essentially never lost.

Our setup:

  • Bucket name: softly-storage-production
  • Used for: Active Storage uploads (document vault)
  • There are also 3 older static site buckets that are no longer used (replaced by Cloudflare Pages)

How to check its status:

  • AWS Console: search for "S3" and click "Buckets"
  • CLI: aws s3 ls s3://softly-storage-production --region eu-west-2

Monthly cost: Less than $1


ALB (Application Load Balancer)

What it is in plain English: A traffic cop that sits between the internet and your application containers. When a request comes in, the ALB checks which containers are healthy and forwards the request to one of them. If you have multiple containers running (during high traffic), the ALB distributes requests evenly.

Our setup:

  • DNS name: softly-production-1592142071.eu-west-2.elb.amazonaws.com
  • Lives in the public subnets (can receive traffic from the internet)
  • Forwards HTTP traffic to ECS containers in private subnets on port 3000
  • Health check: sends GET /up to port 3000. If the container responds with 200 OK, the ALB considers it healthy
  • Currently HTTP only. HTTPS requires an ACM (AWS Certificate Manager) certificate, which is pending

How to check its status:

  • AWS Console: search for "EC2" then click "Load Balancers" in the sidebar
  • CLI: aws elbv2 describe-load-balancers --region eu-west-2

Monthly cost: ~$5


SSM Parameter Store -- Secrets Management

What it is in plain English: A secure vault for storing passwords, API keys, and other sensitive configuration. Instead of putting secrets in code (where anyone who reads the code gets all your passwords), you store them in SSM. The ECS containers read these values when they start up.

Our secrets:

| Parameter Path | What It Is |
| --- | --- |
| /softly/production/database_url | PostgreSQL connection string (includes host, username, password, database name) |
| /softly/production/rails_master_key | Decrypts Rails credentials file (needed to start the app) |
| /softly/production/anthropic_api_key | API key for Claude AI (powers the AI features) |
| /softly/production/deepgram_api_key | API key for Deepgram (voice/speech transcription) |
| /softly/production/revenuecat_api_key | API key for RevenueCat (subscription management) -- currently a placeholder |
| /softly/production/new_relic_license_key | License key for New Relic (performance monitoring) -- currently a placeholder |

All parameters are stored as SecureString type, which means they are encrypted at rest using AWS KMS.

How to check:

bash
aws ssm describe-parameters --region eu-west-2 --query "Parameters[?starts_with(Name, '/softly/')]"

(This lists parameter names without revealing values. To see a value, use aws ssm get-parameter --name "/softly/production/PARAM_NAME" --with-decryption --region eu-west-2)
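
If you only want the secret itself rather than the full JSON response, the CLI's --query and --output options can trim the output. A minimal sketch, with hypothetical helper names param_name and get_param. Keep in mind that the value is printed to your terminal.

```bash
# Build the full SSM parameter name for a production secret.
param_name() {
  echo "/softly/production/$1"
}

# Print only the decrypted value of one parameter (requires AWS credentials).
get_param() {
  aws ssm get-parameter \
    --name "$(param_name "$1")" \
    --with-decryption \
    --region eu-west-2 \
    --query 'Parameter.Value' \
    --output text
}

# Usage:
#   get_param database_url
```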


SES (Simple Email Service)

What it is in plain English: AWS's email sending service. When the app needs to send a password reset email or a verification email, it uses SES.

Our setup:

  • Domain: mysoftly.app (verified via a TXT record in Cloudflare DNS)
  • DKIM: 3 CNAME records that prove emails from mysoftly.app are legitimate (not forged)
  • DMARC: a TXT record that tells receiving email servers what to do with emails that fail authentication
  • Currently in sandbox mode: This means we can only send emails to verified email addresses. To send to any email address, we need to request production access from AWS

How to check:

  • AWS Console: search for "SES"
  • CLI: aws ses get-account-sending-enabled --region eu-west-2

VPC and Networking

What it is in plain English: A VPC (Virtual Private Cloud) is a private, isolated network within AWS. Everything we run on AWS lives inside this network. It is like renting a private office building where you control who can enter and which rooms connect to each other.

Our setup:

  • VPC IP range: 10.0.0.0/16 (65,536 private IP addresses)
  • 4 subnets across 2 Availability Zones (physically separate data centers in London):
    • Public subnet A (10.0.1.0/24): ALB and NAT Gateway live here
    • Public subnet B (10.0.2.0/24): ALB spans both public subnets
    • Private subnet A (10.0.3.0/24): ECS tasks, RDS primary, Redis
    • Private subnet B (10.0.4.0/24): ECS tasks, RDS standby
  • NAT Gateway: Lives in the public subnet. Allows resources in private subnets (like ECS containers) to access the internet (to pull Docker images, call external APIs) without being directly reachable from the internet. One-way door: traffic goes out, responses come back, but nobody from outside can initiate a connection in
  • Internet Gateway: The front door of the VPC. Connects the public subnets to the internet
  • Security Groups: Firewall rules attached to each resource:
    • ALB security group: accepts HTTP traffic from the internet
    • ECS security group: only accepts traffic from the ALB
    • RDS security group: only accepts traffic from ECS
    • Redis security group: only accepts traffic from ECS on port 6379
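
The address counts follow directly from the number after the slash: a /N range contains 2^(32-N) addresses. The short sketch below reproduces the figures for the VPC and its subnets. cidr_size is a hypothetical helper name.

```bash
# Number of IP addresses in an IPv4 CIDR block with the given prefix length.
cidr_size() {
  echo $(( 1 << (32 - $1) ))
}

cidr_size 16   # the VPC range 10.0.0.0/16
cidr_size 24   # each subnet, for example 10.0.1.0/24
```

This prints 65536 and 256. AWS reserves 5 addresses in every subnet, so each /24 subnet has 251 usable addresses.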

Monthly cost (NAT Gateway): ~$4 plus data transfer fees


5. Credentials and Auth -- Where Everything Lives

| Tool | Config Location | How to Verify | How to Refresh |
| --- | --- | --- | --- |
| AWS CLI | ~/.aws/credentials | aws sts get-caller-identity | aws configure |
| Cloudflare API | ~/.cloudflare/token | Check Cloudflare dashboard | Create new token in Cloudflare dashboard |
| Wrangler | ~/.wrangler/config/default.toml | wrangler whoami | wrangler login |
| Vercel | OAuth session (browser-based) | vercel whoami | vercel login |
| Docker/ECR | Session token (expires after 12 hours) | docker info | aws ecr get-login-password --region eu-west-2 \| docker login --username AWS --password-stdin 239732221658.dkr.ecr.eu-west-2.amazonaws.com |
| GitHub | macOS Keychain | gh auth status | gh auth login |
| EAS (Expo) | OAuth session | eas whoami | eas login |

6. Common Operations

Deploy the Backend

bash
cd backend

# Step 1: Build the Docker image for Linux (AWS uses Linux, your Mac might be ARM)
docker build --platform linux/amd64 -t 239732221658.dkr.ecr.eu-west-2.amazonaws.com/softly-app:latest .

# Step 2: Log in to ECR (token expires after 12 hours)
aws ecr get-login-password --region eu-west-2 | docker login --username AWS --password-stdin 239732221658.dkr.ecr.eu-west-2.amazonaws.com

# Step 3: Push the image to ECR
docker push 239732221658.dkr.ecr.eu-west-2.amazonaws.com/softly-app:latest

# Step 4: Tell ECS to pull the new image and restart
aws ecs update-service --cluster softly-production --service softly-production --force-new-deployment --region eu-west-2

The new container typically takes 1-2 minutes to start serving traffic. ECS does a rolling deployment: it starts the new container, waits for it to pass health checks, then stops the old one. There is no downtime.
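
To follow the rollout instead of guessing when it has finished, you can ask ECS for the state of the current deployment. The sketch below assumes the service uses the standard ECS rolling deployment, which reports a rolloutState of IN_PROGRESS, COMPLETED, or FAILED. rollout_message is a hypothetical helper name.

```bash
# Turn the rolloutState of the PRIMARY deployment into a plain-English message.
rollout_message() {
  case "$1" in
    COMPLETED)   echo "deployment finished" ;;
    IN_PROGRESS) echo "deployment still rolling out" ;;
    FAILED)      echo "deployment failed, check the service events" ;;
    *)           echo "unknown state: $1" ;;
  esac
}

# Usage (requires AWS credentials):
#   state="$(aws ecs describe-services --cluster softly-production \
#     --services softly-production --region eu-west-2 \
#     --query "services[0].deployments[?status=='PRIMARY'].rolloutState | [0]" \
#     --output text)"
#   rollout_message "$state"
```

If you would rather block until the service settles, aws ecs wait services-stable --cluster softly-production --services softly-production --region eu-west-2 returns once the deployment is stable.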


Run a One-Off Rails Command in Production

Sometimes you need to run a command like db:seed or db:migrate in production. This spins up a temporary container that runs the command and then shuts down:

bash
aws ecs run-task \
  --cluster softly-production \
  --task-definition softly-production \
  --launch-type FARGATE \
  --network-configuration "awsvpcConfiguration={subnets=[SUBNET_ID],securityGroups=[SG_ID],assignPublicIp=DISABLED}" \
  --overrides '{"containerOverrides":[{"name":"rails","command":["bin/rails","db:seed"]}]}' \
  --region eu-west-2

Replace SUBNET_ID and SG_ID with the actual values from your infrastructure. You can find them in the Terraform state or the AWS console.
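
Another way to find those values is to ask the running service which subnets and security groups it already uses. The sketch below shows that lookup and a small hypothetical helper, network_config, that assembles the --network-configuration string. The IDs in the usage line are made-up placeholders.

```bash
# Build the --network-configuration value from a subnet ID and a security group ID.
network_config() {
  printf 'awsvpcConfiguration={subnets=[%s],securityGroups=[%s],assignPublicIp=DISABLED}\n' "$1" "$2"
}

# Look up the IDs the running service uses (requires AWS credentials):
#   aws ecs describe-services --cluster softly-production \
#     --services softly-production --region eu-west-2 \
#     --query 'services[0].networkConfiguration.awsvpcConfiguration'
#
# Usage with placeholder IDs (hypothetical values, not real resources):
#   network_config subnet-0123456789abcdef0 sg-0123456789abcdef0
```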


Deploy the Landing Page

bash
cd landing && npm run build && wrangler pages deploy dist --project-name=softly-landing --commit-dirty=true --branch=main

Deploy the Documentation

bash
cd docs && npm run build && wrangler pages deploy .vitepress/dist --project-name=softly-docs --commit-dirty=true --branch=main

Check if the API is Healthy

bash
# Via the custom domain
curl http://api.mysoftly.app/up

# Directly via the ALB (bypasses DNS)
curl http://softly-production-1592142071.eu-west-2.elb.amazonaws.com/up

A healthy response returns HTTP 200. If it fails, the container might be restarting or unhealthy.
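
By default curl prints the response body, not the status code, so it helps to ask for the code explicitly. A minimal sketch, with a hypothetical helper health_verdict that interprets the code:

```bash
# Translate the HTTP status code returned by /up into a short verdict.
health_verdict() {
  case "$1" in
    200) echo "healthy" ;;
    000) echo "unreachable" ;;   # curl prints 000 when it could not connect at all
    *)   echo "unhealthy (HTTP $1)" ;;
  esac
}

# Usage:
#   code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 http://api.mysoftly.app/up)"
#   health_verdict "$code"
```

If the custom domain is unreachable but the direct ALB address is healthy, the problem is more likely DNS than the container.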


Check ECS Container Logs

bash
# Show up to 5 log events containing "ERROR" (results come back oldest first; add --start-time to narrow the window)
aws logs filter-log-events \
  --log-group-name /ecs/softly-production \
  --filter-pattern "ERROR" \
  --limit 5 \
  --region eu-west-2

You can also view logs in the AWS Console: go to CloudWatch, then Log Groups, then /ecs/softly-production.
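
To limit a log search to recent events, filter-log-events accepts --start-time as milliseconds since the Unix epoch. The sketch below computes that timestamp. ms_ago is a hypothetical helper name.

```bash
# Milliseconds since the Unix epoch for a moment N seconds in the past.
ms_ago() {
  echo $(( ($(date +%s) - $1) * 1000 ))
}

# Usage: only search the last hour (3600 seconds)
#   aws logs filter-log-events \
#     --log-group-name /ecs/softly-production \
#     --filter-pattern "ERROR" \
#     --start-time "$(ms_ago 3600)" \
#     --limit 5 \
#     --region eu-west-2
```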


Make Infrastructure Changes

bash
cd infrastructure

# Step 1: See what would change (safe, does not modify anything)
terraform plan

# Step 2: Apply the changes (will ask for confirmation)
terraform apply

Always run terraform plan first and read the output carefully. It tells you exactly what will be created, modified, or destroyed.
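
Two standard Terraform options make this workflow safer. -out saves the plan so that apply runs exactly what you reviewed, and -detailed-exitcode lets a script tell "no changes" from "changes pending". A sketch, with a hypothetical helper plan_verdict and an arbitrary plan file name tfplan:

```bash
# Interpret the exit code of "terraform plan -detailed-exitcode".
# 0 = no changes, 2 = changes pending, anything else = the plan failed.
plan_verdict() {
  case "$1" in
    0) echo "no changes" ;;
    2) echo "changes pending" ;;
    *) echo "plan failed" ;;
  esac
}

# Usage (from the infrastructure directory):
#   terraform plan -detailed-exitcode -out=tfplan; plan_verdict "$?"
#   terraform apply tfplan   # applies exactly the plan that was reviewed
```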


7. Known Issues and Next Steps

  • API runs on HTTP only. The ALB does not have an SSL certificate. HTTPS support requires setting up an ACM certificate and adding an HTTPS listener to the ALB. Tracked in ticket mobile-1om
  • SES is in sandbox mode. Emails can only be sent to verified addresses. To send to real users, request production access from AWS via the SES console
  • RevenueCat and New Relic SSM parameters are placeholders. The values need to be updated when those services are set up
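  • Web app needs API URL configured. The Vercel deployment needs the environment variable NEXT_PUBLIC_API_URL set to http://api.mysoftly.app (and updated to https:// once HTTPS is enabled). Because the web app itself is served over HTTPS, browsers will block in-page requests to an http:// API as mixed content, so browser-side API calls depend on the HTTPS work above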
  • App Store submission is still pending. The mobile app needs to be submitted via EAS

8. Cost Breakdown

| Service | Monthly Cost |
| --- | --- |
| ECS Fargate (1 task, 0.5 vCPU, 1 GB RAM) | ~$10-15 |
| RDS db.t3.micro (PostgreSQL 16) | ~$15 |
| ElastiCache cache.t3.micro (Redis 7) | ~$13 |
| NAT Gateway | ~$4 + data transfer |
| ALB | ~$5 |
| S3 | Less than $1 |
| ECR | Less than $1 |
| Cloudflare (Pages, DNS, SSL, CDN) | Free |
| Vercel (web app hosting) | Free |
| Total | ~$50-60/month |

All AWS resources are in eu-west-2 (London). Costs can vary slightly based on data transfer and actual usage. The ECS auto-scaling (1-3 containers) means compute costs increase during traffic spikes, but the maximum is capped at 3 containers.
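
As a sanity check, the fixed line items in the table above can be added up directly. The sketch below counts each "Less than $1" item as $0 in the low estimate and $1 in the high estimate. That rounding is an assumption made here for illustration.

```bash
# Monthly estimates in whole dollars, in table order:
# ECS, RDS, ElastiCache, NAT Gateway, ALB, S3, ECR
low=$((  10 + 15 + 13 + 4 + 5 + 0 + 0 ))
high=$(( 15 + 15 + 13 + 4 + 5 + 1 + 1 ))
echo "fixed line items: \$${low}-\$${high} per month"
```

This prints a range of $47-$54. The stated total of ~$50-60 presumably leaves headroom for data transfer, which is billed on top of the NAT Gateway's base price.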

Internal documentation — not for public distribution