AI Services Architecture

Softly uses Anthropic's Claude 3.5 Haiku for all AI features. The architecture prioritizes low cost, low latency, and structured output.

Model Choice

Model	Use Case	Why
Claude 3.5 Haiku	Task parsing, editing, suggestions	Fast (~500ms), cheap (~$0.001/call), good enough for structured extraction

Sonnet was evaluated but rejected for this use case. Haiku matches Sonnet's accuracy for structured extraction tasks at 1/10th the cost. The prompts are narrow enough that the smaller model handles them reliably.

Service Architecture

Three AI services live under app/services/ai/:

TaskParser

Converts natural language into structured task fields.

Input: Free text + user's local date

Output: Title, due date, category, recurrence interval, reminder days

Prompt approach: System prompt defines the output JSON schema and the available categories. The user's local date is injected so relative dates ("next Tuesday", "in 3 months") resolve correctly. The model returns a JSON object that the service validates.
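
The prompt construction and validation described above can be sketched roughly as follows. This is a minimal illustration, not the actual service code: the category list, the JSON key names, and the helper names `build_parse_prompt` and `validate_parse` are all assumptions made for the example.

```ruby
require "json"
require "date"

# Hypothetical category list; the real one is not given in this document.
CATEGORIES = %w[home health finance car other].freeze

class ParseError < StandardError; end

# Builds a system prompt containing the output schema, the allowed
# categories, and the user's local date (so relative dates can resolve).
def build_parse_prompt(local_date)
  <<~PROMPT
    Today is #{local_date.iso8601} (#{local_date.strftime("%A")}) in the user's time zone.
    Return only a JSON object with these keys:
      title (string), due_date (YYYY-MM-DD or null),
      category (one of: #{CATEGORIES.join(", ")}),
      recurrence_interval (string or null), reminder_days (integer or null)
  PROMPT
end

# Validates the raw model reply. Title, category, and due date are checked;
# the remaining fields are passed through unchanged in this sketch.
def validate_parse(raw)
  data = JSON.parse(raw)
  raise ParseError, "expected a JSON object" unless data.is_a?(Hash)

  title = data["title"].to_s.strip
  raise ParseError, "missing title" if title.empty?
  raise ParseError, "unknown category" unless CATEGORIES.include?(data["category"])

  due = data["due_date"] && Date.iso8601(data["due_date"])
  { title: title, due_date: due, category: data["category"],
    recurrence_interval: data["recurrence_interval"],
    reminder_days: data["reminder_days"] }
rescue JSON::ParserError, ArgumentError, TypeError => e
  # Malformed JSON or an unparseable date both surface as ParseError.
  raise ParseError, "invalid model output: #{e.message}"
end
```

Validating after the call means a malformed or off-schema reply becomes a `ParseError` rather than a half-populated task.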

TaskEditor

Parses natural language edit instructions against existing tasks.

Input: Edit instruction + user's current task list + user's local date

Output: Task ID to modify + field changes

Prompt approach: The user's incomplete tasks (with IDs and titles) are included in the system prompt. The model matches the instruction to a task and returns the changes as JSON. This allows edits like "move dentist to Friday" to resolve "dentist" to the correct task.
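
A rough sketch of this matching step, under stated assumptions: the `Task` struct, the `task_id`/`changes` key names, the list of editable fields, and both helper names are hypothetical and chosen only for illustration.

```ruby
require "json"

class EditError < StandardError; end

# Hypothetical minimal task record.
Task = Struct.new(:id, :title, :completed)

# Assumed set of fields the model is allowed to change.
EDITABLE_FIELDS = %w[title due_date category recurrence_interval reminder_days].freeze

# Lists only incomplete tasks, one "id: title" line each, in the system prompt.
def build_edit_prompt(tasks, local_date)
  lines = tasks.reject(&:completed).map { |t| "#{t.id}: #{t.title}" }
  "Today is #{local_date}. The user's open tasks:\n#{lines.join("\n")}\n" \
    'Return only JSON: {"task_id": <id>, "changes": {<field>: <value>}}'
end

# Checks that the reply points at one of the listed tasks and only
# touches known fields.
def validate_edit(raw, tasks)
  data = JSON.parse(raw)
  raise EditError, "expected a JSON object" unless data.is_a?(Hash)

  open_ids = tasks.reject(&:completed).map(&:id)
  raise EditError, "no matching task" unless open_ids.include?(data["task_id"])

  changes = data["changes"]
  raise EditError, "no changes given" unless changes.is_a?(Hash) && !changes.empty?

  unknown = changes.keys - EDITABLE_FIELDS
  raise EditError, "unknown fields: #{unknown.join(", ")}" unless unknown.empty?

  { task_id: data["task_id"], changes: changes }
rescue JSON::ParserError, TypeError => e
  raise EditError, "invalid model output: #{e.message}"
end
```

Checking the returned ID against the same list that was sent guards against the model inventing an ID or picking a task that was never offered.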

SuggestionGenerator

Generates proactive task suggestions based on existing tasks, with recycling support.

Input: User's current task list + dismissed suggestion titles

Output: Array of suggested tasks with titles, categories, and reasons

Two-tier approach:

Users with no task history get curated starter suggestions from a pool of 16 (no AI call)
Users with task history get AI-generated suggestions based on gap analysis

Recycling: The generator queries DismissedSuggestion to exclude previously dismissed or added titles. For starter suggestions, these are filtered from the pool. For AI suggestions, dismissed titles are included in the prompt so Claude avoids repeating them, and results are double-checked with a post-filter.

The used_ai? method tracks whether an AI call was actually made, so starter suggestions don't count against the usage limit.
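
The two-tier flow, the exclusion filter, and the `used_ai?` flag might fit together along these lines. This is a sketch under assumptions: the starter pool here is a three-item stand-in for the real pool of 16, `ai_client` is a hypothetical callable standing in for the Claude request, and the constructor arguments are invented for the example.

```ruby
# Stand-in for the curated starter pool (the real pool has 16 entries).
STARTER_POOL = [
  { title: "Replace smoke detector batteries", category: "home" },
  { title: "Schedule annual checkup", category: "health" },
  { title: "Review insurance coverage", category: "finance" }
].freeze

class SuggestionGenerator
  # ai_client: any callable taking (tasks, excluded_titles) and returning an
  # array of { title:, category:, reason: } hashes (assumed interface).
  def initialize(tasks:, excluded_titles:, ai_client:)
    @tasks = tasks
    @excluded = excluded_titles.map { |t| normalize(t) }
    @ai_client = ai_client
    @used_ai = false
  end

  # True only if call actually invoked the AI client.
  def used_ai?
    @used_ai
  end

  def call
    candidates =
      if @tasks.empty?
        STARTER_POOL # tier 1: no task history, no AI call
      else
        @used_ai = true
        @ai_client.call(@tasks, @excluded) # tier 2: AI-generated
      end
    # Post-filter: applied to both tiers, so a repeated title from the
    # model is still dropped.
    candidates.reject { |s| @excluded.include?(normalize(s[:title])) }
  end

  private

  # Case- and whitespace-insensitive comparison key.
  def normalize(title)
    title.to_s.strip.downcase
  end
end
```

A caller would then check `generator.used_ai?` after `call` and increment the usage counter only when it returns true.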

Cost Model

Based on Claude 3.5 Haiku pricing:

Metric	Value
Average tokens per parse	~200 input + ~100 output
Cost per parse	~$0.001
Free tier uses/month	5
Pro tier uses/month	200
Expected avg uses/user/month	~20
Cost per user/month	~$0.02

At scale (10,000 users), the AI cost would be approximately $200/month.
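
The arithmetic behind these figures can be reproduced with a few lines. The sketch below uses only the approximate numbers from the table; the constant and method names are illustrative, and the per-tier ceilings are derived here from the tier limits rather than stated in the original table.

```ruby
COST_PER_CALL = 0.001 # dollars per AI call, approximate figure from the table
AVG_USES = 20         # expected average uses per user per month
TIER_LIMITS = { free: 5, pro: 200 }.freeze

# Expected monthly AI spend in dollars for a given number of users.
def monthly_cost(users:, uses_per_user: AVG_USES)
  (users * uses_per_user * COST_PER_CALL).round(3)
end

# Upper bound on monthly spend for one user who exhausts their tier limit.
def max_cost_per_user(tier)
  (TIER_LIMITS.fetch(tier) * COST_PER_CALL).round(3)
end

monthly_cost(users: 1)       # => 0.02
monthly_cost(users: 10_000)  # => 200.0
max_cost_per_user(:free)     # => 0.005
max_cost_per_user(:pro)      # => 0.2
```

Because usage is capped per tier, the per-user cost has a hard ceiling: about half a cent for a free user and about 20 cents for a Pro user who uses every credit.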

Error Handling

Each service defines a custom error class (ParseError, EditError, SuggestionError). The controller rescues these and returns 422 with the error message. Network errors to the Anthropic API bubble up as 500s and are logged.
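
The mapping from exceptions to status codes can be illustrated in plain Ruby. This is a framework-free sketch, not the controller itself: `respond_with_ai` is a hypothetical helper that returns a `[status, body]` pair. In a Rails controller the 500 branch would normally be left to the framework's default handling of uncaught exceptions.

```ruby
class ParseError < StandardError; end
class EditError < StandardError; end
class SuggestionError < StandardError; end

# Runs the given block (the AI service call) and returns [status, body].
def respond_with_ai
  [200, yield]
rescue ParseError, EditError, SuggestionError => e
  # Service-level failures: the message is passed back to the client.
  [422, { error: e.message }]
rescue StandardError => e
  # Anything else (for example a network failure) is logged and becomes a 500.
  warn "AI request failed: #{e.class}: #{e.message}"
  [500, { error: "Internal server error" }]
end
```

For example, `respond_with_ai { raise ParseError, "Could not understand that task" }` yields a 422 with that message, while an `IOError` raised inside the block yields a 500 with a generic body.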

Rate Limiting

Rate limiting is enforced at the user level, not the API level:

Free: 5 AI tasks/month
Pro: 200 AI tasks/month
Family: 200/month shared across the household, with a per-member daily throttle when the pool is depleted

Usage resets on the 1st of each month. The usage_json helper returns current usage in every AI response so the mobile app can display remaining credits.
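
A simplified model of the per-user monthly limit and the usage payload, assuming an in-memory counter keyed by calendar month. The `UsageTracker` class, the `LimitReached` error, and the field names in the `usage_json` hash are illustrative guesses, not the actual schema. The Family pool and its daily throttle are left out because their rules are not specified here.

```ruby
require "date"

MONTHLY_LIMITS = { free: 5, pro: 200 }.freeze

class LimitReached < StandardError; end

# Hypothetical in-memory tracker; real storage is not described in this document.
class UsageTracker
  def initialize(plan)
    @limit = MONTHLY_LIMITS.fetch(plan)
    @counts = Hash.new(0) # [year, month] => number of AI uses
  end

  # Records one AI use, or raises if this month's limit is already spent.
  def record!(today)
    raise LimitReached, "monthly AI limit reached" if remaining(today).zero?

    @counts[period(today)] += 1
  end

  def remaining(today)
    [@limit - @counts[period(today)], 0].max
  end

  # Payload a response could include so the client can show remaining credits.
  def usage_json(today)
    resets_on = Date.new(today.year, today.month, 1).next_month
    { used: @counts[period(today)], limit: @limit,
      remaining: remaining(today), resets_on: resets_on.iso8601 }
  end

  private

  # Counting per calendar month gives the reset on the 1st without a scheduled job.
  def period(date)
    [date.year, date.month]
  end
end
```

Keying the counter by month means no explicit reset step is needed: a request on the 1st simply starts a new bucket.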
