Changelog

Every release, every shipped feature. We ship in public.

Follow updates on Twitter, or write to us at info@eigon.io.

v0.11.0

minor 2026-05-15

Highlights

▸Deploys won't fail because of us. New pre-flight gate validates every project, environment, source, region and GitHub repo before the pipeline kicks off. Transient AWS errors auto-retry. Every terminal failure is classified as Platform or User, and the dashboard shows actionable copy, never a generic red banner
▸Production approval gate. Per-environment toggle blocks deploys until N teammates approve. Self-approval blocked. The approval pane lives directly on the deployment detail page with audit trail
▸Image-build LLM autofix. When CodeBuild's BUILD phase fails on the user's Dockerfile, Eigon analyzes the log tail with Claude and surfaces a structured fix proposal (file, line, suggestion, reason, confidence) instead of a wall of stack trace
▸PR previews are real. The webhook used to just record rows in the database; now a 30-second reconciler picks them up and provisions a 'pr-{N}' env with the PR branch and commit. Profile=dev, Spot=on, smallest footprint
▸Multi-region traffic routing. Flag a project multi-region and Eigon keeps a Route53 latency-routed alias in sync across every region where you've deployed. New region's env goes live → record set picks it up automatically. Region's env goes away → record drops
▸Copilot is grounded. Ask 'why did my deploy fail?' and it answers from the actual classified root cause stored against the deployment, not from a guess. Live operational signals (active incidents, spend-cap pause, pending approvals) feed into every assistant prompt
▸Plans are strictly per-user. Owning N organisations means ONE subscription that covers all of them. Schema collapses three legacy rows per user to one and removes the duplicate org-scoped sub the entitlements engine had to pick a primary from
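
The production approval gate in the highlights above can be pictured as a small counting rule. This is a minimal sketch under stated assumptions, not Eigon's implementation: the Approval type, the gateOpen function and the sample names are all hypothetical. It counts approving votes from distinct teammates and ignores any vote cast by the person who started the deploy.

```go
package main

import "fmt"

// Approval is a hypothetical record of one teammate's decision on a deployment.
type Approval struct {
	ApproverID string
	Approved   bool
}

// gateOpen reports whether a deployment may run and how many approvals count.
// Votes by the deployer (self-approval) and repeat votes by the same approver
// are ignored.
func gateOpen(deployerID string, required int, votes []Approval) (bool, int) {
	seen := map[string]bool{}
	count := 0
	for _, v := range votes {
		if v.ApproverID == deployerID || seen[v.ApproverID] {
			continue
		}
		seen[v.ApproverID] = true
		if v.Approved {
			count++
		}
	}
	return count >= required, count
}

func main() {
	votes := []Approval{
		{ApproverID: "alice", Approved: true}, // alice is the deployer here
		{ApproverID: "bob", Approved: true},
		{ApproverID: "carol", Approved: false},
	}
	open, n := gateOpen("alice", 2, votes)
	fmt.Printf("open=%v approvals=%d / 2\n", open, n)
}
```

Returning the count next to the boolean lets a caller build a message in the style of "0 / 2 approvals" without recomputing it.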

Added

+Pre-flight gate (internal/pipelinegate) — sub-second checks for project linkage, BYOC verified, region resolution, GitHub repo + branch reachability with the saved token. Failures are typed (Code, Category PLATFORM_CONFIG | USER_INPUT, Hint) so the UI shows the right remediation
+Retry harness (pkg/awsretry): generic exponential backoff for transient AWS errors. Smithy throttling codes and errors whose message contains a known fragment (i/o timeout, connection reset, rate exceeded) are retried; permission errors pass through unchanged. Wraps CodeBuild StartBuild and SSM PutParameter today
+Deploy outcome classifier (internal/deployoutcome): every pipeline failure is tagged PLATFORM_TRANSIENT | PLATFORM_CONFIG | USER_CODE | USER_INPUT | UNKNOWN with a stable code and ready-to-show hint. Unknown errors land on the platform side so users never get blamed for our gaps. EmitMetric writes a structured log line that CloudWatch scrapes into a platform-failure-ratio counter
+Image-build retry + structured failure surface — CodeBuild PRE_BUILD failures retry once on transient codes; terminal failures get a classified deployment event with category, code, hint
+Production approval policies — migration 0084 with deployment_approvals(deployment, approver, decision, comment) and UNIQUE(deployment, approver) so users can flip their vote by re-submitting. Gate enforced in /run returns 412 with 'deploy gated by approval policy: 0 / 2 approvals' until the threshold is met. Self-approval blocked
+Image-build LLM autofix (internal/imagebuildautofix) — analyzes BuildFailedError.LogTail with the project context and produces a structured Proposal (root_cause, category, file, line, suggestion, reason, confidence) persisted as a deployment event
+PR-preview reconciler (internal/prpreviews) — 30-second loop claims pending pr_previews rows via an atomic MarkBuilding, creates the ephemeral env with profile=dev and use_spot=true, attaches it to the preview row
+Multi-region routing (internal/multiregion) — migration 0085 with projects.multi_region_enabled + multi_region_hostname + multi_region_zone_id and multi_region_routes(project, region, target_dns, target_zone). 5-minute reconciler computes desired (region → ALB DNS) from the latest deployed env per region and writes Route53 ChangeResourceRecordSets UPSERTs in latency-routing mode
+Copilot grounding — gatherDeployments now appends the classified root cause from failure_analyses to every FAILED row; EnvConfig appends 'Operational signals' with the top 3 active incidents (last 24h), spend-cap pause state, and pending approval counts
+ApprovalPanel + ApprovalPolicyCard frontend components. ApprovalPanel renders on every deployment detail page (auto-collapses when not required); ApprovalPolicyCard is added to env settings alongside canary + spending cap
+Env settings page at /dashboard/.../env/[envId]/settings — surfaces the canary config card (was orphaned in the codebase with no page rendering it), spending cap card, approval policy card. Backend canary watcher had been wired end-to-end for months but the toggle was unreachable in the UI
+getUserIdFromToken() helper decodes the JWT uid claim so client components can compare viewer identity without a /users/me round-trip
+Per-user plan model (migration 0083) — collapses each user's user_subscriptions rows to one (highest tier wins), drops UNIQUE(org_id), adds UNIQUE(user_id), nulls org_id. Self-heals so every org owner has a sub. Backend GetByOrg now resolves the org's owner and returns the owner's user-level sub
+Approval API proxies, canary API proxies, env-policy API proxies — Next.js app-router 404'd these previously
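
The retry harness entry above describes exponential backoff that retries transient errors and lets permission errors through. The sketch below shows that general shape only. It is not the pkg/awsretry API: retry, isTransient and the two sample errors are hypothetical names, and only the message-fragment check is modelled (the Smithy throttling codes are left out).

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// Sample errors for the demo; real AWS errors would come from the SDK.
var (
	errThrottled = errors.New("Rate exceeded")
	errDenied    = errors.New("AccessDenied: not authorized")
)

// transientFragments are the message fragments named in the changelog entry.
var transientFragments = []string{"i/o timeout", "connection reset", "rate exceeded"}

// isTransient reports whether err looks retryable based on its message.
func isTransient(err error) bool {
	msg := strings.ToLower(err.Error())
	for _, f := range transientFragments {
		if strings.Contains(msg, f) {
			return true
		}
	}
	return false
}

// retry calls fn up to attempts times, doubling the delay after each
// transient failure. Non-transient errors are returned immediately.
func retry(attempts int, base time.Duration, fn func() error) error {
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = fn(); err == nil || !isTransient(err) {
			return err
		}
		if i < attempts-1 {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return err
}

func main() {
	calls := 0
	err := retry(4, time.Millisecond, func() error {
		calls++
		if calls < 3 {
			return errThrottled // first two calls are throttled
		}
		return nil
	})
	fmt.Println("calls:", calls, "err:", err)
}
```

A production harness would usually add jitter and honour context cancellation; both are omitted here to keep the backoff loop visible.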

Improved

~BYOC enforcement on spending caps — awsops.ScaleEnvironmentToMinimum now decorates context with the project's BYOC target before loading AWS config. Previously cap enforcement on BYOC environments hit Eigon's hosted account and silently failed; the cluster wasn't there so ListServices returned nothing
~ECS task role's SSM permission widened from us-east-1-only to all regions so ap-south-1 deploys (Mumbai is the default region for IN users) can write the DB password to SSM
~CodeBuild project removed from no-NAT private subnets — was the root cause of every aws ecr get-login-password 'exit status 1' failure (3-minute timeout disguised as a docker-login error)
~Nightly E2E smoke uses prebuilt prod Dockerfiles in CI via docker-compose.ci.yml override. The default dev compose runs `air` and compiles Go on a cold mod-cache, which exceeds the /healthz wait every time on a clean runner — the nightly had been red for a week
~Invite emails rewritten as plain transactional messages (no styled CTA button, no dark theme, no marketing copy) with Reply-To set to the inviter so Gmail lands them in Primary instead of Promotions. DKIM, SPF and DMARC already pass on eigon.io; this is the content-side fix
~/invites/{token} accept page added; emails linked there but the page 404'd. Includes a login redirect that preserves ?redirect, plus accept and decline actions that auto-route to /dashboard/org/{orgId}
~First-class audit: spend caps and BYOC end-to-end. New tests cover threshold ladder boundaries, de-dup logic, and the BYOC context annotation itself via a fake TargetHook. Previously there were zero tests on the spendingcaps + awsops packages
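
For readers wondering what "threshold ladder boundaries" and "de-dup logic" mean in the audit entry above, here is a rough sketch. It assumes the 50/80/95/100 percent ladder listed for spending cap alerts under v0.7.1; newlyCrossed and the alerted map are hypothetical and do not mirror the spendingcaps package.

```go
package main

import "fmt"

// ladder is the alert ladder as percent of the cap.
var ladder = []int{50, 80, 95, 100}

// newlyCrossed returns the rungs reached by usedPct that have not been
// alerted on yet, so each rung fires at most once per billing period.
func newlyCrossed(usedPct float64, alerted map[int]bool) []int {
	var out []int
	for _, t := range ladder {
		if usedPct >= float64(t) && !alerted[t] {
			out = append(out, t)
		}
	}
	return out
}

func main() {
	alerted := map[int]bool{50: true}
	fmt.Println(newlyCrossed(96.2, alerted)) // 50 is already alerted, so it is skipped
}
```

The boundary cases worth testing are values just below and exactly on a rung, since the comparison is inclusive.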

v0.10.0

minor 2026-05-04

Highlights

▸Auto fix PR for production crashes. Eigon notices a recurring crash in your service logs, parses the stack trace, asks Claude for a patch, and opens a draft pull request on your GitHub. One click from the dashboard banner
▸Cost dashboard now shows the actual AWS billed amount, pulled daily from Cost Explorer. The number you see matches the bill AWS will charge
▸Public signup is open. Visitors go straight to login and a first deploy without sitting on a waitlist

Added

+POST /api/deployments/:id/auto-fix endpoint orchestrates the full crash to PR pipeline. Extracts a Go, Python, Node, Ruby or Java stack signature from log lines, applies the dedupe policy (3 occurrence floor, 24h cool down, 5 PRs per day per project), fetches source files at the exact deploy commit SHA from GitHub, calls Claude with a focused prompt, then clones and applies the diff and opens a draft PR via the user's GitHub token
+Worker incident scanner ticks every 5 minutes, walks every project, env and service tuple, filters CloudWatch log events on common crash patterns, runs the signature extractor on hits and records incidents with a captured trace sample so the dashboard banner can replay them on click
+GET /api/projects/:id/incidents endpoint backs the recurring crash banner with occurrence count, last seen timestamp, language detected, and any existing fix PR URL
+Recurring crash banner on the deployment detail page with an Open fix PR with Eigon button when a trace has been captured
+Daily cost ingest cron in the controlplane runs cloudcost.IngestEngine against AWS Cost Explorer so cloud_cost_breakdowns is kept warm without manual admin trigger
+Four new shared packages: pkg/incidentdiag (stack trace parser, Postgres dedupe store, Claude and git and GitHub adapters), pkg/cloud/cwlogs (CloudWatch wrapper), pkg/scaletozero (idle service scaler), pkg/dbrollback (data rollback planner)
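
The auto-fix endpoint above applies a dedupe policy with three limits: a 3 occurrence floor, a 24h cool down, and 5 PRs per day per project. A minimal sketch of how such a policy could be evaluated follows. The Incident type, shouldOpenPR and the order of the checks are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"time"
)

// Incident is a hypothetical view of one recurring crash signature.
type Incident struct {
	Occurrences int
	LastFixPR   time.Time // zero value if no fix PR has been opened yet
}

const (
	minOccurrences = 3              // occurrence floor
	coolDown       = 24 * time.Hour // per-signature cool down
	maxPRsPerDay   = 5              // per-project daily budget
)

// shouldOpenPR applies the three limits in order and returns a reason
// whenever the auto-fix is skipped.
func shouldOpenPR(inc Incident, prsToday int, now time.Time) (bool, string) {
	switch {
	case inc.Occurrences < minOccurrences:
		return false, "below occurrence floor"
	case !inc.LastFixPR.IsZero() && now.Sub(inc.LastFixPR) < coolDown:
		return false, "cooling down"
	case prsToday >= maxPRsPerDay:
		return false, "daily PR budget spent"
	}
	return true, ""
}

func main() {
	now := time.Date(2026, 5, 4, 12, 0, 0, 0, time.UTC)
	ok, why := shouldOpenPR(Incident{Occurrences: 4}, 1, now)
	fmt.Println(ok, why)
	ok, why = shouldOpenPR(Incident{Occurrences: 4, LastFixPR: now.Add(-2 * time.Hour)}, 1, now)
	fmt.Println(ok, why)
}
```

Returning the skip reason alongside the decision makes it easy to explain in the dashboard why no PR was opened.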

Improved

~Customer environment cost endpoint prefers actual MTD from cloud_cost_breakdowns over the engine estimate. Response carries a source flag so the UI labels Actual versus Estimate, plus an as_of_date so customers can see how fresh the AWS data is. Falls back to the engine estimate only on brand new environments where no usage has been ingested yet
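
The fallback rule in the entry above (prefer ingested actuals, label the source, fall back to the estimate only when nothing has been ingested) is easy to state in code. This is a sketch under assumptions: CostResponse, pickCost and the amount_usd field name are hypothetical. Only the source flag and as_of_date come from the entry itself.

```go
package main

import "fmt"

// CostResponse carries the amount plus the two labels the entry describes.
type CostResponse struct {
	AmountUSD float64 `json:"amount_usd"`
	Source    string  `json:"source"`     // "actual" or "estimate"
	AsOfDate  string  `json:"as_of_date"` // empty when falling back to the estimate
}

// pickCost prefers ingested month-to-date spend. A nil actual means no usage
// has been ingested yet, so the engine estimate is returned instead.
func pickCost(actual *float64, asOf string, estimate float64) CostResponse {
	if actual != nil {
		return CostResponse{AmountUSD: *actual, Source: "actual", AsOfDate: asOf}
	}
	return CostResponse{AmountUSD: estimate, Source: "estimate"}
}

func main() {
	mtd := 41.27
	fmt.Printf("%+v\n", pickCost(&mtd, "2026-05-03", 55))
	fmt.Printf("%+v\n", pickCost(nil, "", 55))
}
```

Using a pointer for the actual amount keeps "no data yet" distinct from a genuine 0.00 month-to-date figure.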

v0.9.0

minor 2026-04-30

Highlights

▸Process types are first class. Procfile web, worker, clock and release entries each deploy as their own service
▸Release tasks run database migrations once per deploy, before the new web image is promoted
▸Build cache backed by S3 with BuildKit, keyed by lockfile hash
▸Preflight catches AI-generated bugs before they ship: ephemeral SQLite, localhost in production env vars, hardcoded ports, missing dependencies
▸Predictive spend cap shows you'll hit your cap in N days at the current rate, not just an alert at 95 percent
▸Tenant-isolation audit blocks any new endpoint that forgets to verify org membership
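
One of the preflight rules named above, localhost in production env vars, could be approximated with a plain string scan. The sketch below is illustrative only: Finding, checkLocalhost, the rule name and the hint text are hypothetical, and a real rule would likely restrict itself to connection-string variables.

```go
package main

import (
	"fmt"
	"strings"
)

// Finding is a hypothetical preflight result for the deployment timeline.
type Finding struct {
	Rule string
	Key  string
	Hint string
}

// checkLocalhost flags env vars whose value points at the local machine,
// which will not reach a managed database or cache from a deployed container.
// Map iteration order is unspecified, so the order of findings is too.
func checkLocalhost(env map[string]string) []Finding {
	var out []Finding
	for k, v := range env {
		lv := strings.ToLower(v)
		if strings.Contains(lv, "localhost") || strings.Contains(lv, "127.0.0.1") {
			out = append(out, Finding{
				Rule: "localhost-in-env",
				Key:  k,
				Hint: "point " + k + " at the managed service host",
			})
		}
	}
	return out
}

func main() {
	env := map[string]string{
		"DATABASE_URL": "postgres://app@localhost:5432/app",
		"PORT":         "8080",
	}
	for _, f := range checkLocalhost(env) {
		fmt.Printf("%s: %s (%s)\n", f.Rule, f.Key, f.Hint)
	}
}
```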

Added

+Procfile parser fans a single repo with web, worker, clock and release lines into one service per process type with the right network shape. Workers and crons get no public IP, no ALB rule and no HTTP healthcheck
+Release tasks render as ECS TaskDefinitions with no long-running Service; the worker calls RunTask between Sceptre phases and aborts the deploy on a non-zero exit so a broken migration never overtakes a new web image
+BuildKit S3 cache backend keyed by lockfile hash, so the dependency-install layer survives across services and across deploys including a service's first build; falls back transparently to the previous ECR latest cache when the bucket is unset
+Cost prediction endpoint at GET /environments/:envId/cost/prediction returning days until cap, severity bucket and a one-line recommendation calculated from a 7-day rolling rate
+Preflight check framework with four rules at launch: ephemeral SQLite path, localhost in connection-string env vars, hardcoded listen port without process.env.PORT, hallucinated imports not present in package.json
+Static AST audit that fails CI when a new HTTP handler extracts a tenant-scoped URL parameter without verifying organisation membership
+Pluggable preflight phase in the deploy pipeline that persists each finding to the deployment timeline so users see the fix hint before the build runs
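
The Procfile fan-out described above (one service per process type, with only web exposed publicly and release treated as a one-off task) can be sketched as follows. Service, parseProcfile and the field names are hypothetical; the parser handles only the basic "name: command" form.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// Service is a hypothetical deployable unit derived from one Procfile line.
type Service struct {
	Name    string
	Command string
	Public  bool // only web gets a public IP, ALB rule and HTTP healthcheck
	OneShot bool // release runs once per deploy, not as a long-running service
}

// parseProcfile turns "name: command" lines into services, skipping blank
// lines and comments.
func parseProcfile(src string) ([]Service, error) {
	var out []Service
	sc := bufio.NewScanner(strings.NewReader(src))
	for sc.Scan() {
		line := strings.TrimSpace(sc.Text())
		if line == "" || strings.HasPrefix(line, "#") {
			continue
		}
		name, cmd, ok := strings.Cut(line, ":")
		if !ok || strings.TrimSpace(cmd) == "" {
			return nil, fmt.Errorf("invalid Procfile line: %q", line)
		}
		name = strings.TrimSpace(name)
		out = append(out, Service{
			Name:    name,
			Command: strings.TrimSpace(cmd),
			Public:  name == "web",
			OneShot: name == "release",
		})
	}
	return out, sc.Err()
}

func main() {
	procfile := "web: npm start\nworker: node worker.js\nrelease: npm run migrate\n"
	svcs, err := parseProcfile(procfile)
	if err != nil {
		panic(err)
	}
	for _, s := range svcs {
		fmt.Printf("%-8s public=%-5v oneshot=%-5v %s\n", s.Name, s.Public, s.OneShot, s.Command)
	}
}
```

Splitting on the first colon only means commands that contain colons, such as a host:port argument, survive intact.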

Improved

~Worker private subnets are read from the network stack output or the warm-base plan, with an explicit error when neither resolves a subnet so misconfigured release tasks surface instead of silently running in the wrong VPC
~Sceptre launch is now phased: phase one creates infrastructure plus release task definitions, then RunTask runs migrations, then phase two updates services. Old behaviour was a single launch that could promote a new image against an unmigrated database
~Service templates branch on type to drop public IP, listener rules and inbound security group rules for worker and cron services, removing unnecessary attack surface
~Cost prediction returns a non-NaN, deterministic result on day one of the month and on environments with no cost history yet, so the dashboard never has to special-case empty state
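
The day-one behaviour in the last entry matters because a rolling rate over an empty history is a division by zero. The sketch below shows one way to keep the result deterministic. It assumes the 7-day rolling rate mentioned for the prediction endpoint; daysUntilCap and the -1 sentinel for "no trend yet" are hypothetical choices.

```go
package main

import (
	"fmt"
	"math"
)

// daysUntilCap estimates how many days remain before month-to-date spend
// reaches the cap, using the mean of the last (up to) 7 daily costs.
// It returns 0 when the cap is already reached and -1 when there is no
// spend history to extrapolate from.
func daysUntilCap(capUSD, mtdUSD float64, daily []float64) int {
	if mtdUSD >= capUSD {
		return 0
	}
	if len(daily) > 7 {
		daily = daily[len(daily)-7:]
	}
	sum := 0.0
	for _, d := range daily {
		sum += d
	}
	if len(daily) == 0 || sum <= 0 {
		return -1 // guards against 0/0 (NaN) and x/0 (Inf)
	}
	rate := sum / float64(len(daily))
	return int(math.Ceil((capUSD - mtdUSD) / rate))
}

func main() {
	fmt.Println(daysUntilCap(100, 40, []float64{10, 10, 10}))
	fmt.Println(daysUntilCap(100, 0, nil))
}
```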

v0.8.0

minor 2026-04-08

Highlights

▸WAF on every production environment by default
▸Monthly bandwidth ceilings with the same alert ladder as spending caps
▸Self-trained failure classifier that learns from resolved incidents
▸Multi-change what-if simulator for cost and risk planning
▸Compliance report generator aligned with SOC 2, GDPR and HIPAA

Added

+WAF toggle per environment, on by default for production profile, with PATCH endpoint for changing it later
+Monthly bandwidth ceiling in GB per environment with 50/80/95/100 threshold emails and Slack or Discord alerts
+Failure classifier that mines resolved failure analyses every six hours and returns a bucket plus confidence for new unknown errors
+POST endpoint for multi-change what-if simulations that sums cost deltas across up to twenty proposed changes and surfaces the worst risk level
+Compliance report endpoint that audits each environment against eighteen technical controls and returns a pass, fail, warning or not applicable verdict with cited evidence for each
+Public environment health badge served as an SVG at /environments/:id/badge.svg so customers can embed status in their README
+Dynamic OpenGraph image generator for the root of the site and for every blog post, producing branded social cards at build time
+VS Code extension scaffold with deploy, status and logs commands, buildable from cli/vscode-extension
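
The what-if endpoint above sums cost deltas across up to twenty proposed changes and surfaces the worst risk level. A minimal sketch of that aggregation follows. Change, Risk, the three risk names and simulate are assumptions for illustration and are not the endpoint's real schema.

```go
package main

import (
	"errors"
	"fmt"
)

// Risk levels are ordered so that a larger value is worse.
type Risk int

const (
	RiskLow Risk = iota
	RiskMedium
	RiskHigh
)

// Change is one hypothetical proposed change with its estimated effect.
type Change struct {
	Name      string
	CostDelta float64 // USD per month, negative for savings
	Risk      Risk
}

const maxChanges = 20

var errTooMany = errors.New("too many changes in one simulation")

// simulate sums the cost deltas and reports the worst risk across all changes.
func simulate(changes []Change) (float64, Risk, error) {
	if len(changes) > maxChanges {
		return 0, RiskLow, errTooMany
	}
	total, worst := 0.0, RiskLow
	for _, c := range changes {
		total += c.CostDelta
		if c.Risk > worst {
			worst = c.Risk
		}
	}
	return total, worst, nil
}

func main() {
	total, worst, err := simulate([]Change{
		{Name: "downsize cache", CostDelta: -30, Risk: RiskMedium},
		{Name: "add read replica", CostDelta: 55.5, Risk: RiskLow},
	})
	fmt.Println(total, worst, err)
}
```

Taking the maximum risk rather than an average means one dangerous change cannot be hidden among several safe ones.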

Improved

~Orphan resource scanner now distinguishes hard and soft orphans, scans ECR, CloudWatch log groups, CloudFront distributions and unattached target groups, and runs hourly instead of daily
~Canary repo no longer writes to non-existent deployment state values; rollback now flips only the deployment status enum
~PR preview coordinator attributes webhook-triggered deploys to the organisation owner so the NOT NULL user column is satisfied
~AWS scale-down helper no longer passes an invalid scalable dimension filter to DescribeScalableTargets, which previously blocked the autoscaler pin step silently
~GitHub webhook handler is now a single consolidated endpoint with HMAC verification, PR preview dispatch and PR merge redeploy all in one place
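
HMAC verification for GitHub webhooks, mentioned in the last entry, follows a well-known pattern: GitHub sends an X-Hub-Signature-256 header containing "sha256=" followed by the hex HMAC-SHA256 of the raw request body. The sketch below shows that check with the Go standard library. The sign and verify helpers are hypothetical names, and how Eigon's handler is actually structured is not stated in the changelog.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

const sigPrefix = "sha256="

// sign returns the header value a sender would compute for body.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return sigPrefix + hex.EncodeToString(mac.Sum(nil))
}

// verify checks an X-Hub-Signature-256 header value against the raw body
// using a constant-time comparison.
func verify(secret, body []byte, header string) bool {
	if !strings.HasPrefix(header, sigPrefix) {
		return false
	}
	got, err := hex.DecodeString(strings.TrimPrefix(header, sigPrefix))
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("webhook-secret")
	body := []byte(`{"action":"opened","number":1}`)
	header := sign(secret, body)
	fmt.Println(verify(secret, body, header))
	fmt.Println(verify(secret, []byte(`{"action":"closed"}`), header))
}
```

The signature must be computed over the raw bytes as received, before any JSON decoding, or a re-serialised body will fail verification.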

v0.7.1

minor 2026-04-07

Highlights

▸Canary deploys with automatic rollback
▸Email alerts for spending cap thresholds
▸Real graceful degradation when an environment hits its monthly cap

Added

+Canary monitoring window on every environment, configurable from the dashboard
+Background watcher that polls error rate after every deploy and rolls back automatically when the threshold is breached
+Configurable rollback threshold and minimum request count so quiet environments do not trigger noisy rollbacks
+Spending cap email alerts at 50, 80, 95 and 100 percent with branded HTML templates
+Celebratory email on the first ever successful deploy for each organisation
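
The canary entries above combine two knobs: a rollback threshold and a minimum request count so quiet environments do not trigger noisy rollbacks. A sketch of the resulting decision for a single polling sample is below. CanaryConfig, its field names and shouldRollback are hypothetical.

```go
package main

import "fmt"

// CanaryConfig holds the two knobs described in the changelog.
type CanaryConfig struct {
	MaxErrorRate float64 // for example 0.05 for 5 percent
	MinRequests  int     // below this, the sample is too small to judge
}

// shouldRollback decides from one polling sample whether to roll back.
// Samples smaller than MinRequests never trigger a rollback.
func shouldRollback(cfg CanaryConfig, requests, failed int) bool {
	if requests <= 0 || requests < cfg.MinRequests {
		return false
	}
	return float64(failed)/float64(requests) > cfg.MaxErrorRate
}

func main() {
	cfg := CanaryConfig{MaxErrorRate: 0.05, MinRequests: 100}
	fmt.Println(shouldRollback(cfg, 40, 20))  // half failing, but too few requests
	fmt.Println(shouldRollback(cfg, 500, 60)) // 12 percent over a large sample
}
```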

Improved

~Graceful degradation now actually scales every service down to its minimum and pins the autoscaler so it cannot fight back
~Hard stop now scales the affected services fully offline instead of only marking the environment paused
~Deploy completion is funnelled through a single transition point so future reliability features can plug in cleanly

v0.7.0

minor 2026-04-07

Highlights

▸Programmatic access via API tokens and the eigon CLI
▸GitHub Action for one step deploys from CI
▸Terraform export for any environment

Added

+API tokens, scoped and revocable, managed from Account settings
+eigon CLI for deploys, status checks, log streaming and flag management
+GitHub Action that runs a deploy and waits for it to finish
+Spending cap card on every environment with live month to date usage
+Slack, Discord and generic webhook integrations for deploy and cap events
+Feature flags with per environment rollout percentages
+Terraform export bundle covering network, compute, database, cache, CDN and WAF
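
Percentage rollouts for feature flags are commonly implemented by hashing a stable identifier into a bucket from 0 to 99. The changelog does not say how Eigon does it, so the sketch below is a generic illustration of that technique with hypothetical names (bucket, enabled) and FNV-1a chosen arbitrarily as the hash.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// bucket maps a (flag, subject) pair to a stable number in [0, 100).
func bucket(flag, subject string) uint32 {
	h := fnv.New32a()
	h.Write([]byte(flag + ":" + subject))
	return h.Sum32() % 100
}

// enabled reports whether the flag is on for subject at the given rollout
// percentage. The same subject always lands in the same bucket, so raising
// the percentage only ever adds subjects.
func enabled(flag, subject string, percent uint32) bool {
	return bucket(flag, subject) < percent
}

func main() {
	for _, user := range []string{"u-1", "u-2", "u-3", "u-4"} {
		fmt.Println(user, enabled("new-checkout", user, 50))
	}
}
```

Including the flag name in the hashed key keeps different flags from always selecting the same subset of subjects.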

Improved

~Auth layer accepts both browser sessions and long lived API tokens
~Spending cap reconciliation runs hourly and triggers graceful degradation at the limit
~Recommendations engine surfaces cost and reliability suggestions on the environment page

v0.6.0

minor 2026-04-04

Highlights

▸ZIP upload as a first class source
▸More resilient framework detection
▸Environment lifecycle management from the dashboard

Added

+Drag and drop ZIP upload as an alternative to a GitHub repository
+Atomic source switching between GitHub and ZIP, enforced at the data layer
+Versioned source storage so concurrent uploads cannot clash
+Delete environment flow with a type to confirm modal
+Admin cleanup page for orphaned cloud resources
+First deploy success modal with next step links

Improved

~Framework and runtime detection now falls back when source signals are weak
~Environment teardown is more robust and reports clearer status
~Region detection picks the right default on the very first deploy
~Background graph and metadata writes no longer contend with the request path

v0.5.0

minor 2026-03-28

Highlights

▸Cost analysis with breakdown and suggestions
▸Eigon Copilot for environment level questions
▸Region selector with latency and cost scoring

Added

+Cost analysis page with category breakdown and one click apply for suggestions
+Eigon Copilot chat panel on every environment
+Region selector modal with latency and cost scoring
+Daily orphan resource scanner
+Plain language tooltips across the dashboard

Improved

~Deployment progress rail explains every phase in plain English
~Failure analysis panel lists likely causes and next steps
~Infrastructure graph uses friendly node names instead of cloud jargon
~Logs page filters now have hover explanations for each log level

v0.4.0

minor 2026-03-21

Highlights

▸Custom domains with auto SSL
▸Insights engine for performance and scaling
▸Self healing for failed nodes

Added

+Custom domain setup with CNAME instructions and one click verify
+Auto provisioned SSL certificates that renew on their own
+Self heal endpoint for failed nodes with safety guards
+Drift detection comparing template against deployed state
+Sizing recommendations driven by 30 days of metrics

v0.3.0

minor 2026-03-14

Highlights

▸Multi region support
▸Production profile with WAF and CDN
▸Readiness scoring engine

Added

+Four production regions across North America, Europe and Asia
+Production profile that auto enables WAF and CDN
+Readiness scoring engine with rule based checks and one click fix packs
+Suggested code patches for the most common deploy failures

v0.2.0

minor 2026-03-07

Highlights

▸Automated failure analysis
▸Fix packs for common deploy errors
▸First class variables and secrets

Added

+Failure analysis that classifies build and runtime errors and proposes fixes
+Auto applicable fix packs for low risk issues
+Variables page with bulk import from .env files
+Encrypted secret storage with per environment isolation

v0.1.0

beta 2026-02-28

Highlights

▸First public beta
▸GitHub deploy flow
▸Managed cloud infrastructure

Added

+GitHub OAuth and repository selection wizard
+Framework auto detection across 45 plus stacks
+Managed container deployments with a load balancer in front
+Real time deployment events streamed to the dashboard
+Basic dashboard with logs and metrics

Building something with Eigon?

We would love to hear about it. Drop us a line at info@eigon.io.