Releasing an Alexa Skill Beta: Security Audits, Tests, and an Automated Release Workflow

[Figure: Alexa skill release workflow - security scan, build, test, and deploy pipeline diagram]

Today I shipped v0.9.0-beta of Care Log, an Alexa skill that lets caregivers voice-log health data for family members. But this post isn’t about the skill itself—it’s about the Alexa skill release workflow I used to get from “code complete” to “tag pushed” with confidence. The process involved a security audit, a broken virtual environment, 74 passing tests, and a multi-agent release pipeline.

What We Were Releasing

Care Log is a Python-based Alexa skill running on AWS Lambda with DynamoDB, S3, and SES. It handles health data—vital signs, symptoms, activities—for multiple household members. Given the sensitivity of the data, I wanted a structured release process, not just “git tag and pray.”

The skill was feature-complete with 25+ intents, PDF report generation, email verification, and GDPR/CCPA compliance built in. Time to ship a beta.

The Alexa Skill Release Workflow

I used a structured release workflow that chains multiple specialized agents in sequence: Security Audit (aegis) → E2E Tests (atlas) → Final Review (review-agent) → Version Bump (herald) → Release Notes (scribe). Each phase must pass before the next runs. A critical security vulnerability or failing tests blocks the release.
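As a sketch of the gating logic, here's a minimal runner (the phase functions are illustrative stubs, not the real agents, which do much more than return a status):

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class PhaseResult:
    passed: bool
    summary: str

# Stub runners standing in for the real agents; each would shell out to
# its tool (security scanner, pytest, etc.) and map the exit status.
def run_security_audit() -> PhaseResult:
    return PhaseResult(passed=True, summary="0 critical findings")

def run_e2e_tests() -> PhaseResult:
    return PhaseResult(passed=True, summary="74 passed")

PHASES: list[tuple[str, Callable[[], PhaseResult]]] = [
    ("security-audit (aegis)", run_security_audit),
    ("e2e-tests (atlas)", run_e2e_tests),
    # ... review, version bump, and release notes follow the same shape
]

def run_release() -> None:
    for name, phase in PHASES:
        result = phase()
        status = "PASS" if result.passed else "FAIL"
        print(f"{name}: {status} - {result.summary}")
        if not result.passed:
            # A failing phase blocks everything downstream.
            raise SystemExit(f"Release blocked at: {name}")
    print("All phases passed - ready to tag")

if __name__ == "__main__":
    run_release()
```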

Phase 1: Security Audit

The aegis agent scanned for OWASP Top 10 vulnerabilities, hardcoded credentials, and AWS misconfigurations. Results: 0 critical, 2 high (infrastructure configs), 4 medium, 4 low. Good news: no hardcoded secrets, no SQL injection, no command injection, and user IDs are SHA-256 hashed before storage.
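That last point is a cheap, high-value pattern. Here's a minimal sketch (the salt handling is my assumption; the audit only establishes that user IDs are SHA-256 hashed before storage):

```python
import hashlib

def hash_user_id(raw_user_id: str, salt: str) -> str:
    """Derive a stable pseudonymous key from the Alexa userId."""
    return hashlib.sha256((salt + raw_user_id).encode("utf-8")).hexdigest()

# The digest, not the raw Alexa userId, becomes the storage key.
key = hash_user_id("amzn1.ask.account.EXAMPLE", salt="per-deployment-salt")
```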

Phase 2: The Broken Virtual Environment

Running tests revealed a problem I didn't expect. The venv existed but contained no packages, and its interpreter path pointed at a different project (alexa-medical-logger-skill instead of alexa-care-log-skill). The venv had been created before the project was renamed, and the rename broke its internal paths and symlinks. The fix was simple: python3 -m venv .venv --clear, then reinstall dependencies.
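If you hit the same thing, the diagnosis and fix look roughly like this (the old path shown is from my machine, and installing from a requirements.txt is an assumption about your setup):

```bash
# Symptom: console-script shebangs still point at the old directory name
head -n1 .venv/bin/pip
#=> #!/.../alexa-medical-logger-skill/.venv/bin/python

# Fix: rebuild the venv in place, then reinstall
python3 -m venv .venv --clear
source .venv/bin/activate
pip install -r requirements.txt
```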

Phase 3: 74 Tests, All Passing

With the venv fixed, tests ran cleanly: 74 passed, 28 warnings in 3.68s. The breakdown:

- DynamoDB integration: 12 tests
- Household utilities: 6 tests
- Keyword extraction: 12 tests
- Threshold validation: 21 tests
- Time parsing: 23 tests

The 28 warnings were all datetime.utcnow() deprecations—cosmetic, not blocking.
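Those warnings come from Python 3.12 deprecating the naive datetime.utcnow(); the eventual fix is a one-liner per call site:

```python
from datetime import datetime, timezone

# Deprecated since Python 3.12, and returns a naive datetime:
# timestamp = datetime.utcnow()

# Preferred: a timezone-aware UTC timestamp.
timestamp = datetime.now(timezone.utc)
print(timestamp.isoformat())  # e.g. 2026-02-03T10:15:00+00:00
```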

Phase 4: Release Review

The review-agent synthesized the security and test results. For a beta release, the two HIGH infrastructure findings were acceptable: S3 CORS configuration doesn't matter when access is controlled via signed URLs, and SES sandbox mode already limits recipients to verified addresses. Verdict: RELEASE_APPROVED.

Phase 5: Version Bump and Documentation

The herald agent created src/__version__.py for version tracking and a CHANGELOG.md following the Keep a Changelog format, and updated template.yaml to carry the version in its description and an environment variable. The scribe agent generated comprehensive release notes covering features, known issues, deployment steps, and the future roadmap.
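A version module like this can be just a few lines; the fields below are my guess at the shape, not the file's actual contents:

```python
# src/__version__.py
# Minimal sketch; the released file's exact fields may differ.
__version__ = "0.9.0-beta"
__version_info__ = (0, 9, 0, "beta")
```

The Lambda code (and the template.yaml environment variable) can then reference a single source of truth for the version string.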

The Git Workflow

With all phases complete, the actual release was straightforward: stage and commit release files, create an annotated tag, push to origin with tags, then create a GitHub release from the release notes using the gh CLI.
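In command form, that sequence looks roughly like this (the staged file list and the release-notes filename are illustrative assumptions):

```bash
git add src/__version__.py CHANGELOG.md template.yaml
git commit -m "chore(release): v0.9.0-beta"
git tag -a v0.9.0-beta -m "Care Log v0.9.0-beta"
git push origin main --tags

# gh can create the GitHub release straight from a notes file
gh release create v0.9.0-beta --prerelease \
  --title "Care Log v0.9.0-beta" \
  --notes-file RELEASE_NOTES.md
```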

What I Learned

1. Virtual Environments Break in Surprising Ways: Renaming a project directory doesn't update venv symlinks. The --clear flag on python -m venv is your friend when things get weird.

2. Security Audits Surface Infrastructure Issues: Both HIGH findings were AWS SAM template configs, not application code. A security audit isn’t just about SQL injection—it’s about the full deployment surface.

3. Beta ≠ Careless: Even for a beta, I wanted documented security posture and 100% test pass rate. The known issues are documented and acceptable for scope, not ignored.

4. Structured Release Workflows Catch Problems: The broken venv would have been caught earlier if I’d run tests before the release workflow. But the workflow’s explicit “run tests” phase surfaced it cleanly rather than letting it hide until production.

The Care Log Skill

For context on what was actually released:

Purpose: Voice-first health logging for family caregivers.
Stack: Python 3.11, Lambda, DynamoDB, S3, SES.
Features: Vital signs, symptoms, activities, seizures, PDF reports, GDPR compliance.
Tests: 74 passing (unit + integration with moto).
Status: Beta—suitable for personal testing, not HIPAA certified.

The skill itself is a story for another post. Today was about the release process.
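One closing footnote on that test stack: the integration tests run against moto's in-memory AWS mocks, so nothing touches a real account. A minimal sketch of the pattern (the table name, key, and item fields are invented for illustration, not the skill's actual schema):

```python
import boto3
from moto import mock_aws  # moto >= 5; earlier versions expose mock_dynamodb

@mock_aws
def test_log_entry_roundtrip():
    # Everything below runs against moto's in-memory DynamoDB.
    ddb = boto3.resource("dynamodb", region_name="us-east-1")
    table = ddb.create_table(
        TableName="care-log",
        KeySchema=[{"AttributeName": "user_hash", "KeyType": "HASH"}],
        AttributeDefinitions=[{"AttributeName": "user_hash", "AttributeType": "S"}],
        BillingMode="PAY_PER_REQUEST",
    )
    table.put_item(Item={"user_hash": "abc123", "entry": "bp 120/80"})
    assert table.get_item(Key={"user_hash": "abc123"})["Item"]["entry"] == "bp 120/80"
```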
