AI/ML Product Engineering at Tessact
Promoted for architecting cost-efficient ML services. Building the company's flagship AI/ML platform as sole engineer, covering system architecture, LLM orchestration, full-stack development, and cloud deployment.
Tech Stack
Python
TypeScript
FastAPI
Django
PydanticAI
React
Next.js
Remotion
OpenAI GPT-5
Google Gemini 2.5
InsightFace
ElevenLabs
Docker
GCP Cloud Run
AWS Lambda
Celery
RabbitMQ
PostgreSQL
Redis
GitHub Actions
Sentry
AI Video Repurposing Platform
- Built end-to-end product as sole engineer with full ownership: system architecture, AI/LLM orchestration, React frontend, Django backend, ML service deployment, and production scaling.
- Architected microservices system: TessactAI service (FastAPI + PydanticAI) for LLM orchestration, Django backend with Celery/RabbitMQ job queue, React/Next.js frontend, and Remotion rendering service on AWS Lambda.
- Engineered a parallel processing pipeline that turns 2-hour podcasts into 30–40 branded short-form clips in under 1 hour, roughly 200× faster than manual editing (about 4 weeks of work reduced to 1 hour).
- Achieved 95% ready-to-post quality (measured via client feedback), eliminating manual editing in most cases.
- Processed 500+ hours of podcast content since the closed beta launch (Jan 31, 2026). The platform drives the company's public launch (Feb 2026) at $20/month pricing.
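A minimal sketch of the fan-out idea behind the parallel pipeline, assuming each clip can be processed independently. The names `ClipSpec`, `render_clip`, and `process_episode` are hypothetical, and a standard-library thread pool stands in for the Celery/RabbitMQ job queue used in production.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipSpec:
    """Hypothetical description of one clip cut from a long episode."""
    clip_id: int
    start_s: float
    end_s: float


def render_clip(spec: ClipSpec) -> str:
    """Stand-in for the real per-clip work (cutting, captions, branding)."""
    duration = spec.end_s - spec.start_s
    return f"clip-{spec.clip_id}:{duration:.0f}s"


def process_episode(specs: list[ClipSpec], max_workers: int = 8) -> list[str]:
    """Fan clips out to a worker pool; results come back in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(render_clip, specs))


if __name__ == "__main__":
    specs = [ClipSpec(i, i * 60.0, i * 60.0 + 45.0) for i in range(3)]
    print(process_episode(specs))
```

Because clips do not depend on each other, total wall-clock time is bounded by the slowest clip and the number of workers rather than by the sum of all clip times.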
LLM Orchestration & AI Decision-Making
- Built PydanticAI-based LLM orchestration across three features: clip selection from transcripts, speaker name resolution (Speaker 1 → 'John Doe'), and AI-generated enhancement timing.
- A/B tested OpenAI GPT-4/5 against Google Gemini 2.5 Flash, then switched to Gemini for better quality at 60% lower cost.
- Designed structured LLM prompts with effect catalogs, timing constraints, and confidence scoring for production-ready outputs.
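A rough sketch of how an effect catalog, timing constraints, and confidence scoring might be enforced on LLM output before rendering. Everything here is an assumption for illustration: the catalog entries, the `MIN_GAP_S` spacing rule, the 0.7 confidence threshold, and the `Enhancement` shape are hypothetical, and plain dataclasses stand in for the PydanticAI models used in the actual service.

```python
from dataclasses import dataclass

ALLOWED_EFFECTS = {"zoom_in", "caption_pop", "b_roll"}  # hypothetical effect catalog
MIN_GAP_S = 2.0  # hypothetical minimum spacing between two effects, in seconds


@dataclass(frozen=True)
class Enhancement:
    """One enhancement suggested by the model for a clip."""
    effect: str
    at_s: float        # timestamp within the clip, in seconds
    confidence: float  # model-reported score in [0, 1]


def validate(items: list[Enhancement], clip_len_s: float,
             min_conf: float = 0.7) -> list[Enhancement]:
    """Keep only suggestions that satisfy the catalog, timing, and confidence rules."""
    kept: list[Enhancement] = []
    last_at: float | None = None
    for item in sorted(items, key=lambda e: e.at_s):
        if item.effect not in ALLOWED_EFFECTS:
            continue  # effect is not in the catalog
        if not 0.0 <= item.at_s <= clip_len_s:
            continue  # timestamp falls outside the clip
        if item.confidence < min_conf:
            continue  # model was not confident enough
        if last_at is not None and item.at_s - last_at < MIN_GAP_S:
            continue  # too close to the previously accepted effect
        kept.append(item)
        last_at = item.at_s
    return kept
```

Filtering after generation, rather than trusting the prompt alone, keeps out-of-catalog or badly timed suggestions from reaching the renderer.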
Cost Engineering & Infrastructure
- Reduced transcription costs by 72%: migrated from AWS Transcribe ($1.44/hr) to ElevenLabs Scribe v2 ($0.40/hr) while improving accuracy (10% lower word error rate).
- Deployed Remotion-based video generation on AWS Lambda with parallel rendering — renders 30-second clips in ~30 seconds.
- Containerized all services with Docker, deployed on GCP Cloud Run and AWS Lambda for auto-scaling.
- Delivered platform at $5/hour processing cost vs $1,000+ market price — 200× cost advantage.
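The cost figures above can be checked with a few lines of arithmetic. This is only a worked calculation using the prices quoted in the bullets; the helper names are hypothetical.

```python
def percent_reduction(old: float, new: float) -> float:
    """Percentage saved when a unit cost drops from `old` to `new`."""
    return (old - new) / old * 100.0


def cost_multiple(market: float, ours: float) -> float:
    """How many times cheaper `ours` is than `market`."""
    return market / ours


if __name__ == "__main__":
    # Transcription: $1.44/hr down to $0.40/hr
    print(round(percent_reduction(1.44, 0.40)))  # 72
    # Processing: $1,000 market price vs $5/hour
    print(cost_multiple(1000.0, 5.0))  # 200.0
```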
DevOps & Engineering Practices
- Set up CI/CD pipeline with GitHub Actions for automated testing, linting, and deployment.
- Implemented comprehensive test suites using pytest for backend services and Jest for frontend.
- Monitored production with Sentry, maintaining 97%+ crash-free sessions; enforced code quality with pre-commit hooks and Ruff.
- Used Claude Code and GitHub Copilot for faster development cycles while maintaining code quality.