Work

Projects

Infrastructure work, data systems, and engineering challenges — with the metrics that actually matter.

Data, Backend, DevOps, Featured

Real-Time Event Pipeline

Legacy batch pipeline couldn't handle peak traffic spikes, causing 4-6 hour data delays in downstream ML models.

Role: Lead Data Engineer

Apache Kafka, Apache Flink, Python, AWS MSK, PostgreSQL

Impact

Handled 12M+ events/day

Reduced end-to-end latency from 4h to 90s

99.97% uptime over 6 months

ML, Data, Backend, Featured

ML Feature Store

Data science teams were duplicating feature engineering logic across 8+ models, causing inconsistencies and wasted compute.

Role: Backend & ML Infrastructure Engineer

Python, Feast, Redis, Spark, AWS S3, Airflow

Impact

Reduced feature computation time by 63%

Unified 40+ features across 8 models

Cut model deployment time by half

Data, DevOps, Featured

Data Warehouse Migration

A 10TB legacy Redshift warehouse had to be migrated to Snowflake with zero downtime and full historical parity.

Role: Data Platform Engineer

dbt, Python, Snowflake, AWS Glue, Terraform, GitHub Actions

Impact

Migrated 10TB+ of data with zero data loss

Reduced query costs by 41%

Zero-downtime migration

Data, DevOps

ETL Orchestration Platform

Fragmented cron-based ETL scripts had no observability, retries, or dependency management, and failed silently.

Role: Data Engineer

Apache Airflow, Python, Docker, AWS ECS, PostgreSQL

Impact

Orchestrated 120+ DAGs in production

Reduced pipeline failures by 78%

Full lineage and alerting coverage

AI, Backend, Data

LLM-Powered Data Extractor

Unstructured documents (PDFs, emails, reports) contained key business data that was inaccessible for analysis.

Role: AI & Data Engineer

Python, OpenAI API, LangChain, FastAPI, PostgreSQL

Impact

Extracted structured data from 50K+ documents

91% accuracy on key entity extraction

Saved ~300 manual hours/month

See the infrastructure I've built.

Interested in working together or learning more about a specific project?

Get in touch