Projects
Some freelance jobs and personal projects I've built over the years.
Resume
Senior Machine Learning Engineer – CVS Health
July 2024 – July 2025
- Led the design and deployment of robust, production-grade ML systems to detect and block botnet web traffic in real time, training on 2 TB of data and building scalable, maintainable ML pipelines.
- Developed and optimized a denoising variational autoencoder for anomaly detection, backed by rigorous unit and integration testing.
- Engineered custom tokenization and transformer-based embedding pipelines to convert sparse web traffic data into dense, model-ready tensors, applying a DSP-style approach to data processing.
- Championed MLOps best practices by automating model deployment, monitoring, and retraining workflows for continuous production quality.
- Collaborated cross-functionally with engineering and product teams to integrate ML features into production systems, supporting ongoing feature development and bug triage.
AI Engineer – Parker-Hannifin
February 2024 – July 2024
- Led end-to-end development of a retrieval-augmented generation (RAG) chatbot platform, architecting Python-based backend systems for scalable, production-ready deployment on Azure.
- Integrated ML models and MongoDB vector database systems into enterprise environments.
- Instituted CI/CD automation and code quality standards, mentoring junior engineers and fostering a culture of maintainable, high-quality software development.
Senior Data Scientist – PlanetX
November 2022 – July 2023
- Tripled the number of products available on the PlanetX website by improving record-linking logic with semantic NLP and fuzzy text-matching methods.
- Built scraper bots with Selenium and Python to source alternative data and link unmatched brands to companies.
- Expanded machine learning infrastructure on AWS EC2 and S3 services.
- Wrote ad hoc SQL queries to investigate core model statistics and fine-tune scoring distributions.
- Worked closely with Product and Engineering teams to identify and fix anomalies and address UX feedback.
Data Scientist – MUFG Investor Services
November 2019 – November 2022
- Investigated feasibility of forecasting investor capital movements.
  - Used dbt to build data warehouse backend ETL processes in AWS.
  - Wrote SQL queries to obtain raw data; cleaned and structured data to be used to train machine learning models.
  - Built model training suite to rapidly train and track performance of hundreds of models.
  - Used Bayesian optimization to tune LightGBM model hyperparameters.
  - Achieved a lift value greater than 5.0 in targeting accounts likely to redeem their investment.
- Developed ESG reporting product to expand core fund administration business into rapidly growing ESG sector.
  - Designed and led development of backend data architecture as well as frontend Power BI user interface.
  - Planned product roadmap toward further business cases given MUFG's unique position as a fund administrator.
  - Data backend of the ESG product was adopted as the basis for a new company-wide data architecture.
- Expedited Ops Team account deduplication project by 10x using fuzzy matching to identify duplicate accounts.
- Developed a proof-of-concept system to extract information from invoices to reduce invoice processing workload.
  - Parsed PDF files to obtain raw text; trained spaCy NER (Named Entity Recognition) models to process text.
  - Established performance baseline and foundation of project for further development.
Senior Risk Consultant – Process Risk, LLC
April 2016 – November 2019
- Facilitated Process Hazard Analysis studies and Management of Change discussions.
- Led teams of three to ten engineers and operators, serving as project consultant and client point of contact for projects at industrial chemical process complexes across the U.S.
- Cut project lifecycle time by 50% by writing Visual Basic macros to automate processes and deliverable creation.
Data Science Fellow – Metis
July 2018 – December 2018
Metis is a highly selective, immersive, full-time bootcamp in which I developed several data science projects, including:
- Time series analysis of federal economic data to forecast the probability of economic recession
- Deseasonalized regression model to forecast subway ridership load by weather conditions
- Voice recognition algorithm to identify the speaker in audio clips of presidential speeches
- Latent Dirichlet Allocation (LDA) topic modeling to investigate trends in the focus of Congressional legislation
My Journey to Data Science
I started my career as a chemical engineer in the mining industry, abroad in Santiago, Chile, no less. It was there that I got my first taste of data mining, as it was called back then, using R. Fast forward four years, and my obsession with Excel macros bore fruit in the form of a document generator for a commonly submitted report at work. Suddenly, I found myself firmly in the grip of the buzzing tech scene of San Francisco. I started building Ethereum miners and selling them online. I jumped at the opportunity to build an IoT-driven data dashboard at work. I was hooked. I enrolled in a machine learning bootcamp and snagged a job in finance. Before I knew it, I was forecasting investment transaction volumes and modeling ESG ratings. This all brought me to PlanetX, a startup that, while it lasted, sought to bring environmental transparency to online shopping.