PROJECTS
U-Net Low-Light Image Enhancement
End-to-end burst denoising pipeline built in PyTorch. Trained a U-Net on self-captured RAW image sequences from a Sony a6600: 10-frame bursts at -5 EV paired with ground-truth exposures across 19 scenes. The model takes 10 aligned RAW frames as input and outputs a single enhanced image, achieving ~27 dB PSNR. Frame alignment was handled with OpenCV, and training used a composite loss combining reconstruction, edge preservation, brightness consistency, and high-frequency retention terms.
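For context on the PSNR figure, here is a minimal sketch of how the metric is conventionally computed. It assumes pixel values normalized to [0, 1] and flattened into plain sequences; the function name `psnr` and its parameters are illustrative and not taken from the project code. On that scale, ~27 dB corresponds to a mean squared error of roughly 0.002.

```python
import math


def psnr(pred, target, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel sequences.

    Assumes values lie in [0, max_val]; inputs are flat sequences of floats.
    """
    if len(pred) != len(target) or len(pred) == 0:
        raise ValueError("pred and target must be non-empty and the same length")
    # Mean squared error over all pixels.
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_val ** 2 / mse)


# A uniform error of 0.1 on a [0, 1] scale gives MSE = 0.01.
example_db = psnr([0.0, 0.5, 1.0], [0.1, 0.6, 0.9])
```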
Diffusion Burst Ablation Study
An empirical study of burst-conditioned score-function estimation in diffusion-based low-light image restoration. IR-SDE was modified to accept N burst frames as additional conditioning at each reverse diffusion step, to evaluate whether burst count affects denoising quality on the SID dataset.
AI-Native Configuration Management Database
Most CMDBs break down when you introduce AI: hallucinations corrupt the source of truth, and there's no audit trail when something goes wrong. This project explores how to build an AI-native database that doesn't make that tradeoff. The core design principle is deterministic-first: parsing and normalization happen without AI involvement, and the model enters the pipeline only where inference provides clear value, namely entity canonicalization and natural-language querying. A catalog-grounded RAG layer handles entity matching against approved hardware and application catalogs, with hallucination detection and automatic retry logic before any unmatched record routes to human review. Nothing becomes ground truth without explicit sign-off. The NL-to-SQL interface runs as a three-stage pipeline: table routing, schema-scoped SQL generation with chain-of-thought prompting, and plain-English summarization with a self-correction pass on execution failure. Built with Python, LangChain, Anthropic API, FastAPI, and SQLAlchemy.
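The self-correction pass on execution failure can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's implementation: it uses the standard-library `sqlite3` module instead of SQLAlchemy, and `run_with_self_correction`, `stub_generate_sql`, and the `servers` table are hypothetical names. The stub stands in for the model call and deliberately fails on its first attempt.

```python
import sqlite3


def run_with_self_correction(conn, question, generate_sql, max_attempts=2):
    """Execute generated SQL, feeding any database error back to the generator."""
    error = None
    for attempt in range(1, max_attempts + 1):
        # generate_sql is a hypothetical callable wrapping the model call.
        sql = generate_sql(question, error)
        try:
            rows = conn.execute(sql).fetchall()
            return {"sql": sql, "rows": rows, "attempts": attempt}
        except sqlite3.Error as exc:
            error = str(exc)  # passed to the next generation attempt
    # All attempts failed: surface the last error instead of guessing.
    return {"sql": None, "rows": None, "attempts": max_attempts, "error": error}


def stub_generate_sql(question, previous_error):
    """Stand-in for the model: wrong column name first, corrected on retry."""
    if previous_error is None:
        return "SELECT hostname FROM servers"
    return "SELECT name FROM servers"


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE servers (name TEXT)")
conn.execute("INSERT INTO servers VALUES ('db-01')")
result = run_with_self_correction(conn, "List all servers", stub_generate_sql)
```

Passing the database error text back into the next generation attempt is one simple way to give the model enough signal to repair its own query; capping the attempts keeps a persistent failure from looping.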
Celebrity Digital Twin via RAG-Indexed Podcast Transcripts
Developing a high-fidelity digital twin of a public figure using a Retrieval-Augmented Generation (RAG) architecture built on an index of scraped podcast transcripts. The project aims to evaluate how well this approach predicts the target persona's answers and conversational style.
AWS x NFL Big Data Bowl Analysis
A data science analysis of the 2023 NFL season submitted to the AWS x NFL Big Data Bowl competition, built on official NFL player tracking data with frame-by-frame GPS coordinates for every player on every passing play across 18 weeks. Six research questions span offensive strategy, individual player performance, and environmental factors. Notable findings: Cover 3 Zone was the worst defensive coverage against every offensive formation tested; elite route runners were identified purely by nearest-defender distance at ball arrival, independent of catch outcomes or yards after contact; and the cold-weather-shortens-passes hypothesis found no support in the data, as several cold-climate teams increased their average depth of target late in the season. The analysis enforces minimum sample size thresholds throughout and reports confidence intervals on regression results rather than treating all findings as equally certain.
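The nearest-defender separation metric, combined with a minimum sample threshold, can be sketched as below. This is a simplified illustration, not the analysis code: the function names, the `(receiver, distance)` sample layout, and the `min_targets` default are assumptions, and coordinates are assumed to be in yards at the ball-arrival frame.

```python
import math
from collections import defaultdict


def nearest_defender_distance(receiver_xy, defender_xys):
    """Euclidean distance from the receiver to the closest defender."""
    return min(math.dist(receiver_xy, d) for d in defender_xys)


def mean_separation(samples, min_targets=3):
    """Average separation per receiver, dropping receivers below the threshold.

    samples: iterable of (receiver_name, separation) pairs, one per target.
    """
    by_receiver = defaultdict(list)
    for receiver, separation in samples:
        by_receiver[receiver].append(separation)
    return {
        receiver: sum(values) / len(values)
        for receiver, values in by_receiver.items()
        if len(values) >= min_targets  # enforce the minimum sample size
    }


# One play: receiver at (10, 20), defenders at (13, 24) and (10, 30).
play_separation = nearest_defender_distance((10.0, 20.0), [(13.0, 24.0), (10.0, 30.0)])

# Receiver "A" has three targets and is kept; "B" has one and is dropped.
ranking = mean_separation([("A", 2.0), ("A", 4.0), ("A", 6.0), ("B", 9.0)])
```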