Repro: Network Dissection (Bau et al., 2017)
Reproduced mechanistic interpretability technique for automatically identifying channels that detect visual concepts. PyTorch, convolutional networks.
Hi. I'm looking for work as an AI alignment research engineer or ML engineer.
Reproduced the original deconvolutional visualization technique from Zeiler and Fergus (2013). PyTorch, computer vision.
Built transformer from scratch (43M parameters), achieving a perplexity of 3 on WikiText. PyTorch, attention mechanisms, training optimization.
Co-authored research on activation steering in LLMs. Created contrastive datasets, ran benchmarks on Gemma 2 and Llama models. Python, HuggingFace, GPU computing.
Extended open-source activation steering library to support Gemma 2 and Llama 3. Python, model architectures, layer-wise interventions.
Co-authored research paper on LLM alignment stability. Designed multi-armed bandit experiments, built evaluation framework. Python, OpenAI API, Jupyter.