I build resource-constrained machine learning (ML) systems for science – in vision, timeseries, and text domains – fusing approaches from statistical ML and CS systems.
I am currently a final-year PhD candidate advised by John Cunningham and working closely with Liam Paninski. I am also a part-time NLP researcher at MosaicML/Databricks, under Jonathan Frankle, where I work on LLM parameter-efficient finetuning, evaluation, and data, with an emphasis on code generation.
In my primary PhD work, I develop semi-supervised computer vision systems for tracking animals in videos, reducing the amount of labeled data needed for the task and improving generalization. Our package lightning-pose (bioRxiv, 2023; under review) is widely used in science and industry. I have also tackled this problem via probabilistic representation learning (NeurIPS, 2019; PLOS Comp. Biol., 2021), 3D vision, and physical simulation (NeurIPS DiffCVGP, 2020). In a second line of work, I focus on the computational efficiency and inductive biases of Gaussian processes (ICML, 2021). My ongoing NLP work at MosaicML addresses questions of knowledge acquisition and extinction and their interaction with parameter-efficient finetuning methods (more soon!).
Throughout my PhD, I have collaborated closely with Lightning AI; I was named a Lightning Ambassador and was a featured developer in their first DevCon in June 2022.
Here is my CV.
PhD in Computational Neuroscience, 2018-
Columbia University
MA in Cognitive Science, 2018
Tel Aviv University
The Adi Lautman Interdisciplinary Program for Outstanding Students (Cog. Sci., Math, Neurobio.), 2013-2017
Tel Aviv University
Introduces a semi-supervised approach to pose estimation, using physically-informed inductive biases to improve generalization with fewer labels. Poses are further refined by combining deep ensembles with state-space models. We open-source a deep learning system optimized for efficiency, built on PyTorch Lightning and NVIDIA DALI.
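To make the idea concrete, here is a minimal sketch (not the lightning-pose implementation) of a semi-supervised objective: a supervised keypoint-regression loss on labeled frames plus a physically-informed temporal-smoothness penalty on unlabeled video. The function names and the `lam`/`eps` parameters are illustrative assumptions.

```python
import numpy as np

def supervised_loss(pred_kp, true_kp):
    """Mean squared error on keypoints from labeled frames."""
    return np.mean((pred_kp - true_kp) ** 2)

def temporal_smoothness(pred_seq, eps=1.0):
    """Physically-informed prior on unlabeled video: penalize
    implausibly large jumps between consecutive frames.
    pred_seq has shape (T, K, 2): T frames, K keypoints, (x, y)."""
    jumps = np.linalg.norm(np.diff(pred_seq, axis=0), axis=-1)  # (T-1, K)
    return np.mean(np.maximum(jumps - eps, 0.0))

def semi_supervised_loss(pred_kp, true_kp, pred_seq, lam=0.5):
    """Combine the labeled-frame loss with the unlabeled-video prior."""
    return supervised_loss(pred_kp, true_kp) + lam * temporal_smoothness(pred_seq)
```

Because the smoothness term needs only predictions, every unlabeled frame contributes training signal, which is what reduces the labeling burden.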
This model disentangles movement that can be quantified by keypoints (e.g., limb position) from subtler feature variations like orofacial movements. We introduce a novel VAE whose latent space decomposes into two orthogonal subspaces – one unsupervised subspace and one supervised subspace that is linearly predictive of the labels (keypoints). The latent space additionally includes a context variable that predicts the video/subject identity.
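A minimal sketch of the latent-space decomposition, assuming the two subspaces are represented by basis matrices kept orthogonal via a Frobenius-norm penalty (the dimensions and function names here are illustrative, not the paper's actual architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

def orthogonality_penalty(U, V):
    """Squared Frobenius norm of U^T V: zero exactly when the
    two subspace bases are mutually orthogonal."""
    return np.sum((U.T @ V) ** 2)

# Toy 8-dim latent space split into a 4-dim supervised subspace
# and a 4-dim unsupervised subspace; QR gives orthonormal columns.
Q = np.linalg.qr(rng.normal(size=(8, 8)))[0]
sup_basis, unsup_basis = Q[:, :4], Q[:, 4:]

z = rng.normal(size=8)
z_sup = sup_basis.T @ z      # component linearly predictive of keypoints
z_unsup = unsup_basis.T @ z  # remaining variability (e.g., orofacial movement)
```

In training, `orthogonality_penalty(sup_basis, unsup_basis)` would be added to the VAE objective so the supervised readout and the unsupervised residual capture non-overlapping variation.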