Hi! I'm Jasmine Shone, a current student at MIT. This summer, I was a SWE intern at Meta, working on a computer vision model launch at Meta Superintelligence Labs and on large-scale load testing at Instagram.
I enjoy learning about and exploring many different fields. Within AI, that means computer vision, NLP, medical ML, deep learning, and robotics. Outside of AI, that means data science, full-stack development, infrastructure, game development, and quantitative finance. In my free time, I enjoy singing and composing music, as well as writing short stories and personal essays.
For a full list of my work experiences, please see my LinkedIn profile.
Meta Superintelligence Labs: Working on a new vision model launch.
Instagram: Working on large-scale load testing infrastructure.
Core (SWE): Completed five C++ projects focused on building infrastructure for live trading systems. Gained hands-on experience with performance optimization techniques, including efficient data structures and memory management. Actively participated in code reviews to improve code quality and robustness.
Algo Development: Designed, programmed, and deployed a live trading bot for Brazilian equities that achieved net positive returns in production. Applied data science techniques to train predictive models, evaluate performance metrics, and effectively communicate insights to stakeholders.
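The production code isn't something I can share, but as a rough illustration of the modeling side, here's a minimal walk-forward pipeline for a next-day direction classifier. The lagged-return features and gradient-boosting model are stand-ins for illustration, not the bot's actual strategy.

```python
# A minimal sketch of a predictive pipeline for daily equity data.
# The features and model are illustrative assumptions, not the live system.
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import TimeSeriesSplit

def make_features(prices: pd.Series) -> pd.DataFrame:
    """Build lagged-return features and a next-day direction target."""
    returns = prices.pct_change()
    feats = pd.DataFrame({f"ret_lag_{k}": returns.shift(k) for k in range(1, 6)})
    feats["target"] = (returns.shift(-1) > 0).astype(int)  # next-day up/down
    return feats.dropna()

def walk_forward_eval(feats: pd.DataFrame) -> float:
    """Walk-forward validation so the model never sees future data."""
    X, y = feats.drop(columns="target"), feats["target"]
    scores = []
    for train_idx, test_idx in TimeSeriesSplit(n_splits=5).split(X):
        model = GradientBoostingClassifier().fit(X.iloc[train_idx], y.iloc[train_idx])
        scores.append(accuracy_score(y.iloc[test_idx], model.predict(X.iloc[test_idx])))
    return sum(scores) / len(scores)
```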
Making robots generalize across varying object poses, camera views, and object instances with only 10 demonstrations.
I'm currently interested in world modelling and 3D generation. Relatedly, I'm interested in retrieval, memory, and context: making models that remember and see beyond their immediate inputs and tasks. I'm also interested in intelligence that emerges from compression, with feature/representation learning being a subset of that interest. Within representation learning, I've recently been curious about vision model representations and cross-modality information loss.
Make sure to check out the launch!
We extend the I-Con framework to discover new losses that achieve state-of-the-art results on clustering and supervised learning.
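For context, I-Con frames many representation-learning objectives as aligning a learned neighborhood distribution q(j|i), derived from embedding similarities, with a supervisory distribution p(j|i). Here's a rough PyTorch sketch of that template; the temperature, masking, and choice of p are illustrative assumptions on my part, not the specifics of our extension.

```python
# Sketch of the I-Con template: match the learned distribution q(j|i)
# (softmax over embedding similarities) to a supervisory p(j|i).
import torch
import torch.nn.functional as F

def icon_style_loss(embeddings: torch.Tensor, p: torch.Tensor, tau: float = 0.5) -> torch.Tensor:
    """embeddings: (N, D); p: (N, N), rows are supervisory distributions
    p(j|i) with zero diagonal. Returns mean_i KL(p_i || q_i) up to a constant."""
    z = F.normalize(embeddings, dim=-1)
    logits = z @ z.T / tau
    mask = torch.eye(len(z), dtype=torch.bool, device=z.device)
    logits = logits.masked_fill(mask, -1e9)   # exclude self-similarity
    log_q = F.log_softmax(logits, dim=-1)     # learned q(j|i)
    return -(p * log_q).sum(dim=-1).mean()    # cross-entropy = KL(p || q) + const
```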
We introduce a new paradigm of image-pair similarity learning conditioned on text descriptions.
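As a loose sketch of what "conditioned similarity" means here: score how alike two images are with respect to a given text description, for example by modulating frozen image embeddings with the text embedding (FiLM-style). The conditioning scheme below is my assumption for illustration, not the paper's architecture.

```python
# Hypothetical text-conditioned pair similarity: the text embedding produces
# a feature-wise scale and shift applied to both image embeddings before
# computing cosine similarity.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ConditionedSimilarity(nn.Module):
    def __init__(self, dim: int = 512):
        super().__init__()
        self.scale = nn.Linear(dim, dim)  # text -> feature-wise scale
        self.shift = nn.Linear(dim, dim)  # text -> feature-wise shift

    def forward(self, img_a: torch.Tensor, img_b: torch.Tensor, text: torch.Tensor) -> torch.Tensor:
        # img_a, img_b, text: (B, dim) embeddings from frozen encoders
        s, t = self.scale(text), self.shift(text)
        a = F.normalize(img_a * s + t, dim=-1)
        b = F.normalize(img_b * s + t, dim=-1)
        return (a * b).sum(dim=-1)  # cosine similarity under the text condition
```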
By combining priors from vision-language models with image features from large classification models, we create a novel keypoint abstraction for robot actions that generalizes effectively across object poses, camera viewpoints, and object instances with only 10 demonstrations.
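A toy version of the matching idea: given per-patch features from a large vision model and an embedding of a task-relevant part proposed by a VLM, pick the best-matching patch as a candidate keypoint. The encoders and grid shape below are placeholders, not our actual pipeline.

```python
# Illustrative feature-matching keypoint selection over an image patch grid.
import torch
import torch.nn.functional as F

def select_keypoint(patch_feats: torch.Tensor, part_query: torch.Tensor) -> tuple[int, int]:
    """patch_feats: (H, W, D) per-patch image features;
    part_query: (D,) embedding of a VLM-proposed part description."""
    H, W, D = patch_feats.shape
    sims = F.cosine_similarity(patch_feats.reshape(-1, D), part_query[None, :], dim=-1)
    idx = sims.argmax().item()
    return divmod(idx, W)  # (row, col) of the highest-scoring patch
```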
Are fluctuations in gas prices predictive of unemployment rates, and how do regional differences in mass transit and gasoline production affect this relationship?
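One simple way to start probing this question is a Granger-causality check of whether lagged gas-price changes help predict unemployment changes. The data file and column names below are hypothetical, and a real analysis would add the regional transit and production controls the question asks about.

```python
# Test H0: gas-price changes do NOT Granger-cause unemployment changes.
import pandas as pd
from statsmodels.tsa.stattools import grangercausalitytests

df = pd.read_csv("macro_monthly.csv")  # hypothetical monthly series
data = pd.DataFrame({
    "d_unemployment": df["unemployment_rate"].diff(),
    "d_gas_price": df["gas_price"].pct_change(),
}).dropna()

# Second column is tested as a predictor of the first, up to 6 monthly lags.
results = grangercausalitytests(data[["d_unemployment", "d_gas_price"]], maxlag=6)
```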
We introduce SketchAgent, a framework for language-driven sequential sketch generation. By leveraging the in-context learning abilities of multimodal large language models, we generate sketches stroke by stroke, capturing the essence of the input prompt.
We build an LLM in-context learning pipeline to systematically optimize (1) the maximum token length of the prompt, (2) the mechanism for choosing few-shot examples, and (3) the ordering of few-shot examples to generate applications.
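Schematically, the optimization is a grid search over those three knobs, scored on a dev set. In the sketch below, build_prompt, generate, and score are caller-supplied stand-ins for the pipeline's actual components, not its real API.

```python
# Grid search over prompt budget, few-shot selection strategy, and ordering.
import itertools
import random

def best_config(pool, dev_set, budgets, orderings, build_prompt, generate, score, k=4):
    """pool: candidate few-shot examples; dev_set: (query, reference) pairs.
    Returns the (budget, strategy, ordering) triple with the best dev score."""
    strategies = {
        "random": lambda q: random.sample(pool, k),
        "first_k": lambda q: pool[:k],  # placeholder for similarity-based selection
    }
    best, best_score = None, float("-inf")
    for budget, (name, select), order in itertools.product(budgets, strategies.items(), orderings):
        total = 0.0
        for query, reference in dev_set:
            shots = select(query)
            shots = shots[::-1] if order == "reversed" else shots
            prompt = build_prompt(shots, query, max_tokens=budget)
            total += score(generate(prompt), reference)
        if total > best_score:
            best, best_score = (budget, name, order), total
    return best
```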
We improve Stable Diffusion's ability to generate high-quality fundus images of the eye, specifically for glaucoma, by fine-tuning on an extremely small dataset of 170 images.
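For a sense of what such fine-tuning looks like, here's the shape of a single denoising training step: add noise to image latents at a random timestep and regress the noise. Here vae, unet, text_emb, and the noise schedule are placeholders; a real run on ~170 images would start from the pretrained Stable Diffusion weights and lean on careful regularization and augmentation.

```python
# Shape of one latent-diffusion fine-tuning step (placeholder components).
import torch
import torch.nn.functional as F

def finetune_step(unet, vae, images, text_emb, optimizer, num_steps=1000):
    with torch.no_grad():
        latents = vae.encode(images)                     # images -> latent space
    t = torch.randint(0, num_steps, (latents.size(0),), device=latents.device)
    noise = torch.randn_like(latents)
    # Simple linear schedule for illustration; SD uses a scaled-linear schedule.
    alpha_bar = (1 - t.float() / num_steps).view(-1, 1, 1, 1)
    noisy = alpha_bar.sqrt() * latents + (1 - alpha_bar).sqrt() * noise
    pred = unet(noisy, t, text_emb)                      # predict the added noise
    loss = F.mse_loss(pred, noise)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```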