Yash Jain

Research Scientist, Essential AI Labs

Welcome! I am a Research Scientist at Essential AI Labs, where I build foundation models alongside Ashish Vaswani (co-author of "Attention Is All You Need"). I was a core contributor to Rnj-1, a state-of-the-art open-source coding and agentic foundation model that has amassed 600k+ downloads.

Previously, I was an ML Scientist II at Microsoft Research, where I worked on diffusion models and multimodal large language models in collaboration with Vibhav Vineet, publishing at CVPR, NeurIPS, NAACL, and EMNLP.

I graduated from Georgia Tech with an M.S. in Computer Science, advised by Zsolt Kira. Before that, I earned my B.Tech. in Computer Science from IIT Bombay, where I received an Excellence in Research Award under Soumen Chakrabarti.

Reach out by email if you wish to collaborate!

news

Dec 09, 2025 Released Rnj-1, a state-of-the-art open-source coding and agentic foundation model with 600k+ downloads on Hugging Face!
Jun 16, 2025 Joined Essential AI Labs as a Research Scientist!
Mar 13, 2025 Local Prompt Optimization paper accepted to the NAACL 2025 main conference as an oral presentation!
Jun 05, 2023 Joined Microsoft as an ML Scientist II in Redmond!
Aug 05, 2022 Applied Scientist Intern on the Amazon Alexa team! Excited to train large-scale audio-visual models from scratch!
May 03, 2021 Finished my B.Tech. and received the Excellence in Research Award from the department!

selected publications

  1. Rnj-1: Building Instruments of Intelligence
    Essential AI
    2025
    Model Release
  2. Aurelius: Relation Aware Text-to-Audio Generation At Scale
    Yuhang He, Yash Jain, Xubo Liu, Andrew Markham, and Vibhav Vineet
    In International Conference on Learning Representations (ICLR), 2026
  3. PEEKABOO: Interactive Video Generation via Masked-Diffusion
    Yash Jain, Anshul Nasery, Vibhav Vineet, and Harkirat Behl
    In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2024
  4. DAMEX: Dataset-aware Mixture-of-Experts for Visual Understanding of Mixture-of-Datasets
    Yash Jain, Harkirat Behl, Zsolt Kira, and Vibhav Vineet
    In Advances in Neural Information Processing Systems (NeurIPS), 2023
  5. Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition
    Yash Jain, D. Chan, P. Dheram, A. Khare, O. Shonibare, and 2 more authors
    In Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING), 2024
  6. ColloSSL: Collaborative Self-Supervised Learning for Human Activity Recognition
    Yash Jain, Chi Ian Tang, Chulhong Min, Fahim Kawsar, and Akhil Mathur
    Proceedings of the ACM on Interactive, Mobile, Wearable and Ubiquitous Technologies (UbiComp), 2022
  7. RFID Tattoo: A Wireless Platform for Speech Recognition
    Jingxian Wang, Chengfeng Pan, Haojian Jin, Vaibhav Singh, Yash Jain, and 3 more authors
    ACM Interactive, Mobile, Wearable and Ubiquitous Technologies (UbiComp), 2020