Yash Jain
Welcome! I am a researcher at Microsoft working at the intersection of diffusion models (generative models used to create synthetic data) and multimodal large language models that integrate data types such as text and images. I collaborate closely with Vibhav Vineet on projects aimed at enhancing the capabilities of these models.
Previously, I graduated from Georgia Tech and finished my thesis under the mentorship of Zsolt Kira. Before that, I earned my bachelor’s in Computer Science from IIT Bombay, where I received an excellence in research award under the guidance of Soumen Chakrabarti.
Feel free to connect with me through the social links below.
news
Jun 05, 2023 | Joined Microsoft as an ML Scientist II at Redmond! |
---|---|
Aug 05, 2022 | Applied Scientist Intern at Amazon Alexa Team! Excited to train large-scale audio-visual models from scratch! |
May 03, 2021 | Finished B.Tech., got Excellence in Research Award from the department! |
selected publications
- Multi-Stage Multi-Modal Pre-Training for Automatic Speech Recognition. In Joint International Conference on Computational Linguistics and Language Resources and Evaluation (LREC-COLING), 2024