Yuntian Deng
Young Investigator (Postdoc), AI2
Incoming Assistant Professor, Waterloo CS (Starts Fall '24)
Faculty Affiliate, Vector Institute (Starts Fall '24)
PhD in CS, Harvard
[CV] [Google Scholar] [Twitter]
My research interests center on the intersection of natural language processing, machine learning, and multi-agent systems. Specifically, I am interested in exploring how large language models (LLMs) can communicate and collaborate to solve complex tasks together, and how they can be trained to specialize in different domains for a division of labor. My key focus areas include:
- Inducing Latent Language for Inter-LLM Communication: Developing methods to induce a specialized language for LLM communication, thereby enabling LLMs to leverage each other's expertise.
- Communication for Models Across Modalities: Extending Inter-LLM communication methods to enable collaboration among models that specialize in different modalities, such as language, image, and sensory data.
- Collaborative Training for Division of Labor among Models: Exploring ways to foster a division of labor among models, using communication as a tool to distribute knowledge among them during the training process.
- Mar 29, 2023: OpenAIWatch.com is launched! It tracks GPT-4's nondeterministic behavior, even under greedy decoding, by repeatedly generating unicorn illustrations. 🦄
- Mar 29, 2023: Our GPT Chatbot, based on Yuvraj Sharma's code, is now live! It provides free access to GPT with the aim of collecting dialogue data for research purposes.
- Oct 18, 2022: Our latest paper, Model Criticism for Long-Form Text Generation, is now publicly available! This paper uses model criticism in latent space to quantify various notions of high-level coherence in long-form text generation.
- Oct 12, 2022: Markup-to-Image Diffusion Models demo is now live! This project uses a diffusion model to learn how to render various types of markups, including LaTeX.
- Jun 2, 2020: Our latest paper, Cascaded Text Generation with Markov Transformers, is available! It enables fast, parallel, yet accurate autoregressive text generation using a high-order Markov model.
- Apr 26, 2020: Introducing Residual Energy-Based Models for Text Generation, a globally-normalized approach to text generation! Our approach uses a global discriminator to guide a traditional locally-normalized language model toward text that is harder to distinguish from human-written text.
- Sep 5, 2019: Neural Linguistic Steganography demo is now live! This project lets you hide secret messages in natural language using arithmetic coding.
- Dec 19, 2016: Excited to introduce OpenNMT, an open-source neural machine translation toolkit developed for industrial and academic use.
- Sep 19, 2016: Excited to announce that we've provided a solution to OpenAI's requests-for-research im2latex challenge using neural sequence-to-sequence learning! Check out the visualizations here.
Markup-to-Image Diffusion Models with Scheduled Sampling
Yuntian Deng, Noriyuki Kojima, Alexander M. Rush.
Model Criticism for Long-Form Text Generation
Yuntian Deng, Volodymyr Kuleshov, Alexander M Rush.
Cascaded Text Generation with Markov Transformers
Yuntian Deng, Alexander M. Rush.
Residual Energy-Based Models for Text Generation
Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato.
Bottom-Up Abstractive Summarization
Sebastian Gehrmann, Yuntian Deng, Alexander Rush.
Latent Alignment and Variational Attention
Yuntian Deng*, Yoon Kim*, Justin Chiu, Demi Guo, Alexander M. Rush.
Image-to-Markup Generation with Coarse-to-Fine Attention
Yuntian Deng, Anssi Kanervisto, Jeffrey Ling, and Alexander M. Rush.
Neural Linguistic Steganography
Zachary Ziegler*, Yuntian Deng*, Alexander Rush.
EMNLP 2019 (Oral)
OpenNMT: Open-Source Toolkit for Neural Machine Translation
Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, Alexander M. Rush.
ACL Demo 2017 (Best Demo Runner-up)