Yuntian Deng

Young Investigator, AI2 Mosaic
Assistant Professor, Waterloo CS (Starts Fall '24)
Faculty Affiliate, Vector Institute (Starts Fall '24)
PhD in CS, Harvard
[CV] [Google Scholar] [Twitter]

I am a postdoc on the Mosaic team at AI2 and an incoming assistant professor at the University of Waterloo. My research interests are natural language processing and machine learning. My research philosophy is to "build AIs for machines": since machines and humans have different characteristics, the ways humans solve problems are not necessarily optimal for machines. For example, humans communicate using language, but there may be more effective ways for machines to communicate with one another. Therefore, in the near future, I plan to study communication and cooperation among multiple models (also known as agents):

  • Communication among Models: I plan to learn a communication language between models that is more effective than human language, such as continuous vectors or discrete semantic representations (see the sketch after this list).
  • Division of Labor and Cooperation: Unlike the current practice of training a single all-capable large model, I plan to train a group of small, specialized models with different capabilities that complete tasks through communication and cooperation.
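As a concrete illustration of the first direction, here is a minimal sketch, in PyTorch, of what communication through continuous vectors could look like: a sender compresses its context into a single message vector, and a receiver consumes that vector as a soft prefix token while solving its own task. Every module name, size, and the toy training step is an illustrative placeholder, not an implementation of any particular paper.

```python
import torch
import torch.nn as nn

EMB, HID, VOCAB = 64, 128, 1000  # illustrative sizes

class Sender(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.encoder = nn.GRU(EMB, HID, batch_first=True)
        self.to_message = nn.Linear(HID, EMB)

    def forward(self, context_ids):
        # Encode the context and compress it into one continuous message vector.
        _, h = self.encoder(self.embed(context_ids))
        return self.to_message(h[-1])          # (batch, EMB)

class Receiver(nn.Module):
    def __init__(self):
        super().__init__()
        self.embed = nn.Embedding(VOCAB, EMB)
        self.decoder = nn.GRU(EMB, HID, batch_first=True)
        self.out = nn.Linear(HID, VOCAB)

    def forward(self, message, input_ids):
        # Prepend the message as a soft prefix token before the usual embeddings.
        inputs = torch.cat([message.unsqueeze(1), self.embed(input_ids)], dim=1)
        h, _ = self.decoder(inputs)
        return self.out(h)                      # (batch, 1 + seq, VOCAB)

# Toy end-to-end step: because the message is a differentiable vector rather
# than discrete text, gradients flow from the receiver's loss into the sender.
sender, receiver = Sender(), Receiver()
context = torch.randint(0, VOCAB, (2, 10))     # sender's private context
target = torch.randint(0, VOCAB, (2, 5))       # tokens the receiver must produce
logits = receiver(sender(context), target)[:, :-1]  # teacher-forced predictions
loss = nn.functional.cross_entropy(logits.reshape(-1, VOCAB), target.reshape(-1))
loss.backward()
```

The point of the sketch is the differentiability of the channel: unlike discrete text, a continuous message lets the sender be optimized end-to-end through the receiver's task loss.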

I also work on open-source projects such as OpenNMT, Im2LaTeX, LaTeX2Im, and Steganography to make my research more readily available to developers and researchers.

News

  • Mar 5, 2024: Our dataset, WildChat, is used in Anthropic's Claude 3 for evaluating refusals.
  • Nov 14, 2023: Our dataset, WildChat, is now publicly available! It is a corpus of 650K real-world user-ChatGPT interactions, spanning over 60 languages and a wide diversity of user prompts.
  • Nov 7, 2023: Our paper, Implicit Chain of Thought Reasoning via Knowledge Distillation, is now publicly available! This paper trains LMs to reason internally using hidden states instead of articulating every intermediate step as humans do.
  • Mar 29, 2023: OpenAIWatch.com is launched! It tracks GPT-4's nondeterministic behavior in unicorn illustrations, even under greedy decoding (a minimal version of this check is sketched after the news list). 🦄
  • Mar 29, 2023: Our GPT Chatbot, based on Yuvraj Sharma's code, is now live! It provides free access to GPT with the aim of collecting dialogue data for research purposes.
  • Oct 18, 2022: Our paper, Model Criticism for Long-Form Text Generation, is now publicly available! This paper uses model criticism in latent space to quantify various notions of high-level coherence in long-form text generation.
  • Oct 12, 2022: Markup-to-Image Diffusion Models demo is now live! This project uses a diffusion model to learn how to render various types of markups, including LaTeX.
  • Jun 2, 2020: Our paper, Cascaded Text Generation with Markov Transformers, is available! It enables parallel, fast, and accurate autoregressive text generation using a high-order Markov model.
  • Apr 26, 2020: Introducing Residual Energy-Based Models for Text Generation, a globally normalized approach to text generation! Our approach uses a global discriminator to guide a traditional locally normalized language model toward text that is harder to distinguish from human-written text (a sketch of the resampling step appears after the news list).
  • Sep 5, 2019: Neural Linguistic Steganography demo is now live! This project lets you hide secret messages in natural language using arithmetic coding (a toy version of the encoding scheme appears after the news list).
  • Dec 19, 2016: Excited to introduce OpenNMT, an open-source neural machine translation toolkit developed for industrial and academic use.
  • Sep 19, 2016: Excited to announce that we've provided a solution to OpenAI's requests-for-research im2latex challenge using neural sequence-to-sequence learning! Check out the visualizations here.
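The nondeterminism check behind OpenAIWatch.com is easy to reproduce in spirit: send the same prompt many times with temperature 0 and count how many distinct completions come back. The sketch below uses the v1 openai Python client; the prompt, model name, and trial count are arbitrary choices, and this is not the site's actual code.

```python
from collections import Counter
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def count_distinct_completions(prompt, model="gpt-4", n_trials=10):
    """Query the same prompt n_trials times with temperature=0 (greedy
    decoding) and tally the distinct completions that come back."""
    outputs = []
    for _ in range(n_trials):
        resp = client.chat.completions.create(
            model=model,
            temperature=0,
            messages=[{"role": "user", "content": prompt}],
        )
        outputs.append(resp.choices[0].message.content)
    return Counter(outputs)

# A fully deterministic model would yield exactly one distinct completion;
# in practice GPT-4 often returns several.
print(count_distinct_completions("Draw a unicorn in TikZ."))
```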
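For readers curious about the residual energy-based model: it defines P(x) ∝ P_LM(x) · exp(−E(x)), where E is the global discriminator's energy. One can sample from it by proposing whole sequences from the base LM and resampling them with importance weights; because the proposal is the LM itself, the P_LM factors cancel and the weights reduce to exp(−E(x)). Below is a minimal sketch of that resampling step, with lm_sample and energy as placeholder callables rather than the paper's actual models.

```python
import torch

def residual_ebm_sample(lm_sample, energy, n_candidates=128):
    """Draw one sample from P(x) ∝ P_LM(x) * exp(-E(x)) via
    sampling-importance-resampling.

    lm_sample: () -> a sequence sampled from the base language model
    energy:    sequence -> float, the global discriminator's energy E(x)
    """
    # Propose candidate sequences from the locally normalized LM.
    candidates = [lm_sample() for _ in range(n_candidates)]
    # Importance weight of each candidate is target/proposal = exp(-E(x));
    # the P_LM factor cancels because we proposed from the LM itself.
    weights = torch.tensor([-energy(x) for x in candidates]).softmax(dim=0)
    idx = torch.multinomial(weights, num_samples=1).item()
    return candidates[idx]
```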
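The arithmetic-coding idea behind the steganography demo also fits in a toy sketch: interpret the secret bits as a binary fraction in [0, 1), then repeatedly split the current interval among candidate tokens in proportion to a language model's next-token probabilities, emitting the token whose subinterval contains the secret value; the decoder replays the same splits to recover the bits. The fixed three-word "model" and the lack of interval rescaling below are simplifications; the real system conditions a neural LM on the generated context and handles the coder's precision and edge cases.

```python
from fractions import Fraction

# Toy next-token "model": a fixed distribution over three words. The real
# system uses a neural LM's conditional distribution at each step.
VOCAB = {"the": Fraction(1, 2), "cat": Fraction(1, 4), "sat": Fraction(1, 4)}

def bits_to_fraction(bits):
    """Interpret a bitstring as a binary fraction in [0, 1)."""
    return sum(Fraction(b, 2 ** (i + 1)) for i, b in enumerate(bits))

def encode(bits, n_tokens):
    """Hide the bits in fluent-looking tokens: at each step, split the
    current interval by token probability and emit the token whose
    subinterval contains the secret value."""
    value, lo, hi = bits_to_fraction(bits), Fraction(0), Fraction(1)
    tokens = []
    for _ in range(n_tokens):
        cum = lo
        for tok, p in VOCAB.items():
            nxt = cum + (hi - lo) * p
            if cum <= value < nxt:
                tokens.append(tok)
                lo, hi = cum, nxt
                break
            cum = nxt
    return tokens

def decode(tokens, n_bits):
    """Replay the interval splits, then read the secret bits off lo."""
    lo, hi = Fraction(0), Fraction(1)
    for tok in tokens:
        cum = lo
        for t, p in VOCAB.items():
            nxt = cum + (hi - lo) * p
            if t == tok:
                lo, hi = cum, nxt
                break
            cum = nxt
    # With enough cover tokens, [lo, hi) fits inside one dyadic cell of
    # width 2**-n_bits, so the first n_bits of lo equal the secret bits.
    bits, value = [], lo
    for _ in range(n_bits):
        value *= 2
        bits.append(1 if value >= 1 else 0)
        if value >= 1:
            value -= 1
    return bits

secret = [1, 0, 1]
cover = encode(secret, n_tokens=6)   # e.g. ['cat', 'cat', 'the', ...]
assert decode(cover, n_bits=len(secret)) == secret
```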

Selected Papers

Implicit Chain of Thought Reasoning via Knowledge Distillation
Yuntian Deng, Kiran Prasad, Roland Fernandez, Paul Smolensky, Vishrav Chaudhary, Stuart Shieber.
In submission

(InThe)WildChat: 570K ChatGPT Interaction Logs In The Wild
Wenting Zhao, Xiang Ren, Jack Hessel, Claire Cardie, Yejin Choi, Yuntian Deng.
ICLR 2024 Spotlight; used in Anthropic's Claude 3 for evaluating refusals

Tree Prompting: Efficient Task Adaptation without Fine-Tuning
John Xavier Morris*, Chandan Singh*, Alexander M. Rush, Jianfeng Gao, Yuntian Deng.
EMNLP 2023

Markup-to-Image Diffusion Models with Scheduled Sampling
Yuntian Deng, Noriyuki Kojima, Alexander M. Rush.
ICLR 2023

Model Criticism for Long-Form Text Generation
Yuntian Deng, Volodymyr Kuleshov, Alexander M. Rush.
EMNLP 2022

Cascaded Text Generation with Markov Transformers
Yuntian Deng, Alexander M. Rush.
NeurIPS 2020

Residual Energy-Based Models for Text Generation
Yuntian Deng, Anton Bakhtin, Myle Ott, Arthur Szlam, Marc'Aurelio Ranzato.
ICLR 2020

Bottom-Up Abstractive Summarization
Sebastian Gehrmann, Yuntian Deng, Alexander M. Rush.
EMNLP 2018

Latent Alignment and Variational Attention
Yuntian Deng*, Yoon Kim*, Justin Chiu, Demi Guo, Alexander M. Rush.
NeurIPS 2018

Image-to-Markup Generation with Coarse-to-Fine Attention
Yuntian Deng, Anssi Kanervisto, Jeffrey Ling, Alexander M. Rush.
ICML 2017

Neural Linguistic Steganography
Zachary Ziegler*, Yuntian Deng*, Alexander M. Rush.
EMNLP 2019 (Oral)

OpenNMT: Open-Source Toolkit for Neural Machine Translation
Guillaume Klein, Yoon Kim, Yuntian Deng, Jean Senellart, Alexander M. Rush.
ACL Demo 2017 (Best Demo Runner-up)

Prospective Students

I plan to hire multiple PhD/MS students at UWaterloo, home to five NLP professors! Strong consideration will be given to applicants who tackle the following challenge by February 2024: can we use an LM's hidden states to reason about multiple problems simultaneously? See this picture for details.

Important Note: Due to the high volume of inquiries, I kindly ask that prospective students make initial contact by submitting a response to the challenge above, and refrain from sending reminders that do not address it. This ensures your application receives the attention it deserves and aligns with our group's focus on meaningful, engaged research discussion.