I am a (Now() - 04/2021).ceil().ordinal()
year PhD student studying natural language processing (NLP) at the University of Washington.
I am fortunate to be advised by Prof. Yejin Choi and Prof. Hanna Hajishirzi.
I am also a part-time researcher at the Allen Institute for AI.
My current research topics include inspecting massive text corpora, training data attribution, LM pretraining, and scaling laws. During my PhD, I have worked on commonsense knowledge generation and verification, automated theorem proving, RLHF, and text decoding.
Previously, I received my B.S. in Computer Science from the University of Illinois Urbana-Champaign, where I worked with Prof. Julia Hockenmaier. I have also worked on Facebook's Natural Language Generation (NLG) team.
My name in Chinese characters is 刘嘉程.
Email: liujc [at] cs.washington.edu
[CV] [Google Scholar] [GitHub] [Twitter] [LinkedIn]
Research and other blogs: this website and [Zhihu]
Private pilot and other personal life VLOGs: [Bilibili] [YouTube]
Personal: [Facebook]
News
- (2024.07) Infini-gram and PPO-MCTS are accepted to COLM 2024.
- (2023.10) PPO-MCTS is featured by 机器之心 on WeChat!
- (2023.10) Vera and Crystal are accepted to EMNLP 2023 (main conference).
- (2023.09) The Inverse Scaling paper is accepted to TMLR! Check out our contributed dataset, memo-trap, where LLMs demonstrate the strongest inverse scaling trends.
- (2023.07) I was awarded the Qualcomm Innovation Fellowship for academic year 2023-2024.
- (2023.05) Invited talk at the MLNLP Seminar: Estimating the plausibility of commonsense statements.
- (2023.02) Our submission to the Inverse Scaling Challenge, memo-trap, receives one of the 11 Third Prizes!
Publications
Preprints
AI as Humanity’s Salieri: Quantifying Linguistic Creativity of Language Models via Systematic Attribution of Machine Text against Web Text
Ximing Lu, Melanie Sclar, Skyler Hallinan, Niloofar Mireshghallah, Jiacheng Liu, Seungju Han, Allyson Ettinger, Liwei Jiang, Khyathi Chandu, Nouha Dziri, Yejin Choi
[Arxiv]
[Demo]
Peer-Reviewed Papers
Unpacking DPO and PPO: Disentangling Best Practices for Learning from Preference Feedback
Hamish Ivison, Yizhong Wang, Jiacheng Liu, Zeqiu Wu, Valentina Pyatkin, Nathan Lambert, Noah A Smith, Yejin Choi, Hannaneh Hajishirzi
NeurIPS 2024
[Arxiv]
[Code]
[Models]
Infini-gram: Scaling Unbounded n-gram Language Models to a Trillion Tokens
Jiacheng Liu, Sewon Min, Luke Zettlemoyer, Yejin Choi, Hannaneh Hajishirzi
COLM 2024 (Oral Spotlight, 2%)
[Arxiv]
[Project Page]
[Demo]
Don’t throw away your value model! Making PPO even better via Value-Guided Monte-Carlo Tree Search decoding
Jiacheng Liu, Andrew Cohen, Ramakanth Pasunuru, Yejin Choi, Hannaneh Hajishirzi, Asli Celikyilmaz
COLM 2024
[Arxiv]
[Code]
Are machines better at complex reasoning? Unveiling human-machine inference gaps in entailment verification
Soumya Sanyal, Tianyi Xiao, Jiacheng Liu, Wenya Wang, Xiang Ren
ACL 2024 (Findings)
[Arxiv]
[Model]
MathVista: Evaluating Mathematical Reasoning of Foundation Models in Visual Contexts
Pan Lu, Hritik Bansal, Tony Xia, Jiacheng Liu, Chunyuan Li, Hannaneh Hajishirzi, Hao Cheng, Kai-Wei Chang, Michel Galley, Jianfeng Gao
ICLR 2024 (Oral); NeurIPS 2023 MATH-AI Workshop
[Arxiv]
[Project Page]
[Code]
[Dataset]
[HF Dataset]
Crystal: Introspective Reasoners Reinforced with Self-Feedback
Jiacheng Liu, Ramakanth Pasunuru, Hannaneh Hajishirzi, Yejin Choi, Asli Celikyilmaz
EMNLP 2023 (Main Conference, Oral)
[Arxiv]
[Code]
[Models: large 3b 11b]
[Demo]
Vera: A General-Purpose Plausibility Estimation Model for Commonsense Statements
Jiacheng Liu, Wenya Wang, Dianzhuo Wang, Noah A. Smith, Yejin Choi, Hannaneh Hajishirzi
EMNLP 2023 (Main Conference, Oral)
[Arxiv]
[Code]
[Model]
[Demo]
[Dataset]
Inverse Scaling: When Bigger Isn’t Better
Ian R McKenzie, …, Jiacheng Liu, …, Samuel R Bowman, Ethan Perez
TMLR (2023.10)
[Arxiv]
Draft, Sketch, and Prove: Guiding Formal Theorem Provers with Informal Proofs
Albert Qiaochu Jiang, Sean Welleck, Jin Peng Zhou, Timothee Lacroix, Jiacheng Liu, Wenda Li, Mateja Jamnik, Guillaume Lample, Yuhuai Wu
ICLR 2023 (Oral, 5%)
[Arxiv]
Rainier: Reinforced Knowledge Introspector for Commonsense Question Answering
Jiacheng Liu, Skyler Hallinan, Ximing Lu, Pengfei He, Sean Welleck, Hannaneh Hajishirzi, Yejin Choi
EMNLP 2022 (Main Conference)
[Arxiv]
[Code/Data]
[Models: Policy Value]
[Demo]
NaturalProver: Grounded Mathematical Proof Generation with Language Models
Sean Welleck, Jiacheng Liu, Ximing Lu, Hannaneh Hajishirzi, Yejin Choi
NeurIPS 2022
[Arxiv]
[Code]
NaturalProver: Grounded Natural Language Proof Generation with Language Models
Sean Welleck, Jiacheng Liu, Ximing Lu, Hannaneh Hajishirzi, Yejin Choi
AITP 2022 (Contributed Talk)
[Talk]
Generated Knowledge Prompting for Commonsense Reasoning
Jiacheng Liu, Alisa Liu, Ximing Lu, Sean Welleck, Peter West, Ronan Le Bras, Yejin Choi, Hannaneh Hajishirzi
ACL 2022 (Main Conference)
[Arxiv]
[Code]
[Talk]
[Poster]
Towards Grounded Natural Language Proof Generation
Sean Welleck, Jiacheng Liu, Jesse Michael Han, Yejin Choi
NeurIPS 2021 MATHAI4ED Workshop (Contributed Talk)
[Talk]
[Poster]
NaturalProofs: Mathematical Theorem Proving in Natural Language
Sean Welleck, Jiacheng Liu, Ronan Le Bras, Hannaneh Hajishirzi, Yejin Choi, Kyunghyun Cho
NeurIPS 2021 Datasets and Benchmarks Track (Oral, 1%)
[Arxiv]
[Data/Code/Models]
[Project Page]
[Talk]
[Slides]
NaturalProofs: Mathematics meets Natural Language
Sean Welleck, Jiacheng Liu, Ronan Le Bras, Hannaneh Hajishirzi, Yejin Choi, Kyunghyun Cho
AITP 2021 (Contributed Talk)
[Talk]
[Slides]
Phrase Grounding by Soft-Label Chain Conditional Random Field
Jiacheng Liu, Julia Hockenmaier
EMNLP-IJCNLP 2019 (Oral)
[Arxiv]
[Code]
[Slides]
CrossWeigh: Training Named Entity Tagger from Imperfect Annotations
Zihan Wang, Jingbo Shang, Liyuan Liu, Lihao Lu, Jiacheng Liu, Jiawei Han
EMNLP-IJCNLP 2019 (Oral)
[Arxiv]
[Code]
[Slides]
Posts
The embarrassing redundancy of reward whitening and reward normalization in PPO
In this post, I will theoretically prove that two common implementation tricks in PPO – reward whitening and reward normalization – are unnecessary and can be emulated by adjusting other free parameters.
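As a toy illustration of the flavor of the claim (my own sketch here, not the post's derivation): per-batch whitening is invariant to any positive affine transform of its input, so a global reward scale or shift can be absorbed into other free parameters without changing the whitened result.

```python
import numpy as np

def whiten(x, eps=1e-8):
    """Per-batch whitening: subtract the mean, divide by the std."""
    return (x - x.mean()) / (x.std() + eps)

rewards = np.array([1.0, 3.0, -2.0, 0.5])
# Any positive affine transform of the rewards whitens to the same values,
# so a global reward scale/shift is redundant once whitening is applied.
transformed = 5.0 * rewards + 2.0
print(np.allclose(whiten(rewards), whiten(transformed), atol=1e-6))
```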
Reflections on Commonsense Explanations
To tackle the task of commonsense question answering, numerous works have proposed grounding the reasoning in explanations or relevant commonsense knowledge (Liu et al., 2021; Liu et al., 2022; Wang et al., 2022; inter alia). In this blog post, I reflect on whether these approaches are really logically sound and bullet-proof.
What is missing from ChatGPT / GPT-4?
ChatGPT and GPT-4 are remarkable engineering breakthroughs. In this post I reflect on what is still missing from these models, and from most modern LLMs in general.
Handling the absorbing state in Beam Search Decoding [zh]
A note on BART
Theorem Proving - reading notes [zh]
Hyperbole and Contrast in 稼轩's Ci Poetry [zh]
稼轩's ci poems of historical reflection and patriotic concern are usually grand and sweeping, exhilarating to read; I believe the most important factor behind this is his bold use of hyperbole and contrast. This post presents and analyzes a few examples.
A Dummy Trading Strategy - II
In this previous post we discussed how you can buy an asset early and cheap, and take incremental profit as it skyrockets. It does not, however, tell you how to buy it (back) at a market dip. In this post, we introduce a unified buying-and-selling strategy, so that you can make automated trading decisions and turn a profit in a volatile market.
Optimal Stopping [zh]
Bottom-Fishing in Market Crash
Every couple of years, there is a market correction, or even a market crash. However, historical data tells us that the market always comes back, so we want to buy at a cheap price during such times, in the hope of making a large return when the market recovers. How should we do this when no one knows in advance where the "real" bottom is? We need a strategy.
A Dummy Trading Strategy
There are times in the financial market when a certain asset is traded at a low price but has huge potential for speculation. Back in 2013 you could buy bitcoins at \$13.50, and by the end of 2017 they traded at nearly \$20k. When Tesla (TSLA) and Nio (NIO) took off in 2020, their stock prices rallied by 9x and 25x, respectively. In the recent WallStreetBets war with short-selling institutions, Gamestop (GME) saw its price skyrocket because of a short squeeze. The question is, how can we seize these opportunities and make some profit?
Boltzmann distribution, Restricted Boltzmann Machine [zh]
Pitfalls in Tensorflow [zh]
Reading Notes: MLAPP Chapter 21: Variational Inference
Reading Notes: MLAPP Chapter 11: Latent Linear Models
Reading Notes: MLAPP Chapter 10: Mixture Models and the EM Algorithm
Finetune Pre-trained LMs
Over the weekend, I played with fine-tuning GPT-2 and XLNet (on Colab). Super applause to Huggingface Transformers: it makes all sorts of pre-trained LMs extremely accessible. The framework has evolved a lot from being a wrapper around pre-trained BERT. It now unifies all models under AutoModel* classes with different capabilities, so we only need to know the model key and not worry about model-specific APIs. The repo also contains very handy fine-tuning and inference scripts.
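As a minimal sketch of what that unification looks like (my own example; the tiny layer sizes are arbitrary, and I build from a config rather than pretrained weights so nothing is downloaded):

```python
from transformers import AutoConfig, AutoModelForCausalLM

# Build a tiny GPT-2 from its model-type key alone; the Auto* classes
# dispatch to the right architecture, so we never name GPT2LMHeadModel.
config = AutoConfig.for_model("gpt2", n_layer=2, n_head=2, n_embd=64)
model = AutoModelForCausalLM.from_config(config)
print(type(model).__name__)
```

Swapping `"gpt2"` for another supported model type is the only change needed to instantiate a different architecture.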
We won Terminal Live @ UIUC
Our team TOAD ranked #1 in Terminal Live @ UIUC, sponsored by Correlation One and Citadel. We will be sharing a cash prize of $12,000!
Phrase Grounding by Soft-Label Chain Conditional Random Field (EMNLP-IJCNLP 2019 Long Paper)
Our paper Phrase Grounding by Soft-Label Chain Conditional Random Field is accepted as a long paper at EMNLP-IJCNLP 2019! arXiv link
ICPC World Finals 2019 in Porto, Portugal
ICPC Bytedance-Moscow Workshop 2019 in Beijing, China
DFTnet: efficiently training large neural networks
Recently I played with neural networks, replacing the matrix multiplication in a network's propagation with a convolution and using the FFT to speed up computation. This architecture allows training neural networks with larger layer sizes, given that we allow weights to be reused in a certain way. Preliminary experiments show 93% accuracy on the MNIST dataset.
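The core identity this leans on can be sketched in a few lines (my own illustration; the actual weight-sharing scheme in DFTnet may differ): multiplying by a circulant weight matrix is a circular convolution, which the FFT computes in O(n log n) instead of O(n^2).

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
w = rng.standard_normal(n)  # first column of the circulant weight matrix
x = rng.standard_normal(n)  # input activations

# Dense path: materialize the full circulant matrix and multiply, O(n^2).
C = np.stack([np.roll(w, j) for j in range(n)], axis=1)  # C[i, j] = w[(i - j) % n]
dense = C @ x

# FFT path: circular convolution in the frequency domain, O(n log n).
fast = np.fft.irfft(np.fft.rfft(w) * np.fft.rfft(x), n=n)

print(np.allclose(dense, fast))  # the two paths agree
```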
360 Depth Correction: depth correction for virtual objects enclosed by 360 video
In virtual reality, when a 360 monocular video canvas surrounds virtual objects, there is a depth mismatch that creates artifacts: the monocular depth cues provided by the canvas override the binocular depth cues on the virtual object. In this paper, I propose an algorithm that geometrically transforms the virtual object to compensate for the mismatch, allowing natural fusion of virtual objects and 360 environments in virtual reality.
subscribe via RSS