I'm a research scientist at ByteDance. Previously, I was a Machine Learning PhD student at Georgia Tech, supervised by Prof. Faramarz Fekri and Prof. Le Song. I received my MS in Computational Data Science from CMU in 2017 and my BEng in Software Engineering from Beihang University in 2016.
My research focuses on LLMs, logical reasoning, and interpretable, data-efficient ML models:
PhD in Machine Learning, 2024
Georgia Institute of Technology
MS in Computational Data Science, 2017
Carnegie Mellon University
BEng in Software Engineering, 2016
Beihang University
LLMs today can process multimodal data and long documents, use tools, and browse the web. Can we integrate all of these capabilities into a language model OS, where the LLM acts as a CPU that processes data stored in its context window (the RAM)?
We argue that the key challenge toward an LM OS is managing life-long context and ensuring statefulness across sessions. To address this, we introduce the compressor-retriever, a model-agnostic architecture designed for life-long context management. Our approach exclusively uses the base model's forward function to compress and retrieve context, ensuring end-to-end differentiability. Preliminary experiments demonstrate the effectiveness of this architecture on in-context learning tasks, marking a step toward a fully stateful LLM OS.
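A minimal toy sketch of the idea, not the actual implementation: a stand-in transformer plays the role of the frozen base model, each past chunk is compressed into a summary vector via a forward pass, and retrieval scores the stored summaries against the current query. The class names, the mean-pooling compression, and the similarity-based retrieval are illustrative assumptions.

```python
# Toy sketch of the compressor-retriever idea (illustrative, not the paper's code).
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyBaseModel(nn.Module):
    """Stand-in for a frozen LLM forward function."""
    def __init__(self, vocab=1000, dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.encoder = nn.TransformerEncoder(
            nn.TransformerEncoderLayer(d_model=dim, nhead=4, batch_first=True),
            num_layers=2,
        )

    def forward(self, token_ids):                     # (B, T) -> (B, T, D)
        return self.encoder(self.embed(token_ids))

class CompressorRetriever(nn.Module):
    """Compress chunks with the base model's forward pass; retrieve by similarity."""
    def __init__(self, base: ToyBaseModel):
        super().__init__()
        self.base = base
        self.store = []                               # list of (summary_vec, chunk_ids)

    def compress(self, chunk_ids):
        hidden = self.base(chunk_ids.unsqueeze(0))    # (1, T, D)
        summary = hidden.mean(dim=1).squeeze(0)       # mean-pool -> (D,); placeholder compression
        self.store.append((summary, chunk_ids))

    def retrieve(self, query_ids, k=2):
        q = self.base(query_ids.unsqueeze(0)).mean(dim=1).squeeze(0)
        sims = torch.stack([F.cosine_similarity(q, s, dim=0) for s, _ in self.store])
        topk = sims.topk(min(k, len(self.store))).indices
        return [self.store[i][1] for i in topk]       # chunks to re-insert into the context window

# Usage: compress past sessions, then retrieve context for a new query.
base = ToyBaseModel()
cr = CompressorRetriever(base)
for _ in range(5):
    cr.compress(torch.randint(0, 1000, (32,)))        # fake past chunks
retrieved = cr.retrieve(torch.randint(0, 1000, (8,)), k=2)
print(len(retrieved), "chunks retrieved")
```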
Many works claim that LLMs are good at reasoning, but is that true? Can LLMs reason about complex and ambiguous problems by writing programs? It turns out the answer is no.
We introduce the reasoning-in-the-wild task, where an LLM must solve a reasoning problem of unknown type by identifying its sub-problems and their corresponding formalisms, then writing a program to solve each sub-problem, guided by a tactic; a sketch of this loop is shown below.
[Paper] [Github] [Data] [Model]
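As a rough illustration of the tactic-guided decomposition described above (the prompt wording, the JSON schema, and `call_llm` are placeholders, not the actual pipeline):

```python
# Illustrative sketch of a tactic-guided "reasoning in the wild" loop.
import json

TACTIC = (
    "1. Identify the sub-problems in the question and the formalism each one needs "
    "(e.g., arithmetic, first-order logic, constraint solving).\n"
    "2. For each sub-problem, write a standalone Python program that solves it.\n"
    "3. Combine the sub-results into a final answer."
)

def call_llm(prompt: str) -> str:
    """Placeholder: swap in an actual LLM API call."""
    return json.dumps({"sub_problems": [], "programs": [], "final_answer": None})

def solve_in_the_wild(question: str) -> dict:
    # Single round shown; a real pipeline may iterate with execution feedback.
    prompt = (
        f"Tactic:\n{TACTIC}\n\nQuestion:\n{question}\n\n"
        "Respond as JSON with keys: sub_problems, programs, final_answer."
    )
    plan = json.loads(call_llm(prompt))
    results = []
    for prog in plan["programs"]:
        scope: dict = {}
        exec(prog, scope)               # run each generated sub-program (sandbox this in practice)
        results.append(scope.get("answer"))
    plan["sub_results"] = results
    return plan

print(solve_in_the_wild("Alice has twice as many apples as Bob..."))
```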
One of the major bottlenecks for logic-based NLP systems is the lack of a reliable translation model that maps natural language (NL) to the corresponding first-order logic (FOL) representation.
We approach this longstanding challenge by harnessing the power of LLMs. We create a high-quality sentence-level NL-FOL pair dataset (MALLS) from GPT-4. We then propose an SFT+RLHF framework that finetunes LLaMA models for the NL-FOL translation task. The resulting model, LogicLLaMA, achieves GPT-4-level performance; a sketch of the translation interface is shown below.
[Paper] [Github] [Dataset] [Weights]
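A sketch of what the translation interface looks like; the example pair, the FOL syntax, and the `translate_nl_to_fol` helper are illustrative, not drawn from MALLS or LogicLLaMA's actual API:

```python
# Sketch of an NL -> FOL translation interface (placeholder model call).

# The kind of sentence-level pair a MALLS-style dataset contains (FOL syntax illustrative):
example = {
    "nl":  "Every student who studies hard passes the exam.",
    "fol": "forall x. (Student(x) & StudiesHard(x)) -> PassesExam(x)",
}

def translate_nl_to_fol(sentence: str) -> str:
    """Placeholder for a fine-tuned translator (e.g., a LLaMA model after SFT + RLHF)."""
    prompt = f"Translate the sentence into a first-order logic formula.\nSentence: {sentence}\nFOL:"
    # return llm_generate(prompt)      # swap in a real model call
    return example["fol"] if sentence == example["nl"] else "<formula>"

print(translate_nl_to_fol(example["nl"]))
```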
Temporal data such as videos and driving logs are central to tasks like video understanding and autonomous driving.
We develop a reasoning framework that detects inductive patterns in temporal data via a neural-logic methodology. The framework assists the training of modern ML models by inducing patterns that enable accurate grounding with less data. For example, consider the kind of rule the framework might induce:
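Below is a hypothetical induced temporal rule and a toy check of it against a driving log; the rule, the event names, and the `rule_holds` helper are all illustrative assumptions, not outputs of the framework.

```python
# Hypothetical induced rule:
#   "if the lead vehicle brakes at time t, the ego vehicle decelerates within 2 steps"
#   i.e.  Brake(lead, t) -> exists dt in [1, 2]. Decelerate(ego, t + dt)

def rule_holds(events, window=2):
    """Check the induced pattern against a log of (time, actor, action) events."""
    decel_times = {t for t, actor, act in events if actor == "ego" and act == "decelerate"}
    for t, actor, act in events:
        if actor == "lead" and act == "brake":
            if not any(t + dt in decel_times for dt in range(1, window + 1)):
                return False
    return True

log = [(0, "lead", "brake"), (1, "ego", "decelerate"),
       (5, "lead", "brake"), (6, "ego", "decelerate")]
print(rule_holds(log))   # True: the pattern is grounded in this log
```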
Modern ML models achieve impressive performance on many tasks. However, they require large amounts of labeled data to train, and their outputs are not self-explanatory to human users.
We study this problem for graph reasoning models. We propose a learning-by-asking framework, LogicQA, that trains the model by interactively asking an oracle questions. Under the hood, verified questions are used to label the data automatically, leading to an order-of-magnitude improvement in data efficiency.
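A toy sketch of the learning-by-asking loop; `propose_question`, the parity-based oracle, and the label-propagation step are stand-ins for illustration, not the LogicQA implementation:

```python
# Illustrative learning-by-asking loop.
import random

def propose_question(pool):
    """Model proposes the question it is least certain about (random stand-in here)."""
    return random.choice(pool)

def oracle_answer(question):
    """Human or rule-based oracle verifies the question (toy: parity check)."""
    return question % 2 == 0

def learning_by_asking(unlabeled, budget=3):
    labeled = {}
    for _ in range(budget):
        pool = [x for x in unlabeled if x not in labeled]
        if not pool:                      # everything already labeled via propagation
            break
        q = propose_question(pool)
        labeled[q] = oracle_answer(q)
        # Verified answers propagate to every item sharing the queried pattern,
        # which is where the data-efficiency gain comes from.
        for x in unlabeled:
            if x % 2 == q % 2:
                labeled[x] = labeled[q]
    return labeled

print(learning_by_asking(list(range(10)), budget=2))
```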
Deep vision models are successfully employed in many applications, but they are vulnerable to adversarial examples. Existing defense methods are either limited to specific attack types or too complex for practical models.
To this end, we propose logic adversarial defense, a framework that utilizes the scene graph of the image to detect object labels that are out of place in their context. Our framework is model-agnostic and effective against localized attacks such as adversarial patches. Moreover, it produces human-readable explanations of why the system is under attack.
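A toy sketch of the context check; the scene-graph triples, the `plausible` rule set, and the function name are illustrative assumptions, not the framework's actual rules:

```python
# Toy scene-graph consistency check: flag object labels that no induced context rule supports.

# Scene graph: (subject, relation, object) triples from an off-the-shelf generator.
scene = [("person", "rides", "bicycle"), ("toaster", "on", "road")]

# Context rules induced from clean scene graphs: which triples are plausible.
plausible = {("person", "rides", "bicycle"), ("car", "on", "road"), ("bicycle", "on", "road")}

def detect_out_of_context(scene_graph, rules):
    """Return triples that no induced rule supports; likely adversarial labels."""
    return [t for t in scene_graph if t not in rules]

for subj, rel, obj in detect_out_of_context(scene, plausible):
    # Human-readable explanation of why the prediction is rejected.
    print(f"'{subj}' is suspicious: no clean scene supports ({subj}, {rel}, {obj})")
```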
Deductive and inductive reasoning are critical tasks for many knowledge graph applications. The former infers new facts from existing knowledge; the latter summarizes (induces) knowledge from existing facts. We proposed GNN- and logic-based models to address these tasks, respectively.
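A tiny illustration of the two directions; the facts and the grandparent rule are textbook placeholders, not from our models:

```python
# Deduction vs. induction on a toy knowledge graph.
facts = {("alice", "parent", "bob"), ("bob", "parent", "carol")}

# Deduction: apply a known rule  parent(x, y) & parent(y, z) -> grandparent(x, z)
deduced = {(x, "grandparent", z)
           for (x, r1, y1) in facts for (y2, r2, z) in facts
           if r1 == r2 == "parent" and y1 == y2}
print(deduced)            # {('alice', 'grandparent', 'carol')}

# Induction: recover the rule body that explains observed grandparent facts,
# here by checking whether a two-hop "parent -> parent" chain covers them.
observed = {("alice", "grandparent", "carol")}
explains = all(
    any((x, "parent", y) in facts and (y, "parent", z) in facts
        for (_, _, y) in facts)      # try each entity as the intermediate node
    for (x, _, z) in observed
)
print(explains)           # True: the chain parent -> parent induces grandparent
```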
* equal contribution
Y. Yang, S. Xiong, E. Shareghi and F. Fekri. The Compressor-Retriever Architecture for Language Model OS, arXiv, 2024.
Y. Yang, S. Xiong, A. Payani, E. Shareghi and F. Fekri. Can LLMs Reason in the Wild?, arXiv, 2024.
Y. Yang, S. Xiong, A. Payani, E. Shareghi and F. Fekri. Harnessing the Power of Large Language Models for Natural Language to First-Order Logic Translation. ACL, 2024.
Y. Yang, S. Xiong, F. Fekri, J. C. Kerce, and A. Payani. LogicDP: Creating Labels for Graph Data via Inductive Logic Programming, 11th International Conference on Learning Representations (ICLR 2023).
Y. Yang, J. Clayton, and F. Fekri. LogicDef: An Interpretable Defense Framework Against Adversarial Examples via Inductive Scene Graph Reasoning. Thirty-Sixth AAAI Conference on Artificial Intelligence (AAAI 2022), oral presentation.
Y. Yang, and S. T. Piantadosi. One model for the learning of language. Proceedings of the National Academy of Sciences (PNAS), 119(5), 2022.
Y. Yang, and L. Song. Learn to Explain Efficiently via Neural Logic Inductive Learning, 8th International Conference on Learning Representations (ICLR 2020).
Y. Zhang*, X. Chen*, Y. Yang*, A. Ramamurthy, B. Li, Y. Qi, and L. Song. Efficient Probabilistic Logic Reasoning with Graph Neural Networks. 8th International Conference on Learning Representations (ICLR 2020).
X. Si*, Y. Yang*, H. Dai, M. Naik, and L. Song. Learning a Meta-Solver for Syntax-Guided Program Synthesis. 7th International Conference on Learning Representations (ICLR 2019).
Y. Yang, P. Xie, X. Gao, C. Cheng, C. Li, H. Zhang and E. Xing. Predicting Discharge Medications at Admission Time Based on Deep Learning. arXiv preprint, 2017.
Y. Yang, J. Yu, Y. Hu, X. Xu and E. Nyberg. A Consumer Health Question Answering System. Text Retrieval Conference 2017 LiveQA Medical Track (TREC 2017).
Y. Yang, J. Chen and J. Zhu. Distributing the Stochastic Gradient Sampler for Large-Scale LDA. 22nd Conference on Knowledge Discovery and Data Mining (KDD 2016).