no code implementations • Findings (EMNLP) 2021 • Kezhen Chen, Qiuyuan Huang, Daniel McDuff, Xiang Gao, Hamid Palangi, JianFeng Wang, Kenneth Forbus, Jianfeng Gao
Based on these annotations, we define two different tasks for the NICE dataset.
1 code implementation • 13 Feb 2024 • Chongyang Gao, Kezhen Chen, Jinmeng Rao, Baochen Sun, Ruibo Liu, Daiyi Peng, Yawen Zhang, Xiaoyuan Guo, Jie Yang, VS Subrahmanian
In this paper, we introduce a novel parameter-efficient MoE method, \textit{\textbf{M}oE-L\textbf{o}RA with \textbf{L}ayer-wise Expert \textbf{A}llocation (MoLA)} for Transformer-based models, where each model layer has the flexibility to employ a varying number of LoRA experts.
no code implementations • 7 Sep 2023 • Jiaying Lu, Jinmeng Rao, Kezhen Chen, Xiaoyuan Guo, Yawen Zhang, Baochen Sun, Carl Yang, Jie Yang
Large Vision-Language Models (LVLMs) offer remarkable benefits for a variety of vision-language tasks.
no code implementations • 19 Aug 2023 • Diji Yang, Kezhen Chen, Jinmeng Rao, Xiaoyuan Guo, Yawen Zhang, Jie Yang, Yi Zhang
Visual language tasks require AI models to comprehend and reason with both visual and textual content.
no code implementations • 31 May 2023 • Xiaoyuan Guo, Kezhen Chen, Jinmeng Rao, Yawen Zhang, Baochen Sun, Jie Yang
To train LOWA, we propose a hybrid vision-language training strategy to learn object detection and recognition with class names as well as attribute information.
2 code implementations • ICML 2020 • Kezhen Chen, Qiuyuan Huang, Hamid Palangi, Paul Smolensky, Kenneth D. Forbus, Jianfeng Gao
The encoder of TP-N2F employs TPR `binding' to encode natural-language symbolic structure in vector space and the decoder uses TPR `unbinding' to generate, in symbolic space, a sequential program represented by relational tuples, each consisting of a relation (or operation) and a number of arguments.
no code implementations • 25 Sep 2019 • Kezhen Chen, Qiuyuan Huang, Hamid Palangi, Paul Smolensky, Kenneth D. Forbus, Jianfeng Gao
Generating formal-language represented by relational tuples, such as Lisp programs or mathematical expressions, from a natural-language input is an extremely challenging task because it requires to explicitly capture discrete symbolic structural information from the input to generate the output.
no code implementations • 19 Aug 2015 • Kuan-Ting Chen, Kezhen Chen, Peizhong Cong, Winston H. Hsu, Jiebo Luo
To answer this question, we design a novel system that consists of three major components: (1) constructing a large dataset from the New York Fashion Shows and New York street chic in order to understand the likely clothing fashion trends in New York, (2) utilizing a learning-based approach to discover fashion attributes as the representative characteristics of fashion trends, and (3) comparing the analysis results from the New York Fashion Shows and street-chic images to verify whether the fashion shows have actual influence on the people in New York City.