Search Results for author: Jingcheng Hu

Found 3 papers, 2 papers with code

Common 7B Language Models Already Possess Strong Math Capabilities

no code implementations • 7 Mar 2024 • Chen Li, Weiqi Wang, Jingcheng Hu, Yixuan Wei, Nanning Zheng, Han Hu, Zheng Zhang, Houwen Peng

This paper shows that the LLaMA-2 7B model with common pre-training already exhibits strong mathematical abilities, as evidenced by its impressive accuracy of 97.7% and 72.0% on the GSM8K and MATH benchmarks, respectively, when selecting the best response from 256 random generations.

GSM8K Math


FP8-LM: Training FP8 Large Language Models

1 code implementation • 27 Oct 2023 • Houwen Peng, Kan Wu, Yixuan Wei, Guoshuai Zhao, Yuxiang Yang, Ze Liu, Yifan Xiong, Ziyue Yang, Bolin Ni, Jingcheng Hu, Ruihang Li, Miaosen Zhang, Chen Li, Jia Ning, Ruizhe Wang, Zheng Zhang, Shuguang Liu, Joe Chau, Han Hu, Peng Cheng

In this paper, we explore FP8 low-bit data formats for efficient training of large language models (LLMs).

463


Revealing the Dark Secrets of Masked Image Modeling

1 code implementation • CVPR 2023 • Zhenda Xie, Zigang Geng, Jingcheng Hu, Zheng Zhang, Han Hu, Yue Cao

In this paper, we compare MIM with the long-dominant supervised pre-trained models from two perspectives, the visualizations and the experiments, to uncover their key representational differences.

Ranked #3 on Depth Estimation on NYU-Depth V2

Inductive Bias Monocular Depth Estimation +3

154

