MedConceptsQA: Open Source Medical Concepts QA Benchmark

12 May 2024  ·  Ofir Ben Shoham, Nadav Rappoport

We present MedConceptsQA, a dedicated open-source benchmark for medical concepts question answering. The benchmark comprises questions about medical concepts drawn from different vocabularies: diagnoses, procedures, and drugs. The questions are categorized into three levels of difficulty: easy, medium, and hard. We evaluated various Large Language Models on the benchmark. Our findings show that pre-trained clinical Large Language Models achieved accuracy levels close to random guessing, despite being pre-trained on medical data. GPT-4, in contrast, achieves an absolute average improvement over the clinical Large Language Models of nearly 27% in zero-shot learning and nearly 37% in few-shot learning. Our benchmark serves as a valuable resource for evaluating how well Large Language Models understand and reason about medical concepts. The benchmark is available at https://huggingface.co/datasets/ofir408/MedConceptsQA
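Since the benchmark is hosted on the Hugging Face Hub, it can be loaded with the `datasets` library. The sketch below is illustrative only: the configuration name ("all"), the split name ("test"), and the field names printed are assumptions and may differ from the actual dataset card, which should be consulted for the configurations per vocabulary and difficulty level.

```python
from datasets import load_dataset

# Minimal sketch of loading MedConceptsQA from the Hugging Face Hub.
# The config name "all" and the split "test" are assumptions; the dataset
# card lists the actual configurations (e.g. per vocabulary and difficulty).
ds = load_dataset("ofir408/MedConceptsQA", "all", split="test")

# Inspect one multiple-choice question and its labeled answer.
example = ds[0]
print(example)
```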


Datasets


Introduced in the Paper:

MedConceptsQA

Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|------|---------|-------|-------------|--------------|-------------|
| Few-Shot Learning | MedConceptsQA | johnsnowlabs/JSL-MedMNX-7B | Accuracy | 25.627 | #4 |
