CommitPack

Introduced by Muennighoff et al. in OctoPack: Instruction Tuning Code Large Language Models

CommitPack is a 4TB dataset of commits scraped from permissively licensed GitHub repositories.

License


  • Unknown
