1 code implementation • 7 Nov 2023 • Muntabir Hasan Choudhury, Lamia Salsabil, William A. Ingram, Edward A. Fox, Jian Wu
To overcome the challenge of imbalanced labeled samples, we augmented data for minority categories and employed a hierarchical classifier.
1 code implementation • 30 Mar 2023 • Muntabir Hasan Choudhury, Lamia Salsabil, Himarsha R. Jayanetti, Jian Wu, William A. Ingram, Edward A. Fox
Metadata quality is crucial for digital objects to be discovered through digital library interfaces.
2 code implementations • 1 Jul 2021 • Muntabir Hasan Choudhury, Himarsha R. Jayanetti, Jian Wu, William A. Ingram, Edward A. Fox
Our experiments show that a CRF model with visual features outperformed both a heuristic method and a CRF model with only text-based features.
Ranked #1 on Key Information Extraction on ETD500
Key Information Extraction • Optical Character Recognition (OCR)
1 code implementation • 23 Jun 2021 • Sampanna Yashwant Kahu, William A. Ingram, Edward A. Fox, Jian Wu
To the best of our knowledge, ScanBank is the first manually annotated dataset for figure and table extraction for scanned ETDs.