OCR model for the Garo language achieving 93.13% character accuracy.
GaroOCR is an Optical Character Recognition model for Garo (grt_Latn), a Tibeto-Burman language spoken by over one million people in Meghalaya, Northeast India. It is built on Florence-2-base-ft and fine-tuned on 80,000 deduplicated image-text pairs that combine synthetic renders with curated samples across printed and handwritten font styles, and it achieves 93.13% character accuracy. MWire Labs developed the model as part of the Northeast India OCR initiative, which aims to bring document digitization capabilities to underrepresented indigenous languages of the region.
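The listing does not say how character accuracy is computed. A common convention, assumed here, is one minus the character error rate: the total edit distance between predictions and reference transcriptions, divided by the total number of reference characters. The sketch below illustrates that convention only; the function names `levenshtein` and `character_accuracy` are hypothetical and are not taken from the GaroOCR evaluation code.

```python
from typing import Sequence


def levenshtein(ref: str, hyp: str) -> int:
    """Edit distance (insertions, deletions, substitutions) between two strings."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # delete a reference character
                curr[j - 1] + 1,     # insert a hypothesis character
                prev[j - 1] + cost,  # substitute, or match at zero cost
            ))
        prev = curr
    return prev[-1]


def character_accuracy(references: Sequence[str], hypotheses: Sequence[str]) -> float:
    """Assumed definition: 1 - (total edit distance / total reference characters)."""
    if len(references) != len(hypotheses):
        raise ValueError("references and hypotheses must have the same length")
    total_chars = sum(len(r) for r in references)
    if total_chars == 0:
        raise ValueError("references contain no characters")
    total_errors = sum(levenshtein(r, h) for r, h in zip(references, hypotheses))
    # Clamp at zero: very long wrong outputs can push the error rate above 1.
    return max(0.0, 1.0 - total_errors / total_chars)
```

Under this definition, a single wrong character in a four-character reference gives an accuracy of 0.75. A reported figure such as 93.13% would then correspond to roughly seven character errors per hundred reference characters, provided the same convention was used.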
Attribution 4.0 International (CC BY 4.0)
MWirelabs
Transformers
Restricted
Sector Agnostic
25/02/26 12:53:58