Looking for translations, or for help with translations and transliterations? This is the place.
Leo Rivers
Joined: Sun Jul 17, 2011 4:52 am

OCR documents that contain transliteration of Sanskrit

Post by Leo Rivers » Thu Apr 26, 2012 2:47 pm


I am trying to OCR some documents that contain transliteration of Sanskrit, meaning that they are Latin characters with a set of about 10 diacritic marks above and below them, which are not standard.

In order to make my OCR software (ReadIris Pro) learn the right characters, I need advice.

First, do I have the right product for what I need to do? Here is a sample of the kind of text involved:
"The following brief statement regarding Vasubandhu’s religious view is limited to information obtained from his Sūtra-commentaries translated into Chinese by Bodhiruci during the first half of the sixth century A.D. These include T.1519 (SPU), T.1522 (Daśabhūmika-sūtra-śāstra) & T.1524 (SU), T.1525 (Gayāśirṣa-sūtra-tīkā), T.1526 (Ratnācuḍaparipṛcchā-sūtra-catardharma-upadeśa), T.1532 (Viśeṣacintiparipṛcchā-sūtra-upadeśa) and T.1533 (Dharma-cakrapravartana-sūtra-upadeśa)."
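
As a rough way to see which accented letters a passage like the one above actually uses (and therefore what the OCR engine would have to be taught), here is a minimal Python sketch. It is not tied to ReadIris in any way. The function names `diacritic_inventory` and `describe` are made up for illustration, and the sample string is just two titles taken from the quoted passage.

```python
import unicodedata
from collections import Counter


def diacritic_inventory(text):
    """Count every non-ASCII letter in the text (after NFC composition)."""
    counts = Counter()
    for ch in unicodedata.normalize("NFC", text):
        # Keep only letters outside plain ASCII; punctuation and
        # stray combining marks are ignored.
        if ord(ch) > 127 and unicodedata.category(ch).startswith("L"):
            counts[ch] += 1
    return counts


def describe(ch):
    """Split one letter into its base letter and the names of its marks."""
    decomposed = unicodedata.normalize("NFD", ch)
    base = decomposed[0]
    marks = [unicodedata.name(m) for m in decomposed[1:]]
    return base, marks


# Two titles from the quoted passage, used purely as test input.
sample = "Daśabhūmika-sūtra-śāstra, Viśeṣacintiparipṛcchā"

inventory = diacritic_inventory(sample)
for ch, n in sorted(inventory.items()):
    base, marks = describe(ch)
    print(ch, n, base, marks)
```

Running this over a hand-corrected page would give a checklist of the letters (and the marks above or below them) to look for when proofreading the OCR output or training the recognizer.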

kirtu
Joined: Mon Jan 18, 2010 5:29 pm
Location: Baltimore, MD

Re: OCR documents that contain transliteration of Sanskrit

Post by kirtu » Thu Apr 26, 2012 4:33 pm

Well, when you scan the document, does it scan correctly?

Joined: Thu Aug 23, 2012 11:16 am
Location: Perth

Re: OCR documents that contain transliteration of Sanskrit

Post by Kaji » Fri Aug 24, 2012 12:41 pm

Would installing the Sanserif Pali font (http://www.dharanipitaka.net/2011/2008/ ... nspali.ttf) help?
Namas triya-dhvikānāṃ sarva tathāgatānām!

Joined: Tue Jul 03, 2012 2:39 am
Location: USA

Re: OCR documents that contain transliteration of Sanskrit

Post by viniketa » Fri Aug 24, 2012 1:08 pm

Leo - I'm not familiar with Readiris; I typically use ABBYY FineReader (and I use a PC). However, you likely need a 'language pack' for Readiris that includes all Extended Latin characters. The English pack does not; the French, Spanish, and German packs do.

Hope this helps.

