OCR documents that contain transliteration of Sanskrit

Leo Rivers
Posts: 278
Joined: Sun Jul 17, 2011 4:52 am


Post by Leo Rivers » Thu Apr 26, 2012 2:47 pm


I am trying to OCR some documents that contain Sanskrit transliteration: Latin characters with a set of about 10 non-standard diacritic marks above and below them.

I need advice on getting my OCR software (Readiris Pro) to learn the right characters.

First, do I have the right product for what I need to do?
Readiris 12.0 (build f3) - i64
Model Name:    Mac Pro
  Model Identifier:    MacPro4,1
  Processor Name:    Quad-Core Intel Xeon
  Processor Speed:    2.66 GHz
System Version:    Mac OS X 10.6.8 (10K549)
  Kernel Version:    Darwin 10.8.0
  Boot Volume:    Macintosh HD

  Version:    12.0
  Last Modified:    6/28/10 8:01 AM
  Kind:    Universal
  64-Bit (Intel):    Yes
  Get Info String:    Readiris 12.0.5 (build 1ef) Copyright 1987-2009, I.R.I.S.

  Location:    /Applications/Readiris Pro 12/Readiris.app


"The following brief statement regarding Vasubandhu’s religious view is limited to information obtained from his Sūtra-commentaries translated into Chinese by Bodhiruai during the first half of the sixth century A.D. These include T.1519 (SPU), T.1522 (Daśabhūmika-sūtra-śāstra) & T.1524 (SU), T.1525 (Gayāśirṣa-sūtra-tīkā), T.1526 (Ratnācuḍaparipṛcchā-sūtra-catardharma-upadeśa), T.1532 (Viśeṣacintiparipṛcchā-sūtra-upadeśa) and T.1533 (Dharma-cakrapravartana-sūtra-upadeśa)."
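
Before training an OCR engine, it can help to know exactly which accented letters a document uses. The following is a minimal Python sketch, not a feature of Readiris: it lists the distinct non-ASCII letters in a passage with their Unicode code points and names. The function name `diacritic_inventory` is made up for illustration, and the sample string is just a few words taken from the quotation above.

```python
import unicodedata


def diacritic_inventory(text):
    """Return (char, Unicode name) pairs for each distinct non-ASCII letter."""
    # NFC merges a base letter and a combining mark into one code point
    # wherever a precomposed form exists, so each letter is counted once.
    text = unicodedata.normalize("NFC", text)
    chars = sorted({ch for ch in text if ord(ch) > 127 and ch.isalpha()})
    return [(ch, unicodedata.name(ch, "UNKNOWN")) for ch in chars]


# A few words from the quoted passage (illustrative sample only).
sample = "Daśabhūmika-sūtra-śāstra, Gayāśirṣa-sūtra-tīkā, Viśeṣacintiparipṛcchā"

for ch, name in diacritic_inventory(sample):
    print(f"{ch}  U+{ord(ch):04X}  {name}")
```

Running this on a larger stretch of correctly typed text gives the full list of characters the OCR software would have to be taught.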

Former staff member
Posts: 5757
Joined: Mon Jan 18, 2010 5:29 pm
Location: Baltimore, MD


Post by kirtu » Thu Apr 26, 2012 4:33 pm

Well, when you scan the document, does it scan correctly?

Kirt's Tibetan Translation Notes

"Even if you practice only for an hour a day with faith and inspiration, good qualities will steadily increase. Regular practice makes it easy to transform your mind. From seeing only relative truth, you will eventually reach a profound certainty in the meaning of absolute truth."
Kyabje Dilgo Khyentse Rinpoche.

"Only you can make your mind beautiful."
HH Chetsang Rinpoche

Posts: 242
Joined: Thu Aug 23, 2012 11:16 am
Location: Perth


Post by Kaji » Fri Aug 24, 2012 12:41 pm

Would installing the Sanserif Pali font (http://www.dharanipitaka.net/2011/2008/ ... nspali.ttf) help?
Namas triya-dhvikānāṃ sarva tathāgatānām!

Posts: 820
Joined: Tue Jul 03, 2012 2:39 am
Location: USA


Post by viniketa » Fri Aug 24, 2012 1:08 pm

Leo - I'm not familiar with Readiris; I typically use ABBYY FineReader (and I use a PC). However, you likely need a 'language pack' for Readiris that includes all Extended Latin characters. The English pack does not; the French, Spanish, and German packs do.

Hope this helps.
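
One way to judge whether a language pack is likely to cover these characters is to check which Unicode blocks they fall in. The sketch below is an illustration in Python with hypothetical helper names (`block_of`, `blocks_needed`). The block ranges come from the Unicode standard, but which blocks a given Readiris or FineReader language pack actually recognizes is an assumption to verify against the vendor's documentation.

```python
import unicodedata

# Unicode blocks that hold Latin letters (ranges per the Unicode standard).
BLOCKS = [
    (0x0000, 0x007F, "Basic Latin"),
    (0x0080, 0x00FF, "Latin-1 Supplement"),
    (0x0100, 0x017F, "Latin Extended-A"),
    (0x0180, 0x024F, "Latin Extended-B"),
    (0x1E00, 0x1EFF, "Latin Extended Additional"),
]


def block_of(ch):
    """Return the name of the Latin block containing ch, or 'other'."""
    cp = ord(ch)
    for lo, hi, name in BLOCKS:
        if lo <= cp <= hi:
            return name
    return "other"


def blocks_needed(text):
    """Map each block name to the set of letters from text found in it."""
    text = unicodedata.normalize("NFC", text)
    needed = {}
    for ch in text:
        if ch.isalpha():
            needed.setdefault(block_of(ch), set()).add(ch)
    return needed


# Words from Leo's sample passage (illustrative only).
words = "Daśabhūmika sūtra śāstra Gayāśirṣa tīkā Ratnācuḍaparipṛcchā"

for block, letters in sorted(blocks_needed(words).items()):
    if block != "Basic Latin":
        print(block, "".join(sorted(letters)))
```

On these sample words, the accented letters land in Latin Extended-A and Latin Extended Additional, so those are the ranges to look for in a language pack or font.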

If they can sever like and dislike, along with greed, anger, and delusion, regardless of their difference in nature, they will all accomplish the Buddha Path. ~ Sutra of Complete Enlightenment
