Wednesday, June 17, 2009

A Dive into a Japanese LVCSR Engine

In my previous post, I mentioned switching from Sphinx to Julius. I had no idea what was in store for me. Googling for 'Speech Recognizer' gives you CMU Sphinx in 8 out of 10 results. I found Julius through a wiki link. All I wanted was an open source speech recognizer, and Julius seemed like the best available option to dig into and see whether I could make use of it.

I must say using Julius was not easy, but the end results I got from it were great.
Let me highlight some initial problems you will face when you go with Julius.
  • Julius is an open source speech recognition engine from Japan. It has been developed as a Japanese LVCSR (large vocabulary continuous speech recognition) engine since 1997. The project has home pages in both English and Japanese.
  • The site has user documentation that was originally written in Japanese. An English version is still under development. But do not worry, Google Translator comes to our rescue. Here is the translated English version of the Julius Book.
  • Being a Japanese recognizer, it came with an acoustic model only for the Japanese language. But there are good people all over the web: the VoxForge project is working on creating an open source acoustic model for the English language.
  • If you go to the Julius home site, you might get lost after downloading the source code or binaries and reading bits and pieces of info. I suggest you start by downloading the Julius Quick Start from VoxForge. It is based on version 3.5.2 of Julius, but porting to the latest version is as easy as copying the acoustic model and grammar files.
  • The Julius forum was also a painful experience for me. It has both English and Japanese forum topics, so again use Google Translator. I don't think the questions asked in the Japanese forum are reflected in the English forum.

The points above will definitely get you started with Julius, especially the Quick Start from VoxForge. Check out the VoxForge forums too. The information there is useful, but meagre for a novice.