Speech Recognition Woes: Sphinx4 Configuration file : config.xml

There are three primary modules in the Sphinx-4 framework:

The FrontEnd
The Decoder
The Linguist

Sphinx4 is highly modular: every block can be configured separately in a configuration file. In this file we specify the front end that Sphinx4 will use, the acoustic model and dictionary used to build the search graph for recognition, and the language model (grammar), which steers the recognizer toward the most likely words. The Decoder uses the output of the FrontEnd together with the SearchGraph produced by the Linguist to generate the recognition Result.
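As a rough illustration of how these blocks point at one another, the skeleton below is modeled on the HelloWorld demo configuration that ships with Sphinx4. The component names (recognizer, decoder, searchManager, flatLinguist, threadedScorer, epFrontEnd) are only labels, the class names may differ between Sphinx4 releases, and most properties are left out, so treat it as a sketch and not a drop-in file.

```xml
<config>
    <!-- Entry point: the recognizer delegates to the decoder -->
    <component name="recognizer" type="edu.cmu.sphinx.recognizer.Recognizer">
        <property name="decoder" value="decoder"/>
    </component>

    <!-- Decoder: searches the graph using scores computed from front-end features -->
    <component name="decoder" type="edu.cmu.sphinx.decoder.Decoder">
        <property name="searchManager" value="searchManager"/>
    </component>

    <component name="searchManager"
               type="edu.cmu.sphinx.decoder.search.SimpleBreadthFirstSearchManager">
        <property name="linguist" value="flatLinguist"/>
        <property name="scorer" value="threadedScorer"/>
        <!-- pruner, active list and logMath properties omitted -->
    </component>

    <!-- The scorer is where the decoder side meets the front end -->
    <component name="threadedScorer"
               type="edu.cmu.sphinx.decoder.scorer.ThreadedAcousticScorer">
        <property name="frontend" value="epFrontEnd"/>
    </component>

    <!-- Linguist: builds the search graph from grammar, dictionary and acoustic model -->
    <component name="flatLinguist" type="edu.cmu.sphinx.linguist.flat.FlatLinguist">
        <!-- "jsgfGrammar" and "wsj" must be defined as components elsewhere in the file -->
        <property name="grammar" value="jsgfGrammar"/>
        <property name="acousticModel" value="wsj"/>
    </component>

    <!-- FrontEnd: a pipeline of signal-processing stages, each its own component -->
    <component name="epFrontEnd" type="edu.cmu.sphinx.frontend.FrontEnd">
        <propertylist name="pipeline">
            <item>microphone</item>
            <!-- remaining stages (windowing, FFT, filter bank, ...) omitted -->
        </propertylist>
    </component>
</config>
```

Each value that names another component (for example value="decoder") is resolved by the configuration manager at load time, which is what lets the blocks be swapped independently.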

Let us now walk through a sample configuration file: config.xml (download here)
Every config file is logically divided into sections. You can find the syntax and rules for creating a configuration file on the Sphinx Configuration Management site.

Frequently Used Properties holds the properties that other sections refer to.
In Language Model we specify the grammar that Sphinx uses to match the speech. Sphinx4 has pluggable language model support for ASCII and binary versions of unigram, bigram, and trigram models, Java Speech API Grammar Format (JSGF) grammars, and ARPA-format FST grammars.
Dictionary can be the Wall Street Journal (WSJ) dictionary, the TIDIGITS dictionary, or your own dictionary in the standard CMU dictionary format (words spelled out in ARPAbet phoneme symbols). The WSJ and TIDIGITS dictionaries ship with the Sphinx4 binaries. A dictionary lists words together with the phonemes that make up their pronunciation.
Next, define the Acoustic Model to match the dictionary you use. Sphinx4 again includes acoustic models for WSJ and TIDIGITS.
In Front End we specify whether the input comes from a microphone or from another data source (wav, au, and similar formats).
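The sections above could look roughly like the fragment below. It is a hedged sketch based on the HelloWorld demo configuration: the grammar location, grammar name and dictionary paths are hypothetical placeholders, class names and packages may vary between Sphinx4 releases, and the acoustic model component is left out because its definition depends on the model package you use.

```xml
<config>
    <!-- Frequently used properties: global values that components reference as ${name} -->
    <property name="logLevel" value="WARNING"/>
    <property name="languageWeight" value="8"/>
    <property name="frontend" value="epFrontEnd"/>

    <!-- Language model: here a JSGF grammar (location and name are placeholders) -->
    <component name="jsgfGrammar" type="edu.cmu.sphinx.jsgf.JSGFGrammar">
        <property name="dictionary" value="dictionary"/>
        <property name="grammarLocation" value="resource:/my/app/grammars/"/>
        <property name="grammarName" value="hello"/>
    </component>

    <!-- Dictionary: words and their pronunciations (paths are placeholders) -->
    <component name="dictionary"
               type="edu.cmu.sphinx.linguist.dictionary.FastDictionary">
        <property name="dictionaryPath" value="resource:/path/to/model/dict/cmudict.0.6d"/>
        <property name="fillerPath" value="resource:/path/to/model/noisedict"/>
    </component>

    <!-- Front end input: list ONE of these first in the front-end pipeline -->
    <component name="microphone" type="edu.cmu.sphinx.frontend.util.Microphone">
        <property name="closeBetweenUtterances" value="false"/>
    </component>
    <component name="streamDataSource"
               type="edu.cmu.sphinx.frontend.util.StreamDataSource"/>
</config>
```

A global property is used elsewhere in the file by writing, for instance, value="${languageWeight}" inside a component, so a value that several components share only needs to be changed in one place.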

These are the major sections in any Sphinx4 configuration file. I will discuss a few of these sections in subsequent articles.


Sunday, April 26, 2009
