Wolf Paulus built a language translator on a Raspberry Pi using the on-device, open source sphinxbase / pocketsphinx speech recognition toolkit. It worked well, but its accuracy and vocabulary were limited, so Wolf needed another service to take the project a step further.
He turned to Google’s Speech Recognition Service, which requires the voice recording to be encoded as a FLAC (Free Lossless Audio Codec) file. Accessing the service also requires an API key, which is available through the Google Developers Console. After following the directions for Chromium developers, Wolf successfully created his ‘Pi Translator’.
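The FLAC encoding step could be handled by the reference `flac` command-line encoder. The snippet below is only a rough sketch under that assumption: the post does not say which encoder Wolf used, and the helper name `flac_command` and the file names are hypothetical.

```python
import subprocess
from pathlib import Path


def flac_command(wav_path: str, flac_path: str) -> list[str]:
    """Build the argument list for the reference `flac` encoder.

    -f overwrites an existing output file, -s silences progress output,
    -o names the output file. Assumes `flac` is installed on the Pi.
    """
    return ["flac", "-f", "-s", "-o", flac_path, wav_path]


def encode_to_flac(wav_path: str) -> str:
    """Encode a WAV recording to FLAC and return the new file's path."""
    flac_path = str(Path(wav_path).with_suffix(".flac"))
    subprocess.run(flac_command(wav_path, flac_path), check=True)
    return flac_path
```

Calling `encode_to_flac("voice.wav")` would write `voice.flac` next to the recording, ready to be uploaded for transcription.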
For text translation, Wolf used Microsoft’s translation service. He then set up the following process for translating speech. First, the maker records their voice. The recording is encoded as FLAC and sent to Google for transcription. Google’s Speech Synthesizer then speaks the recognized utterance back. Next, Microsoft’s translation service translates the transcription into the target language. Finally, Google’s Speech Synthesizer speaks the translation in the target language.
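The process above can be sketched as a small pipeline. This is an illustration only, not Wolf's actual code: the class `TranslatorPipeline` and its five callables are hypothetical placeholders, and the real calls to the recorder, the FLAC encoder, and the Google and Microsoft services would have to be plugged in behind them.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class TranslatorPipeline:
    """Chains the steps of the translator; every callable is a placeholder."""
    record: Callable[[], bytes]                 # capture raw audio from the microphone
    encode_flac: Callable[[bytes], bytes]       # convert the recording to FLAC
    transcribe: Callable[[bytes, str], str]     # speech recognition (Google in the post)
    translate: Callable[[str, str, str], str]   # text translation (Microsoft in the post)
    speak: Callable[[str, str], None]           # speech synthesis (Google in the post)

    def run(self, source_lang: str, target_lang: str) -> Tuple[str, str]:
        audio = self.record()
        flac_audio = self.encode_flac(audio)
        heard = self.transcribe(flac_audio, source_lang)
        # Speak the recognized utterance back in the source language.
        self.speak(heard, source_lang)
        translated = self.translate(heard, source_lang, target_lang)
        # Speak the translation in the target language.
        self.speak(translated, target_lang)
        return heard, translated
```

Passing the services in as callables keeps the order of the steps separate from any particular provider, so each stage can be swapped or stubbed out for testing without a network connection.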
You can watch a demonstration of the audio language translator. It is extremely effective, more so than the speech recognition systems built into most smartphones.
Watch the video: https://vimeo.com/123657124