22.10.2008 14:56
Scott
Hi all,
Right, I have a couple of weird problems and hope that you can help.
OK, I am writing an application that takes in a wave file and extracts only the speech from it. At this stage I am not doing any speech-to-text, just generating a series of wave files of speech and non-speech.
If I run the speech recogniser method, once it has returned I can do a writetowavefile on the result and, hey presto, I get half of what I am after. The engine takes speech and continues processing until it reaches the next speech block after any non-speech (music, for example), so I end up with speech + non-speech in a single wav file. That is halfway there.
So I figured that if I use a babbletimeout to return an error when it reaches the non-speech part and then grab audio.audioposition.millisecond, this will tell me how far into the speech I am. Makes sense, yes?
Unfortunately, this is not the case and I am stuck as to why. The result is that the lovely splitting has been disrupted, so I get chops in the middle of a sentence even with high babbletimeouts, and the audio.audioposition properties are all ZERO.
Now, the weirder thing is that if I look at the audioposition in the speechdetected event, those values are all zero too.
Have any of you fantastic people out there got any clue why this might be?
(I am somewhat stuck without it.)
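For illustration only, this is roughly what the split described above would look like once a usable (non-zero) position is available. It is a minimal sketch in Python using the standard `wave` module, not the speech engine's API; `split_wav_at_ms` is a hypothetical helper name, and the position is assumed to be a total offset in milliseconds from the start of the file.

```python
import io
import wave


def split_wav_at_ms(wav_bytes, position_ms):
    """Split WAV data into (before, after) parts at position_ms milliseconds."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as src:
        params = src.getparams()
        total_frames = src.getnframes()
        # Convert the millisecond offset to a frame index, clamped to the file length.
        split_frame = min(max(position_ms, 0) * src.getframerate() // 1000, total_frames)
        head = src.readframes(split_frame)
        tail = src.readframes(total_frames - split_frame)

    def to_wav(frames):
        # Re-wrap raw frames in a WAV header with the same format as the source.
        buf = io.BytesIO()
        with wave.open(buf, "wb") as dst:
            dst.setparams(params)
            dst.writeframes(frames)
        return buf.getvalue()

    return to_wav(head), to_wav(tail)
```

With a position that is always zero, the first part comes out empty and the second part is the whole file, which is why the zero values block the approach.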
Finally, one more thing: is there any way of doing the speech detection stage without doing the speech recognition? I could speed up my recognition by skipping the parts I don't need to recognise (I need to run multiple recognitions with multiple cultures).
Thanks
Scott