Microsoft Speech SDK (en)
 

system.speech


 
22.10.2008 14:56 Scott

Hi all

Right - I have a couple of weird problems and hope that you can help.

OK - I am writing an application that takes in a wave file and extracts only
the speech from it. At this stage I am not doing any speech-to-text, literally
just generating a series of wave files of speech and non-speech.

If I run the speech recogniser, then once it has returned I can do a
writetowavefile on the result and, hey presto, I get half of what I am after.
The engine takes the speech and carries on processing through any non-speech
(music, for example) until it reaches the next speech block.
So I get speech + non-speech in a single wav file, which is halfway there.

So I figured that if I use a babbletimeout to return an error when it reaches
the non-speech part, and then grab the audio.audioposition.millisecond, this
will tell me how far into the speech I am. Makes sense, yes?

Unfortunately, this is not the case and I am stuck as to why. The result is
that the lovely splitting has been disrupted, so I get chops in the middle of
a sentence even with high babbletimeouts, and the audio.audioposition
properties are all ZERO.

Now, the weirder thing is that if I look in the speechdetected event and
check the audioposition reported in the event, that is always zero too.

Have any of you fantastic people out there got any clue why this might be?

(I am somewhat stuck without it)

Finally, one more thing - is there any way of doing the speech detection
stage without doing the speech recognition? I could speed up my recognition
by skipping what I don't need to recognise (I need to run multiple
recognitions with multiple cultures).

Thanks

Scott
 
 
 
 
 ©2007 TERASENS GmbH. All rights reserved. Copyright Notice