When voice recognition (VR) will work and when it won't...
It sounds so simple: get some voice recognition (or, more accurately, voice-to-text) software, get it set up, then play your audio recordings of dictation, interviews, meetings, focus groups, seminars, etc., into the microphone and let the computer do all the hard work of the actual transcription...
Ah, if only life were that simple.
Sadly, computers aren't at the stage where they can do all the typing for you. Yes, there are instances where they can be a big help, but we are still a long way from being able to hand the whole job over to the computer.
Dictation is probably the one area where voice recognition software will save you a lot of time, but you still have to dictate directly into the computer or, if you record the dictation in advance, make sure you speak all the punctuation, layout commands, etc.
If you have a modern smartphone running Android or Apple's iOS, you may already have played around with voice recognition through its virtual assistant ("Ok Google" for Android and "Siri" for Apple). However, you may not yet have tried dictating using your phone. Now, when I send texts or post messages to social media, most of the time I dictate what I want to say instead of tapping away at a virtual keyboard. The same technology can be used to dictate letters, reports, etc.
Admittedly, it takes some getting used to if you have never done dictation before (and not everyone is comfortable with it). Remembering to say "comma", "full stop" (or "period"), "new paragraph", and so on takes time. These could be inserted after you have dictated the text, but that only works for relatively short pieces. If you had a 10-page report and had to go back through it to put in all the punctuation, layout, etc., it would become time-consuming, and it may have been quicker to type it manually in the first place.
You also have to understand that the software won't be 100% accurate, and does take some training. Although you can speak at pretty much your normal speed, it does help to say the words clearly. And it may still get punctuation wrong, e.g. you want a comma inserted and instead the software puts the word "comma" in. But, aside from these problems, it can save a lot of time.
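For anyone comfortable with a little scripting, the "comma typed out as a word" problem can be partly tidied up after the event. The following is only a minimal Python sketch, not how any particular dictation product works: the `SPOKEN_COMMANDS` table and the `apply_spoken_commands` function are names made up for this illustration, and the list of commands is an assumption you would adjust to your own dictation habits.

```python
import re

# Hypothetical table of spoken commands and the text they stand for.
SPOKEN_COMMANDS = {
    "new paragraph": "\n\n",
    "question mark": "?",
    "full stop": ".",
    "period": ".",
    "comma": ",",
}

# Match any command as a whole word or phrase, together with the
# whitespace in front of it, so "Jane comma" becomes "Jane,".
_COMMAND_PATTERN = re.compile(
    r"\s*\b("
    + "|".join(re.escape(c) for c in sorted(SPOKEN_COMMANDS, key=len, reverse=True))
    + r")\b",
    flags=re.IGNORECASE,
)


def apply_spoken_commands(text: str) -> str:
    """Replace spoken punctuation words left in a transcript with symbols."""
    replaced = _COMMAND_PATTERN.sub(
        lambda m: SPOKEN_COMMANDS[m.group(1).lower()], text
    )
    # Remove the stray space left at the start of each new paragraph.
    replaced = re.sub(r"\n\n[ \t]+", "\n\n", replaced)
    return replaced.strip()


if __name__ == "__main__":
    print(apply_spoken_commands("hello Jane comma how are you question mark"))
```

Bear in mind that a blanket replacement like this cannot tell a command from an ordinary word, so a phrase such as "a period of six months" would be mangled, and it does nothing about capital letters at the start of sentences. You would still need to proofread the result.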
However, where voice recognition doesn't work is with recordings of interviews, meetings, focus groups, etc. Even the producer of the top-selling voice recognition software, Dragon NaturallySpeaking, admits its software isn't designed for multiple-participant recordings.
Voice recognition software isn't trained to recognise more than one voice for dictation and, even if it could, the resulting document would be very hard to read. There'd be no punctuation or formatting either, so a simple piece like:
John: Hello, Jane, how are you? I hear you won a big contract yesterday. Well done!
Jane: Hi, John. I am very well thank you. Yes, I did. I was surprised the contract came in to be honest, but I had been working on it for months...
If you played that into voice recognition software, and it could understand what was being said, it would come out as:
Hello Jane how are you I hear you won a big contract yesterday well done hi John I am very well thank you yes I did I was surprised the contract came in to be honest but I had been working on it for months...
This isn't much of a problem with a short exchange like the one above, but imagine if that interview had gone on for an hour. You'd have pages and pages made up of one long sentence, which would then need extensive work to get the formatting and punctuation correct. Now think how much worse it would be if it were a five-person focus group.
So, if you have recordings of interviews, focus groups, etc., you are still going to have to transcribe them manually for now. Or, better still, save your time (and tired fingers), and hire someone else to do the transcription work for you.