Some fields rely on transcription: captioning, for example, requires a transcript of the video, and the same need arises in secretarial, legal, and research work. Unfortunately, transcribing is extremely time-consuming and not the most interesting task in the world. You may therefore be wondering whether it can be automated. The answer is yes! There are more and more ways to produce a transcription automatically, and we explain them here.

How do I make an audio transcription automatically?

To transcribe audio, you can either do the work by hand or use an application that does it automatically.


These applications rely on speech recognition and voice input. Speech recognition automatically recognizes spoken words and converts them into text; the resulting text is the transcription.


You may need automated transcripts for a variety of reasons: subtitling a video, producing meeting minutes, keeping a record of a trial, and so on. Transcripts are useful in many fields, from marketing to legal and secretarial work.


There are two ways to make an automatic transcription: either dictate directly into an application, or record the audio beforehand and then run the recording through an application.

Which applications to use?


Making a transcription by hand requires a lot of work, since the audio has to be transcribed word for word. Fortunately, there are tools that automate this process. Here is a comparison of applications that make automatic transcriptions:


  1. Google Docs:

Google Docs offers a voice input feature that lets you dictate the words you want to transcribe directly into a document.


Advantages:

  • Instant transcription as you speak


Disadvantage:

  • Better suited to live note-taking than to transcribing a recording

  2. YouTube:

You can use YouTube to transcribe a video. To do this, host your video on the platform, then go to Creator Studio and create subtitles. You can use the automatic subtitling system, edit the subtitles, and copy the text to get your transcription.


Advantages:

  • Automatic subtitling
  • Convenient if you want to publish your video on YouTube
  • Subtitles are "closed captions" (separate tracks that can be switched on or off), which lets you attach several subtitle tracks, and therefore translations, to the same video


Disadvantages:

  • You must host your video on YouTube
  • Poor transcription quality
  • The subtitling tool is complex for first-time users
  • No way to style your subtitles


  3. Capté:


Capté is a tool made in France that simplifies the subtitling process. Easier to use than YouTube, it lets you make quality transcriptions of your video and audio content.

The automatic transcription process, step by step:

  1. Go to the Capté web application.
  2. Upload the video you want to subtitle; all your videos are easy to find in a gallery on your profile.
  3. Subtitles appear automatically thanks to artificial intelligence. A few mistakes will need correcting, but the result is a solid starting point.
  4. Translate automatically if needed; 5 languages are available: German, Spanish, English, Italian and Simplified Chinese.
  5. Once your text has been edited and, if needed, translated, simply copy and paste it into a text file: your transcription is ready.
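
Step 5 above copies the text by hand. If you export the subtitles as an SRT file instead, the same result can be obtained with a short script. The following is a minimal Python sketch, not a feature of Capté: the function name `srt_to_transcript` and the sample subtitles are made up for illustration, and it assumes the usual SRT layout (a cue number, a timing line, then the text, with a blank line between cues).

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:03,500".
TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def srt_to_transcript(srt_text: str) -> str:
    """Return the spoken text of an SRT document as one plain paragraph.

    Hypothetical helper for illustration; assumes a standard SRT layout.
    """
    spoken = []
    # Normalize Windows line endings, then split into cues on blank lines.
    normalized = srt_text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = [line.strip() for line in block.split("\n") if line.strip()]
        # Drop the numeric cue index if present.
        if lines and lines[0].isdigit():
            lines = lines[1:]
        # Drop the timing line if present.
        if lines and TIMING.match(lines[0]):
            lines = lines[1:]
        # Whatever remains is subtitle text.
        spoken.extend(lines)
    return " ".join(spoken)


# Made-up sample subtitles, for demonstration only.
SAMPLE = """1
00:00:01,000 --> 00:00:03,500
Hello and welcome

2
00:00:03,600 --> 00:00:06,000
to this short video.
"""

print(srt_to_transcript(SAMPLE))
```

To use it on a real export, read the SRT file into a string, pass it to the function, and write the returned text to a text file. Styling tags inside the subtitles, if any, are not removed by this sketch.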


Tip: Make sure your audio is clear (avoid background music, for example) so that speech recognition works as well as possible.


Advantages:

  • As easy to use as Word or Google Docs, so it is a big time saver
  • High-quality automatic transcription
  • All formats are accepted
  • Subtitles can be styled
  • Machine translation in 5 languages
  • Can be used from a phone
  • French interface
  • Automatic subtitling, with subtitles inserted directly into your videos or exported as an SRT file



Disadvantages:

  • The transcription, even if it is one of the best on the market, is not perfect
  • Translation is only available in 5 languages


And of course, Capté has a free plan that lets you make unlimited transcriptions.


What is voice recognition?


Most of these applications rely on speech recognition. The system recognizes a human voice and transcribes the audio into text; this is voice input. Today, voice input is used many times a day, often without our noticing: with voice assistants such as Siri, Cortana, or Alexa, when dictating a text message, or when speaking a query to a search engine. Voice input is needed because algorithms process information as text.


Speech recognition is improving a little more every day thanks to machine learning. So in a few years, your automatic transcriptions will likely be even better than they are today.


Why do transcripts?


Making a transcript serves several purposes:

  • To share what has been said
  • To help remember it
  • To keep a written record
  • To make content accessible to people who are deaf or hard of hearing
  • To create subtitles
  • To analyze what was said (for example, semiotics or semantics)


Transcribing audio or video has many advantages. Transcripts are increasingly in demand, particularly to ensure accessibility for all. We are therefore seeing more and more transcribed podcasts and subtitled videos. It is in this context that tools such as Capté are being developed to make the task easier and let everyone produce a transcription, even without technical knowledge.