Watson Speech to Text is IBM's natural language processing service for transcribing audio. It is not a consumer application; it is an enterprise-level system designed to be used through APIs and code.
Businesses can use it to convert hours of audio into text quickly. This guide covers its features and the benefits it can bring to you.
Note: Watson Text to Speech is a separate service that converts written words into realistic speech, which you can learn more about here.
Plans and Pricing
The free Lite plan includes 500 minutes of audio per month. If you need to process more than that, the Plus plan charges per minute: the table below lists $0.02 per minute, with rates as low as $0.01 per minute.
The Premium plan requires contacting sales and gives large companies more capacity and stronger data protection.
Feature | Lite | Plus | Premium | Deploy Anywhere |
---|---|---|---|---|
Price | Free | $0.02/min | Contact sales | Contact sales |
Minutes per month | 500 | Unlimited | Unlimited | Unlimited |
Concurrent transcriptions | 0 | 100 | Unlimited | Unlimited |
Pre-trained speech models | 38 | 38 | 38 | 38 |
Speech customization and training | No | Yes | Yes | Yes |
Noise detection and speaker diarization | No | Yes | Yes | Yes |
Numeric redaction, smart formatting, word spotting and filtering | No | Yes | Yes | Yes |
Data isolation, end-to-end encryption and HIPAA readiness | No | No | Yes | Yes |
Run on any cloud (IBM, Amazon, Google, Microsoft) or on-premises | No | No | No | Yes |
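To get a feel for these numbers, here is a minimal Python sketch that checks whether a monthly workload fits the Lite allowance and estimates a Plus bill. The function names are made up for illustration, and the sketch assumes Plus bills every minute at one flat rate with no free allowance, which may not match IBM's actual billing rules.

```python
def fits_lite_plan(minutes, lite_limit=500):
    """Return True if the monthly audio volume fits the free Lite allowance."""
    return 0 <= minutes <= lite_limit


def estimate_plus_cost(minutes, rate_per_minute=0.02):
    """Estimate a monthly Plus bill in US dollars.

    Assumption: every minute is billed at one flat rate. The default
    rate is the $0.02/min figure from the table above.
    """
    if minutes < 0:
        raise ValueError("minutes must be non-negative")
    return round(minutes * rate_per_minute, 2)


# Example: 40 hours of call recordings in one month
minutes = 40 * 60
print(fits_lite_plan(minutes))            # False, 2400 minutes exceeds 500
print(estimate_plus_cost(minutes))        # 48.0
print(estimate_plus_cost(minutes, 0.01))  # 24.0 at the lowest quoted rate
```

Treat the result as a rough planning figure and confirm current rates on IBM's pricing page before budgeting.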
Main Features - What It Can Do For You
Among audio-to-text transcription tools, IBM Watson Speech to Text is an enterprise-level system that offers businesses five major benefits beyond accurate audio transcription and speech recognition.
- Integration with interactive voice response (IVR) call systems via IBM Voice Agent, for example to answer common call queries
- Call analytics: mining conversation logs to identify call patterns and collect complaints, sentiment and more
- AI-powered agent assist that transcribes audio quickly and searches relevant data within seconds, for example for ID verification
- Cloud computing: everything runs in the cloud
- Data protection for large organizations
Business Applications That Can Benefit From Watson Speech To Text
The IBM Watson Speech to Text platform is a logical next step for any organization looking to integrate speech recognition technology into its business. Typical applications include:
- Call centers of various sizes that use an IVR system
- Technical support systems
- Billing
- CRM
- Machine learning to improve customer interactions
- Liability detection
- Predicting industry disruption
Audio Transcription
Convert audio to text - IBM Watson Speech to Text automatically picks up a speaker's voice and accurately transcribes what is being said into text.
Noise Reduction - The system automatically filters out background noise to isolate high-fidelity audio for transcription and minimize errors.
Multi-Speaker detection - The system can identify up to 6 different speakers in the same recording and transcribe them all. This feature is called speaker diarization.
Filter Words and Content - Professionals can use the keyword spotting feature to detect words and phrases and decide whether to keep or remove them.
Improved Accuracy - The software recognizes a wide vocabulary suited to a broad audience. In addition, Watson supports grammars for all the languages it recognizes.
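To make speaker diarization more concrete, the sketch below groups transcribed words into speaker turns. It assumes a response where each alternative carries `timestamps` entries of the form `[word, start, end]` and a top-level `speaker_labels` list with `from`, `to` and `speaker` fields. The sample data is hand-written for illustration, not real service output, and `group_words_by_speaker` is a hypothetical helper name.

```python
def group_words_by_speaker(response):
    """Return a list of (speaker, text) turns in spoken order."""
    # Index every word by its (start, end) time so labels can be matched to it.
    words = {}
    for result in response.get("results", []):
        best = result["alternatives"][0]
        for word, start, end in best.get("timestamps", []):
            words[(start, end)] = word

    turns = []
    for label in response.get("speaker_labels", []):
        word = words.get((label["from"], label["to"]))
        if word is None:
            continue  # label without a matching word; skip it
        if turns and turns[-1][0] == label["speaker"]:
            turns[-1][1].append(word)  # same speaker keeps talking
        else:
            turns.append((label["speaker"], [word]))  # a new turn starts
    return [(speaker, " ".join(spoken)) for speaker, spoken in turns]


# Hand-written sample shaped like a diarized response (illustrative only).
agent = [["hello", 0.0, 0.4], ["how", 0.5, 0.7], ["can", 0.7, 0.9],
         ["I", 0.9, 1.0], ["help", 1.0, 1.4]]
caller = [["I", 2.0, 2.1], ["lost", 2.1, 2.5], ["my", 2.5, 2.7],
          ["card", 2.7, 3.2]]
sample_response = {
    "results": [
        {"alternatives": [{"transcript": "hello how can I help ",
                           "timestamps": agent}]},
        {"alternatives": [{"transcript": "I lost my card ",
                           "timestamps": caller}]},
    ],
    "speaker_labels": (
        [{"from": s, "to": e, "speaker": 0} for _, s, e in agent]
        + [{"from": s, "to": e, "speaker": 1} for _, s, e in caller]
    ),
}

for speaker, text in group_words_by_speaker(sample_response):
    print(f"Speaker {speaker}: {text}")
```

Matching labels to words by exact start and end time keeps the sketch short; production code would likely need to tolerate interim results and labels that arrive before their words.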
Language Support
- Arabic (Modern Standard)
- Chinese (Mandarin)
- Czech
- Dutch (Belgian and Netherlands)
- English (Australian, Indian, United Kingdom, and United States)
- French (Canadian and France)
- German
- Hindi (Indian)
- Italian
- Japanese
- Korean
- Portuguese (Brazilian)
- Spanish (Castilian and Latin American)
API Support
IBM's Watson Speech to Text service lets developers and businesses with an API key transcribe spoken audio in different languages.
The API can be used in mobile apps, web apps and Python projects.
Check out the Watson API SDKs here.
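If you prefer to call the HTTP interface directly, the sketch below shows how a transcription request might be assembled in Python without sending it. The `/v1/recognize` path, basic authentication with the user name `apikey`, and the query parameter names reflect our understanding of the public API and should be checked against the API reference. The service URL is a placeholder, the model name is only an example, and `build_recognize_request` is a hypothetical helper.

```python
import base64
from urllib.parse import urlencode


def build_recognize_request(service_url, api_key, content_type, model,
                            speaker_labels=False, smart_formatting=False,
                            keywords=None, keywords_threshold=None):
    """Return (url, headers) for a POST to the recognize endpoint."""
    # Keyword spotting needs both a keyword list and a confidence threshold.
    if (keywords is None) != (keywords_threshold is None):
        raise ValueError("keywords and keywords_threshold must be set together")

    params = {"model": model}
    if speaker_labels:
        params["speaker_labels"] = "true"
    if smart_formatting:
        params["smart_formatting"] = "true"
    if keywords is not None:
        # Feature support can vary by model; check before relying on it.
        params["keywords"] = ",".join(keywords)
        params["keywords_threshold"] = str(keywords_threshold)

    # Assumption: API keys are sent as basic auth with the user name "apikey".
    token = base64.b64encode(f"apikey:{api_key}".encode("utf-8")).decode("ascii")
    headers = {
        "Authorization": f"Basic {token}",
        "Content-Type": content_type,  # must match the audio format you upload
    }
    url = f"{service_url.rstrip('/')}/v1/recognize?{urlencode(params)}"
    return url, headers


url, headers = build_recognize_request(
    service_url="https://example.invalid/instances/YOUR_INSTANCE_ID",  # placeholder
    api_key="YOUR_API_KEY",
    content_type="audio/mp3",
    model="en-US_Multimedia",  # example model name; pick one for your language
    speaker_labels=True,
)
print(url)
```

The returned URL and headers can be passed to any HTTP client together with the raw audio bytes as the request body. Keeping request construction separate from sending makes it easy to inspect or log before any audio leaves your system.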
Build A Speech To Text Service Using The Watson STT API
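This code imports the Watson PHP SDK and uses it to create a Speech to Text service client. It then sends an audio file to the service for transcription and prints the transcript if the request succeeds, or the error message if it fails. The class and method names depend on the PHP SDK package you install through Composer, so check them against that package's documentation.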
Copy the following PHP code:
<?php
// Import the Watson PHP SDK
require_once 'vendor/autoload.php';

use WatsonSDK\Services\SpeechToText;
use WatsonSDK\Common\WatsonCredential;

// Replace with your Watson Speech to Text API key
$apiKey = 'YOUR_API_KEY';

// Create a new Watson Speech to Text service client
$speechToText = new SpeechToText(WatsonCredential::initWithAPIKey($apiKey));

// Replace with the path to your audio file
$audioFile = 'path/to/audio.mp3';

// Send the audio file to the Watson Speech to Text service for transcription
$response = $speechToText->recognize(
    $audioFile,
    ['content_type' => 'audio/mp3']
);

// Check for errors
if ($response->getStatusCode() == 200) {
    // If successful, get the transcript of the first result
    $transcript = $response->getContent()['results'][0]['alternatives'][0]['transcript'];
    // Print the transcript
    print($transcript);
} else {
    // If there was an error, print the error message
    print($response->getMessage());
}
?>
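The PHP example reads only the first entry of `results`. Longer recordings usually come back as several entries, one per stretch of speech, so you may want to join them. Here is a minimal Python sketch of that step, assuming the JSON layout used above (`results`, then `alternatives`, then `transcript`); the sample body is hand-written and `full_transcript` is a hypothetical helper name.

```python
import json


def full_transcript(response_body):
    """Join the top alternative of every result into one transcript string."""
    data = json.loads(response_body)
    parts = []
    for result in data.get("results", []):
        alternatives = result.get("alternatives", [])
        if alternatives:
            # The first alternative is treated as the best hypothesis.
            parts.append(alternatives[0]["transcript"].strip())
    return " ".join(parts)


# Hand-written sample body (illustrative only, not real service output).
sample_body = json.dumps({
    "results": [
        {"alternatives": [{"transcript": "thank you for calling "}]},
        {"alternatives": [{"transcript": "how can I help you today "}]},
    ]
})
print(full_transcript(sample_body))
```

An empty `results` list yields an empty string, which is a reasonable signal that no speech was detected in the audio.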