Product List
-
Short Speech-to-Text
It converts speech of less than 60 seconds into text. It is suitable for mobile speech input, intelligent speech interaction, speech commands, and speech search.
-
Real-time Speech-to-Text
It converts an audio stream into text and returns the start and end time of each sentence. It is suitable for scenarios such as long-sentence speech input, audio and video subtitles, and meeting records.
-
Audio File Transcription
It converts audio files uploaded in batches into text and returns the recognition results within 12 hours. It is suitable for scenarios such as recording quality checks and audio content analysis.
-
Call Center Solution
This end-to-end speech technology solution for call centers includes speech-to-text at an 8 kHz sampling rate and speech synthesis. It helps enterprises add speech capabilities to their call centers more efficiently.
-
Speech Self-training Platform
Using domain-specific text from your business scenarios, it trains a language model with zero code. The trained model recognizes speech content more precisely, improving recognition accuracy in your business field.
-
Speech Wake-up
It supports wake-up by a specific speech command. It allows you to customize several wake-up words, ensuring natural and smooth conversation in your application.
-
Online Text-to-Speech
It offers highly human-like, smooth, and natural speech synthesis services. It meets the speech broadcast requirements of reading applications, purchase order broadcast, and intelligent hardware.
-
Offline Text-to-Speech
In environments with weak or no internet access, it allows you to perform speech broadcast on intelligent hardware devices. It synthesizes text into an audio file and gives you a stable, consistent, and natural speech synthesis experience.
-
Speech Translation
By integrating high-precision speech-to-text, text translation, and text-to-speech, it provides developers with an online real-time speech translation capability. It supports four languages: Chinese, English, Japanese, and Cantonese.
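The Real-time Speech-to-Text item above returns each sentence's start and end time, which is the information a subtitle file needs. The following is a minimal sketch of turning such results into SubRip (SRT) subtitle text. The `Sentence` type, its field names, and the use of milliseconds are assumptions made for illustration; they do not describe the service's actual response format.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Sentence:
    # Hypothetical shape for one recognized sentence; the real response
    # fields and time units of the service may differ.
    text: str
    start_ms: int
    end_ms: int


def format_timestamp(ms: int) -> str:
    """Format milliseconds as an SRT timestamp, HH:MM:SS,mmm."""
    hours, rem = divmod(ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(sentences: List[Sentence]) -> str:
    """Build SRT text: a numbered cue per sentence, cues separated by a blank line."""
    blocks = []
    for index, s in enumerate(sentences, start=1):
        time_range = f"{format_timestamp(s.start_ms)} --> {format_timestamp(s.end_ms)}"
        blocks.append(f"{index}\n{time_range}\n{s.text}\n")
    return "\n".join(blocks)


if __name__ == "__main__":
    demo = [
        Sentence("Welcome to the meeting.", 0, 1800),
        Sentence("Let us begin with the agenda.", 2000, 4500),
    ]
    print(to_srt(demo))
```

Each recognized sentence becomes one subtitle cue, so the sentence-level timing can be reused directly without re-aligning the audio.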
Application Scenarios
-
Speech Search
It allows you to enter search queries by speech. It is used in search scenarios such as web search, in-vehicle search, and mobile search, freeing your hands and making search more efficient. It is suitable for many industries, including video websites, intelligent hardware, and mobile manufacturers.
-
Speech Command
It allows you to control and operate your device or software by speech, without any manual operation. It is suitable for many fields, including intelligent hardware, in-vehicle systems, robots, mobile apps, and games.
-
Live Video Subtitle
As a new feature for live video broadcasts, it transcribes the host's speech into on-screen subtitles and also allows you to edit the subtitles.
-
Audio Content Analysis
It converts audio recordings into text and performs continuous analysis and monitoring. This allows you to identify risks and illegal content and to discover potential marketing opportunities.
-
Book Content Broadcast
Text-to-speech technology gives reading apps the ability to read content aloud, freeing users' hands and eyes. Several special voices give every story a fitting tone, bringing users a more refined reading experience.
-
Purchase Order Broadcast
It is used in scenarios such as car-hailing software, restaurant reservation number calling, and queuing software. Using text-to-speech, it broadcasts purchase orders, helping users receive notifications promptly and conveniently.
Special Advantages

Speech-to-text supports post-processing capabilities such as punctuation, number format conversion, and timestamp processing. Text-to-speech allows you to set the speed, tone, and volume flexibly and to annotate polyphonic characters, meeting personalized requirements.

It features an enterprise-level stable service guarantee, professional server clusters that handle high-concurrency traffic efficiently and flexibly, and a 99.9% service stability guarantee.

Speech-to-text supports self-training of language models on the speech self-training platform. You upload domain-specific texts from your business area, and the zero-code training is done automatically. Generally, it can improve the recognition rate of words in business fields by 5-25%.

It offers several calling methods, including a REST API, a WebSocket API, SDKs for Android, iOS, and Linux, and an offline text-to-speech SDK. It meets the requirements of different terminals.
Relevant Recommendations
-
Text Review
Based on NLP technologies, it identifies text content involving pornography, terrorism, politics, malicious advertising, abuse, and illegal articles. It supports custom blacklists and whitelists and allows you to adjust the review strategy and strictness flexibly.
-
Video Content Review
It performs intelligent review of video content across several dimensions. The reviewed content includes pornography, violence, terrorism, politics, advertising, and user-defined blacklists. It helps you review the content on your platform.
-
Application Technology for Natural Language Processing
Oriented toward multi-scenario applications, it offers NLP capabilities that can be applied directly to product strategy, allowing your products to better understand language and users.