Overview
The Asynchronous Speech-to-Text API delivers high-quality transcription for pre-recorded audio.
attention
For a streaming solution, refer to the Streaming Speech-to-Text API documentation.
API endpoint
The base URL for this version of the API is https://api.rev.ai/speechtotext/v1. All endpoints described in this documentation are relative to this base URL.
attention
The base URL for this version of the API differs for non-US deployments. Users working with non-US deployments should obtain the correct base URL for their deployment from the Rev AI global deployments page.
warning
The base URL is different from the base URL for the Streaming Speech-to-Text API.
Authentication
Clients must authenticate by including their Rev AI access token in the Authorization header of their requests. If the access token is invalid or the header is not present, the API returns a 401 error code.
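As an illustration, the following minimal sketch builds an authenticated request against the base URL given above. It assumes the access token is sent using the Bearer scheme, which this section does not state explicitly; the helper name build_request and the placeholder token are hypothetical. The request is only constructed, not sent.

```python
import urllib.request

# Base URL taken from the "API endpoint" section (US deployment).
BASE_URL = "https://api.rev.ai/speechtotext/v1"


def build_request(path: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) a request for an endpoint relative to BASE_URL.

    Assumption: the token is passed as "Authorization: Bearer <token>".
    """
    url = BASE_URL.rstrip("/") + "/" + path.lstrip("/")
    headers = {"Authorization": f"Bearer {access_token}"}
    return urllib.request.Request(url, headers=headers)


# Hypothetical usage with a placeholder token.
request = build_request("/jobs", "YOUR_ACCESS_TOKEN")
print(request.full_url)
```

Sending this request with a missing or invalid token would be expected to produce the 401 response described above.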
Turnaround time and chunking
Chunking is the act of breaking audio files into smaller segments. Rev AI uses this method to reduce the turnaround time for audio files longer than 3 minutes.
Often, especially for AI transcription jobs involving shorter files, your transcript will be ready in 5 minutes or less. It generally takes no longer than 15 minutes to return longer audio files.
The expected turnaround time is 12 to 24 hours for human transcription jobs.
attention
If you require faster turnaround time, please contact the support team at support@rev.ai.
File formats
The Asynchronous Speech-to-Text API supports all the file formats supported by FFmpeg. This includes common media formats such as MP3, MP4, Ogg, WAV, PCM, and FLAC, among many others.
API limits
The following default limits apply per user, per endpoint for the Asynchronous Speech-to-Text API:
- 10,000 transcription requests submitted every 10 minutes.
- 500 transcriptions processed every 10 minutes. Any submissions over this will be accepted but put into a queue and not started until the next interval.
- Maximum audio duration of 17 hours.
- File uploads submitted as multipart/form-data requests to the /jobs endpoint have a concurrency limit of 5 and a file size limit of 2 GB per request.
- File uploads via the Rev AI dashboard or using the source_config job parameter have a file size limit of 5 TB.
POST requests to the /jobs endpoint that use the source_config property do not have a concurrency limit or file restriction. They are only limited by the first three limits specified above.
attention
These default limits are configurable by Rev AI support. To adjust these limits, contact the support team at support@rev.ai.
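A client can check a file against the default size and duration limits before submitting it. The sketch below is one way to do that. It assumes that 1 GB means 10^9 bytes and 1 TB means 10^12 bytes, which the limits above do not specify, and the function name limit_violations is hypothetical. It does not cover the rate or concurrency limits.

```python
from typing import List

# Default limits from the list above. Decimal units are an assumption.
MAX_AUDIO_SECONDS = 17 * 60 * 60           # 17 hours
MAX_MULTIPART_BYTES = 2 * 10**9            # 2 GB for multipart/form-data uploads
MAX_SOURCE_CONFIG_BYTES = 5 * 10**12       # 5 TB for dashboard / source_config uploads


def limit_violations(size_bytes: int, duration_seconds: float,
                     multipart: bool = True) -> List[str]:
    """Return a list of default limits that the file would exceed."""
    problems = []
    if duration_seconds > MAX_AUDIO_SECONDS:
        problems.append("audio duration exceeds 17 hours")
    max_bytes = MAX_MULTIPART_BYTES if multipart else MAX_SOURCE_CONFIG_BYTES
    if size_bytes > max_bytes:
        problems.append(f"file size exceeds {max_bytes} bytes")
    return problems


# A 3 GB, one-hour file is too large for a multipart upload
# but within the source_config limit.
print(limit_violations(3 * 10**9, 3600, multipart=True))
print(limit_violations(3 * 10**9, 3600, multipart=False))
```

Because the limits are configurable per account, the constants here should be treated as defaults rather than fixed values.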
HIPAA compliance
The API supports HIPAA-compliant processing. However, this feature is not activated by default and must be explicitly activated at account level. Learn more about Rev AI's HIPAA compliance and how to HIPAA-enable a Rev AI user account.
The API has the following limitations in HIPAA context:
- When submitting a media file, the media_url option is not supported. Instead, the source_config option must be used.
- Human transcription (transcriber=human) is not supported.
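The two limitations above can be checked on the client side before a job is submitted from a HIPAA-enabled account. The following is a minimal sketch that assumes job options are held in a plain dictionary keyed by the option names used in this section; the function name hipaa_violations and the example URL are hypothetical.

```python
from typing import Any, Dict, List


def hipaa_violations(job_options: Dict[str, Any]) -> List[str]:
    """Return the HIPAA-context limitations that the job options would violate."""
    problems = []
    if "media_url" in job_options:
        problems.append("media_url is not supported; use source_config instead")
    if job_options.get("transcriber") == "human":
        problems.append("transcriber=human is not supported")
    return problems


# Hypothetical job options: the first passes, the second violates both rules.
ok_job = {"source_config": {"url": "https://example.com/audio.mp3"}}
bad_job = {"media_url": "https://example.com/audio.mp3", "transcriber": "human"}
print(hipaa_violations(ok_job))
print(hipaa_violations(bad_job))
```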
Error codes
The API indicates failure with 4xx and 5xx HTTP status codes. 4xx status codes indicate an error due to the request provided (for example, a required parameter was omitted). 5xx status codes indicate an error with Rev AI's servers.
The following table lists common 4xx error codes and troubleshooting suggestions for each:
Status Code | Error | Troubleshooting |
---|---|---|
401 | Authorization Denied | Verify that the access token is valid. |
403 | Access Denied to Deployment | Verify that the API endpoint in use is correct for your selected deployment or region. |
405 | Invalid Job Properties | Verify that the job parameter names and values are valid. |
404 | Resource Not Found | Verify that the job identifier is valid. |
406 | Unsupported Output Format | Verify that the requested output format is valid. |
409 | Invalid Job State | Verify that the job has completed processing. |
413 | Payload Too Large | Verify that the submitted file size is within the allowed API limits. |
When a 4xx error occurs during invocation of a request, the API responds with a problem details HTTP response payload.
Some errors can be resolved simply by retrying the request. The following error codes are likely to be resolved with successive retries.
Status Code | Error |
---|---|
409 | Invalid Job State |
429 | Too Many Requests |
502 | Bad Gateway |
503 | Service Unavailable |
504 | Gateway Timeout |
attention
With the exception of the 429 status code, it is recommended that the maximum number of retries be limited to 5 attempts per request. The number of retries can be higher for 429 errors, but if you notice consistent throttling, please contact the support team at support@rev.ai.
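The retry guidance above can be sketched as a small helper. This is one possible approach, not a prescribed client behavior. The exponential backoff, the 60-second delay cap, and the higher retry limit of 10 for 429 responses are assumptions chosen for illustration. The send callable is a hypothetical stand-in for whatever function performs the request and returns its HTTP status code.

```python
import time
from typing import Callable

# Status codes listed above as likely to succeed on retry.
RETRYABLE_STATUS_CODES = {409, 429, 502, 503, 504}


def call_with_retries(send: Callable[[], int],
                      max_retries: int = 5,
                      max_retries_429: int = 10,   # assumption: "higher" limit for 429
                      base_delay: float = 1.0,
                      sleep: Callable[[float], None] = time.sleep) -> int:
    """Call send() until it returns a non-retryable status or retries run out.

    Returns the last status code received.
    """
    retries = 0
    while True:
        status = send()
        if status not in RETRYABLE_STATUS_CODES:
            return status
        limit = max_retries_429 if status == 429 else max_retries
        if retries >= limit:
            return status
        # Exponential backoff (an assumption), capped at 60 seconds.
        sleep(min(base_delay * 2 ** retries, 60.0))
        retries += 1


# Hypothetical usage: a fake sender that fails twice, then succeeds.
responses = iter([503, 503, 200])
final_status = call_with_retries(lambda: next(responses), sleep=lambda _: None)
print(final_status)
```

Passing sleep as a parameter keeps the helper easy to test without real delays.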
Error object
The problem details information is represented as a JSON object with the following optional properties:
Property | Description |
---|---|
type | A URL representing the type for the error |
title | A short human readable description of type |
details | Additional details of the error |
status | HTTP status code of the error |
In addition to the properties listed above, the problem details object may list additional properties that help to troubleshoot the problem.
Here is an example of a response to a job request with a missing required parameter:
// Bad Submit Job Request
{
  "parameter": {
    "source_config.url": [
      "The url field is required."
    ]
  },
  "type": "https://www.rev.ai/api/v1/errors/invalid-parameters",
  "title": "Your request parameters didn't validate",
  "status": 400
}
Here is an example of a response to a transcript request for a job that is in an invalid state:
// Invalid Transcript State
{
  "allowed_values": [
    "transcribed"
  ],
  "current_value": "in_progress",
  "type": "https://rev.ai/api/v1/errors/invalid-job-state",
  "title": "Job is in invalid state",
  "detail": "Job is in invalid state to obtain the transcript",
  "status": 409
}
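A client might parse such a payload as in the following sketch. All listed properties are treated as optional, and any additional properties are kept in a separate dictionary. Because the property table lists details while the example response uses detail, the sketch accepts either spelling; that tolerance is an assumption, as are the names ProblemDetails and parse_problem_details.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class ProblemDetails:
    type: Optional[str] = None
    title: Optional[str] = None
    detail: Optional[str] = None
    status: Optional[int] = None
    # Any additional troubleshooting properties, e.g. "allowed_values".
    extras: Dict[str, Any] = field(default_factory=dict)


def parse_problem_details(body: str) -> ProblemDetails:
    """Parse a problem details JSON payload, tolerating missing properties."""
    data = json.loads(body)
    detail = data.pop("detail", None)
    details = data.pop("details", None)  # alternative spelling (assumption)
    return ProblemDetails(
        type=data.pop("type", None),
        title=data.pop("title", None),
        detail=detail if detail is not None else details,
        status=data.pop("status", None),
        extras=data,  # whatever is left after removing the known properties
    )


# Usage with the invalid job state example shown above.
payload = """{
  "allowed_values": ["transcribed"],
  "current_value": "in_progress",
  "type": "https://rev.ai/api/v1/errors/invalid-job-state",
  "title": "Job is in invalid state",
  "detail": "Job is in invalid state to obtain the transcript",
  "status": 409
}"""
problem = parse_problem_details(payload)
print(problem.status, problem.extras["current_value"])
```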
Billing
For billing purposes, files are charged per second, with a minimum charge of 15 seconds.
Here are some examples:
- 4-second files are charged as 15 seconds.
- 14.1-second files are charged as 15 seconds.
- 15-second files are charged as 15 seconds.
- 16-second files are charged as 16 seconds.
- 16.1-second files are charged as 17 seconds.
- 22.7-second files are charged as 23 seconds.
Human transcription files are charged per second with a minimum charge of 1 minute.
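The billing rule can be expressed as a short calculation. The sketch below rounds the duration up to the next whole second and applies the minimum charge, which reproduces the examples listed above. Rounding human transcription durations up in the same way is an assumption, since only the 1-minute minimum is stated. The function name billable_seconds is hypothetical.

```python
import math


def billable_seconds(duration_seconds: float, human: bool = False) -> int:
    """Return the number of seconds charged for a file of the given duration.

    Durations are rounded up to a whole second; the minimum charge is
    15 seconds, or 60 seconds for human transcription.
    """
    minimum = 60 if human else 15
    return max(minimum, math.ceil(duration_seconds))


# The examples from the list above.
for duration in (4, 14.1, 15, 16, 16.1, 22.7):
    print(duration, "->", billable_seconds(duration))
```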