Overview

The Asynchronous Speech-to-Text API delivers high-quality transcription for pre-recorded audio.

attention

For a streaming solution, refer to the Streaming Speech-to-Text API documentation.

API endpoint

The base URL for this version of the API is https://api.rev.ai/speechtotext/v1. All endpoints described in this documentation are relative to this base URL.

attention

The base URL for this version of the API differs for non-US deployments. Users working with non-US deployments should obtain the correct base URL for their deployment from the Rev AI global deployments page.

warning

The base URL is different from the base URL for the Streaming Speech-to-Text API.

Authentication

Clients must authenticate by including their Rev AI access token in the Authorization header of each request. If the header is missing or the access token is invalid, the API returns a 401 error code.
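
As a minimal sketch of how the base URL and the Authorization header fit together, the following builds a request object without sending it. It assumes the token is passed using the Bearer scheme, which this page does not spell out; `build_request` and the placeholder token are hypothetical names used only for illustration.

```python
import urllib.request

# US deployment base URL, as given on this page.
BASE_URL = "https://api.rev.ai/speechtotext/v1"


def build_request(path: str, access_token: str, method: str = "GET") -> urllib.request.Request:
    """Build (but do not send) a request for a path relative to the base URL."""
    return urllib.request.Request(
        BASE_URL + path,
        # Assumption: the access token is sent using the Bearer scheme.
        headers={"Authorization": f"Bearer {access_token}"},
        method=method,
    )


# "<REVAI_ACCESS_TOKEN>" is a placeholder, not a real token.
req = build_request("/jobs", "<REVAI_ACCESS_TOKEN>")
print(req.full_url)
print(req.get_header("Authorization"))
```

For a non-US deployment, only `BASE_URL` would need to change.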

Turnaround time and chunking

Chunking is the act of breaking audio files into smaller segments. Rev AI uses this method to reduce the turnaround time for audio files longer than 3 minutes.

Often, especially for AI transcription jobs involving shorter files, your transcript will be ready in 5 minutes or less. It generally takes no longer than 15 minutes to return longer audio files.

The expected turnaround time is 12 to 24 hours for human transcription jobs.

attention

If you require faster turnaround time, please contact the support team at support@rev.ai.

File formats

The Asynchronous Speech-to-Text API supports all the file formats supported by FFmpeg. This includes common media formats such as MP3, MP4, Ogg, WAV, PCM, and FLAC, among many others.

API limits

The following default limits apply per user, per endpoint for the Asynchronous Speech-to-Text API:

  • 10,000 transcription requests submitted every 10 minutes.
  • 500 transcriptions processed every 10 minutes. Submissions beyond this limit are accepted but queued, and are not started until the next interval.
  • Maximum audio duration of 17 hours.
  • File uploads submitted as multipart/form-data requests to the /jobs endpoint have a concurrency limit of 5 and a file size limit of 2 GB per request.
  • File uploads via the Rev AI dashboard or using the source_config job parameter have a file size limit of 5 TB.

POST requests to the /jobs endpoint that use the source_config property do not have a concurrency limit or file restriction. They are only limited by the first three limits specified above.

attention

These default limits are configurable by Rev AI support. To adjust these limits, contact the support team at support@rev.ai.
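
A client can check a file against the default size and duration limits before submitting it. The sketch below is one way to do that, under stated assumptions: `check_submission` is a hypothetical helper, and the byte values assume binary units (1 GB = 1024**3 bytes, 1 TB = 1024**4 bytes), which this page does not specify. Because the limits are configurable per account, treat the constants as defaults only.

```python
MAX_AUDIO_SECONDS = 17 * 60 * 60        # default maximum audio duration: 17 hours
MAX_MULTIPART_BYTES = 2 * 1024**3       # assumption: 2 GB counted in binary units
MAX_SOURCE_CONFIG_BYTES = 5 * 1024**4   # assumption: 5 TB counted in binary units


def check_submission(size_bytes: int, duration_seconds: float, multipart: bool) -> list:
    """Return a list of limit violations; an empty list means none were found."""
    problems = []
    if duration_seconds > MAX_AUDIO_SECONDS:
        problems.append("audio is longer than 17 hours")
    # Multipart uploads have the smaller limit; source_config uploads the larger one.
    size_limit = MAX_MULTIPART_BYTES if multipart else MAX_SOURCE_CONFIG_BYTES
    if size_bytes > size_limit:
        problems.append(f"file is larger than {size_limit} bytes")
    return problems


# A 3 GiB, one-hour file is too large for multipart upload but fine via source_config.
print(check_submission(3 * 1024**3, 3600, multipart=True))
print(check_submission(3 * 1024**3, 3600, multipart=False))
```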

HIPAA compliance

The API supports HIPAA-compliant processing. However, this feature is not activated by default and must be explicitly activated at account level. Learn more about Rev AI's HIPAA compliance and how to HIPAA-enable a Rev AI user account.

The API has the following limitations in HIPAA context:

  1. When submitting a media file, the media_url option is not supported. Instead, the source_config option must be used.
  2. Human transcription (transcriber=human) is not supported.
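
The two limitations above can be checked on the client before a job is submitted. The following is a minimal sketch, assuming job options are held in a dictionary whose top-level keys match the option names used on this page; `hipaa_violations` is a hypothetical helper, not part of any Rev AI SDK.

```python
def hipaa_violations(job_options: dict) -> list:
    """Return the HIPAA-context limitations that the given job options would hit."""
    violations = []
    if "media_url" in job_options:
        violations.append("media_url is not supported; use source_config instead")
    if job_options.get("transcriber") == "human":
        violations.append("transcriber=human is not supported")
    return violations


# The URL below is a placeholder used only for illustration.
print(hipaa_violations({"source_config": {"url": "https://example.com/audio.mp3"}}))
print(hipaa_violations({"media_url": "https://example.com/audio.mp3", "transcriber": "human"}))
```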

Error codes

The API indicates failure with 4xx and 5xx HTTP status codes. 4xx status codes indicate an error due to the request provided (for example, a required parameter was omitted). 5xx status codes indicate an error with Rev AI's servers.

The following table lists common 4xx error codes and troubleshooting suggestions for each:

Status Code   Error                         Troubleshooting
-----------   ---------------------------   ---------------------------------------------------------------------------------------
400           Invalid Job Properties        Verify that the job parameter names and values are valid.
401           Authorization Denied          Verify that the access token is valid.
403           Access Denied to Deployment   Verify that the API endpoint in use is correct for your selected deployment or region.
404           Resource Not Found            Verify that the job identifier is valid.
406           Unsupported Output Format     Verify that the requested output format is valid.
409           Invalid Job State             Verify that the job has completed processing.
413           Payload Too Large             Verify that the submitted file size is within the allowed API limits.

When a 4xx error occurs during invocation of a request, the API responds with a problem details HTTP response payload.

Some errors can be resolved simply by retrying the request. The following error codes are likely to be resolved with successive retries.

Status Code   Error
-----------   -------------------
409           Invalid Job State
429           Too Many Requests
502           Bad Gateway
503           Service Unavailable
504           Gateway Timeout

attention

With the exception of the 429 status code, it is recommended that the maximum number of retries be limited to 5 attempts per request. The number of retries can be higher for 429 errors but if you notice consistent throttling, please contact the support team at support@rev.ai.
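
The retry guidance above can be sketched as a small wrapper around whatever function sends the request. Several parts are assumptions rather than documented behavior: `send` is a hypothetical callable returning a status code and body, the exponential backoff and its delays are illustrative (this page does not specify wait times), and the cap of 10 retries for 429 responses is an arbitrary example of "higher than 5".

```python
import time
from typing import Callable, Tuple

# Status codes this page lists as likely to be resolved by retrying.
RETRYABLE_STATUS_CODES = {409, 429, 502, 503, 504}


def send_with_retries(
    send: Callable[[], Tuple[int, str]],
    max_retries: int = 5,        # recommended cap from this page
    max_429_retries: int = 10,   # assumption: illustrative higher cap for 429
    base_delay: float = 1.0,     # assumption: backoff timing is not specified here
    max_delay: float = 60.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Tuple[int, str]:
    """Call send() until it returns a non-retryable status or retries run out."""
    retries = 0
    while True:
        status, body = send()
        if status not in RETRYABLE_STATUS_CODES:
            return status, body
        limit = max_429_retries if status == 429 else max_retries
        if retries >= limit:
            return status, body
        # Wait longer after each failed attempt, up to max_delay seconds.
        sleep(min(base_delay * 2 ** retries, max_delay))
        retries += 1


# Usage with a stand-in for a real HTTP call; no real waiting happens here.
responses = iter([(503, ""), (429, ""), (200, "ok")])
result = send_with_retries(lambda: next(responses), sleep=lambda seconds: None)
print(result)
```

Passing `sleep` as a parameter keeps the wrapper easy to test without real delays.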

Error object

The problem details information is represented as a JSON object with the following optional properties:

Property   Description
--------   -------------------------------------------
type       A URL representing the type of the error
title      A short human-readable description of type
detail     Additional details of the error
status     HTTP status code of the error

In addition to the properties listed above, the problem details object may list additional properties that help to troubleshoot the problem.

Here is an example of a response to a job request with a missing required parameter:

// Bad Submit Job Request
{
  "parameter": {
    "source_config.url": [
      "The url field is required."
    ]
  },
  "type": "https://www.rev.ai/api/v1/errors/invalid-parameters",
  "title": "Your request parameters didn't validate",
  "status": 400
}

Here is an example of a response to a transcript request for a job that is still in progress:

// Invalid Transcript State
{
  "allowed_values": [
    "transcribed"
  ],
  "current_value": "in_progress",
  "type": "https://rev.ai/api/v1/errors/invalid-job-state",
  "title": "Job is in invalid state",
  "detail": "Job is in invalid state to obtain the transcript",
  "status": 409
}
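
When handling these responses in code, it can help to separate the standard problem details fields from the error-specific extras. The following is a minimal sketch; `summarize_problem` is a hypothetical helper, the field names follow the example responses above, and every field is treated as optional.

```python
import json

# Standard fields, named as they appear in the example responses above.
STANDARD_FIELDS = ("type", "title", "detail", "status")


def summarize_problem(payload: str) -> dict:
    """Split a problem details JSON payload into standard fields and extras."""
    problem = json.loads(payload)
    # Missing standard fields come back as None because all of them are optional.
    summary = {name: problem.get(name) for name in STANDARD_FIELDS}
    summary["extra"] = {k: v for k, v in problem.items() if k not in STANDARD_FIELDS}
    return summary


payload = """{
  "allowed_values": ["transcribed"],
  "current_value": "in_progress",
  "type": "https://rev.ai/api/v1/errors/invalid-job-state",
  "title": "Job is in invalid state",
  "detail": "Job is in invalid state to obtain the transcript",
  "status": 409
}"""
summary = summarize_problem(payload)
print(summary["status"], summary["title"])
print(summary["extra"])
```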

Billing

For billing purposes, files are charged per second, with durations rounded up to the next whole second and a minimum charge of 15 seconds.

Here are some examples:

  • 4-second files are charged as 15 seconds.
  • 14.1-second files are charged as 15 seconds.
  • 15-second files are charged as 15 seconds.
  • 16-second files are charged as 16 seconds.
  • 16.1-second files are charged as 17 seconds.
  • 22.7-second files are charged as 23 seconds.

Human transcription files are charged per second with a minimum charge of 1 minute.
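
The billing examples above are consistent with rounding the duration up to a whole second and then applying the minimum. The sketch below reproduces them; `billable_seconds` is a hypothetical helper, and applying the same round-up rule to human transcription (with a 60-second minimum) is an assumption, since this page only states the minimum for that case.

```python
import math


def billable_seconds(duration_seconds: float, minimum_seconds: int = 15) -> int:
    """Round the duration up to a whole second, then apply the minimum charge."""
    return max(minimum_seconds, math.ceil(duration_seconds))


# Reproduce the examples listed above.
for duration in (4, 14.1, 15, 16, 16.1, 22.7):
    print(duration, "->", billable_seconds(duration))

# Human transcription: 1-minute minimum (the round-up rule is assumed here).
print(billable_seconds(30, minimum_seconds=60))
```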