The Custom Vocabulary API improves the accuracy of automatic speech recognition (ASR) transcription in both streaming and asynchronous modes.

API endpoint

The base URL for this version of the API is https://api.rev.ai/speechtotext/v1. All endpoints described in this documentation are relative to this base URL.

Authentication

Clients must authenticate by including their Rev AI access token as a query parameter in their requests. If the access token is invalid or the query parameter is not present, a 401 error code will be returned.
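
As a rough illustration of the two points above (relative endpoints and token-as-query-parameter), the following sketch builds a request URL without sending anything. The endpoint path "vocabularies" and the parameter name "access_token" are assumptions made for the example and are not taken from this page; check the endpoint reference for the actual names.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.rev.ai/speechtotext/v1"


def build_url(path, access_token, token_param="access_token"):
    """Join a relative endpoint path to the base URL and append the token.

    `token_param` is an assumed query parameter name, not confirmed here.
    """
    query = urlencode({token_param: access_token})
    return f"{BASE_URL}/{path.lstrip('/')}?{query}"


# "vocabularies" is a hypothetical relative path used only for illustration.
print(build_url("vocabularies", "YOUR_ACCESS_TOKEN"))
```

A request built this way with a missing or invalid token would, per the rule above, be answered with a 401 error code.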

General rules

Custom vocabularies are submitted as a list of phrases. A phrase can be one word or multiple words, usually describing a single object or concept.

  • Up to 6000 phrases may be submitted per transcription job for English, and up to 1000 for other languages.
  • Phrases must contain at least one alphabetic character from the respective language.
  • Phrases cannot be longer than 12 words.
  • Individual words cannot be longer than 34 characters.
  • For English, non-numeric characters in the Basic Latin set are allowed, i.e. U+0000-U+002F and U+003A-U+007F. For other languages, most non-numeric characters in the language are allowed.
  • Non-alphabetic characters will be ignored during speech recognition, but will be favored in the output. For example, if you submit Yahoo! as a custom vocabulary phrase, speech recognition will favor outputting Yahoo! when it recognizes 'yahoo' in the audio.
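
The general rules above can be checked locally before submitting a job. The sketch below is a minimal client-side pre-check, not the service's own validation. It assumes that words are separated by whitespace, that word length is counted in Python characters, and that "en" identifies English; the function names are hypothetical. The per-phrase character checks cover English only.

```python
MAX_PHRASES_ENGLISH = 6000
MAX_PHRASES_OTHER = 1000
MAX_WORDS_PER_PHRASE = 12
MAX_WORD_LENGTH = 34


def check_phrase(phrase):
    """Return a list of rule violations for one English phrase (empty if none)."""
    problems = []
    words = phrase.split()  # assumption: words are whitespace-separated
    if not any(ch.isascii() and ch.isalpha() for ch in phrase):
        problems.append("no alphabetic character")
    if len(words) > MAX_WORDS_PER_PHRASE:
        problems.append("more than 12 words")
    if any(len(word) > MAX_WORD_LENGTH for word in words):
        problems.append("word longer than 34 characters")
    # Allowed for English: U+0000-U+002F and U+003A-U+007F (Basic Latin minus digits).
    if any(ch.isdigit() or ord(ch) > 0x7F for ch in phrase):
        problems.append("character outside the allowed Basic Latin ranges")
    return problems


def check_vocabulary(phrases, language="en"):
    """Enforce the phrase-count limit, then map each bad phrase to its problems."""
    limit = MAX_PHRASES_ENGLISH if language == "en" else MAX_PHRASES_OTHER
    if len(phrases) > limit:
        raise ValueError(f"{len(phrases)} phrases exceeds the limit of {limit}")
    report = {}
    for phrase in phrases:
        problems = check_phrase(phrase)
        if problems:
            report[phrase] = problems
    return report


print(check_vocabulary(["Yahoo!", "401k", "machine learning"]))
```

An empty report means no phrase broke the rules encoded here; the service may still apply checks that this sketch does not model.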

Rules for initialisms

Initialisms are abbreviations consisting of initial letters pronounced separately, like CPU. Submit your initialisms as custom vocabulary to improve their speech recognition.

The following rules apply:

  • Initialisms must contain at least 3 letters
  • An initialism will be recognized only when pronounced letter by letter
  • Ampersands (&) are supported and will be treated as a letter pronounced like 'and'
  • Initialisms will be recognized when submitted in the following formats only:
    • ABC
    • A.B.C.
    • a.b.c.
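
The three accepted formats can be expressed as regular expressions. The sketch below is one possible reading of the rules, not the service's own matcher: it assumes that an ampersand counts toward the three-letter minimum and may appear in any letter position, which this page does not spell out. The function name is hypothetical.

```python
import re

# One pattern per accepted format; "&" is assumed to count as a letter.
INITIALISM_PATTERNS = (
    re.compile(r"[A-Z&]{3,}"),        # ABC
    re.compile(r"(?:[A-Z&]\.){3,}"),  # A.B.C.
    re.compile(r"(?:[a-z&]\.){3,}"),  # a.b.c.
)


def is_initialism_format(term):
    """Return True if `term` is written in one of the three listed formats."""
    has_letter = any(ch.isalpha() for ch in term)
    return has_letter and any(p.fullmatch(term) for p in INITIALISM_PATTERNS)


for term in ["CPU", "C.P.U.", "c.p.u.", "cpu", "AI", "AT&T"]:
    print(term, is_initialism_format(term))
```

A format match says nothing about pronunciation: per the rules above, a term written as ABC is recognized as an initialism only when it is spoken letter by letter.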

Rules for non-alphabetic characters

The following rules apply for specific characters:

  • Numbers are not allowed. Some common terms that contain numbers, such as 401k, are recognized by default and do not need to be added as custom vocabulary
  • Standalone ampersands in phrases will be treated like the word 'and'
  • Dashes in words will be ignored; for example, this-and-that will be treated as a single word pronounced roughly like 'this and that'
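
To make the three character rules concrete, the sketch below approximates how a submitted phrase would be read under them. It illustrates the rules as written and is not a description of the recognizer's internals; the function name is hypothetical and whitespace-separated words are an assumption.

```python
def spoken_form(phrase):
    """Approximate how a phrase is treated under the character rules above."""
    if any(ch.isdigit() for ch in phrase):
        raise ValueError("numbers are not allowed in custom vocabulary phrases")
    words = []
    for token in phrase.split():
        if token == "&":
            words.append("and")  # a standalone ampersand reads as the word 'and'
        else:
            words.append(token.replace("-", ""))  # dashes in words are ignored
    return " ".join(words)


print(spoken_form("research & development"))
print(spoken_form("this-and-that"))
```

The second call yields a single word with the dashes dropped, matching the rule that this-and-that is treated as one word.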