Changelog

This page documents all notable changes to this project.

Added

  • Added Whisper Fusion transcriber support, which offers better support for rare words. See Submit Transcription Job for more details.

2024-12-06 / Java SDK

Version 2.5.0

Added

  • Added an option for setting the deployment configuration for the Asynchronous Speech-to-Text API Client and Language Identification API Client. See Rev AI Global Deployments for more details.

Added

  • Added Forced Alignment feature
  • Added Asynchronous Speech-to-Text API forced_alignment submission option.

Added

  • Premium Diarization feature goes out of Beta and becomes publicly available.

Added

  • Updated US deployment of the Asynchronous Speech-to-Text API to support new low-cost transcription using Reverb Turbo model.
  • It can be used by specifying "transcriber": "low_cost" in the request.
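
As a rough illustration of the request option described above, the sketch below builds a JSON request body that selects the low-cost transcriber. The source_config option appears elsewhere in this changelog; the nested "url" key and the example media URL are assumptions made for illustration, and no request is actually sent.

```python
import json


def build_low_cost_job(media_url: str) -> str:
    """Return a JSON request body that selects the low-cost transcriber."""
    payload = {
        # Assumed shape: source_config with a nested "url" key.
        "source_config": {"url": media_url},
        # From the changelog entry: selects the low-cost Reverb Turbo model.
        "transcriber": "low_cost",
    }
    return json.dumps(payload)


# Hypothetical media URL, used only to show the resulting body.
print(build_low_cost_job("https://example.com/audio.mp3"))
```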

2024-01-05 / Java SDK

Version 2.4.2

Fixed

  • Updated summarization parameter to use SummarizationModel for the Asynchronous Speech-to-Text API Client
  • Updated translation parameter to use TranslationModel for the Asynchronous Speech-to-Text API Client

Added

  • Updated US deployment of the Asynchronous Speech-to-Text API to support asynchronous Translation and Summarization

2023-12-28 / Java SDK

Version 2.4.0

Added

  • summarization and translation parameters to the Asynchronous Speech-to-Text API Client

Added

  • Updated US deployment of the Asynchronous Speech-to-Text API language submission option to support new languages: Afrikaans, Armenian, Azerbaijani, Belarusian, Bosnian, Estonian, Galician, Icelandic, Kannada, Kazakh, Macedonian, Marathi, Nepali, Serbian, Swahili, Tagalog, Thai, Ukrainian, Urdu, Vietnamese, Welsh, and multilingual English/Spanish.
  • Updated US deployment of the Asynchronous Speech-to-Text API HIPAA-supported language list to all languages: Afrikaans, Arabic, Armenian, Azerbaijani, Belarusian, Bosnian, Bulgarian, Catalan, Croatian, Czech, Danish, Dutch, English, Estonian, Farsi, Finnish, French, Galician, German, Greek, Hebrew, Hindi, Hungarian, Icelandic, Indonesian, Italian, Japanese, Kannada, Kazakh, Korean, Latvian, Lithuanian, Macedonian, Malay, Mandarin, Marathi, Nepali, Norwegian, Polish, Portuguese, Romanian, Russian, Serbian, Slovak, Slovenian, Spanish, Swahili, Swedish, Tagalog, Tamil, Telugu, Thai, Turkish, Ukrainian, Urdu, Vietnamese, Welsh, and multilingual English/Spanish.
  • Updated US deployment of the Asynchronous Speech-to-Text API media file duration limits. All languages except Telugu support file durations of up to 17 hours, and Telugu supports up to 6 hours.
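
The duration limits above could be checked on the client before submitting a file. The following is a minimal sketch; the "te" language code for Telugu and the helper names are assumptions, not part of the API.

```python
TELUGU_LIMIT_HOURS = 6    # from the changelog entry
DEFAULT_LIMIT_HOURS = 17  # from the changelog entry


def max_duration_hours(language: str) -> int:
    """Return the media duration limit, in hours, for a language code."""
    # "te" as the Telugu language code is an assumption.
    return TELUGU_LIMIT_HOURS if language.lower() == "te" else DEFAULT_LIMIT_HOURS


def is_within_limit(language: str, duration_seconds: float) -> bool:
    """Check a media duration against the limit for its language."""
    return duration_seconds <= max_duration_hours(language) * 3600


print(is_within_limit("en", 10 * 3600))
print(is_within_limit("te", 7 * 3600))
```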

2023-10-31 / Node SDK

Version 3.7.0

Added

  • Added diarization_type parameter to Asynchronous Speech-To-Text API job options.

Added

  • Updated Asynchronous Speech-to-Text API diarization_type submission option.

Changed

  • Updated Asynchronous Speech-to-Text API to support English US (en-us) and English UK (en-gb) language values.

Added

  • Updated Asynchronous Speech-to-Text API speakers_count submission option.

Added

  • (Open Beta) Added Forced Alignment API documentation

Added

  • Updated Asynchronous Speech-to-Text API speaker_channels_count submission option documentation to include valid languages (en, es, fr).

2023-03-29 / Node SDK

Version 3.6.1

Added

  • Added enable_speaker_switch parameter to Streaming Speech-To-Text API job options.

2023-03-29 / Node SDK

Version 3.6.2

Fixed

  • Fixed an issue with handling API error responses when the error object response property is undefined.

Added

  • Added a new job submission remove_atmospherics option to the Asynchronous Speech-to-Text API. This option enables you to remove atmospherics, such as <laugh> and <affirmative>, from the ASR output.
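
To show the effect the option is meant to have, here is a client-side approximation that strips angle-bracket tags from a transcript string. It is only a sketch: the tag pattern is an assumption, and the API performs its own removal when remove_atmospherics is set.

```python
import re

# Assumed tag shape: lowercase letters or underscores inside angle brackets.
ATMOSPHERIC_TAG = re.compile(r"\s*<[a-z_]+>")


def strip_atmospherics(text: str) -> str:
    """Remove tags such as <laugh> or <affirmative> from transcript text."""
    return ATMOSPHERIC_TAG.sub("", text).strip()


print(strip_atmospherics("That was funny <laugh> yes <affirmative>"))
```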

Changed

  • Changed the verbatim submission option for asynchronous transcription jobs. It can now be used with both machine and human transcribers. The default value depends on the transcriber:
    • machine: the default is true. To turn it off, false must be explicitly provided.
    • human: the default is false. To turn it on, true must be explicitly provided.
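
The transcriber-dependent default for verbatim can be sketched as a small lookup. The helper name is hypothetical; only the two defaults come from the entry above.

```python
from typing import Optional

# Defaults as described in the changelog entry.
VERBATIM_DEFAULTS = {"machine": True, "human": False}


def effective_verbatim(transcriber: str, verbatim: Optional[bool] = None) -> bool:
    """Return the verbatim value a job would use for a given transcriber."""
    if verbatim is not None:
        # An explicitly provided value always wins over the default.
        return verbatim
    return VERBATIM_DEFAULTS[transcriber]


print(effective_verbatim("machine"))
print(effective_verbatim("human"))
print(effective_verbatim("human", True))
```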

2022-11-14 / Node SDK

Version 3.6.0

Added

  • Added functionality to enable the caller to specify which Rev AI region to use for submission. Use the new RevAiApiClientConfig when constructing your RevAiApiClient. This is available for the Asynchronous Speech-To-Text API, Streaming Speech-To-Text API, and Language Id API clients. See https://github.com/revdotcom/revai-node-sdk#usage for details
  • Added speaker_names parameter to Asynchronous Speech-To-Text API job options. This is available for Human Transcription only.
  • Added skip_postprocessing parameter to both Asynchronous Speech-To-Text API and Streaming Speech-To-Text API.

Changed

  • Streams are now billed based on the maximum of stream duration and audio duration. Refer to the Billing section for more details.
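
The billing rule above amounts to taking the larger of the two durations. A minimal sketch, with a hypothetical helper name and durations expressed in seconds as an assumption:

```python
def billable_seconds(stream_duration: float, audio_duration: float) -> float:
    """Return the billed duration: the larger of stream and audio duration."""
    return max(stream_duration, audio_duration)


# A stream held open for 90 s that carried 60 s of audio is billed for 90 s.
print(billable_seconds(90.0, 60.0))
```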

Added

2022-09-24 / Rev AI API

Changed

  • All Rev AI job identifiers are now 16 characters in length (increased from 12 characters previously).

Added

  • Added support for Streaming Speech-to-Text API (English only) in the European Union deployment.

Removed

  • Deprecated machine_v2 as an option for transcriber. Using machine is now the recommended option. Usage of machine_v2 will silently route to machine.
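
Client code that still passes machine_v2 keeps working because of the silent routing, but it can be normalized locally. The helper below is a hypothetical sketch that mirrors the routing described above; it does not reproduce any server-side logic.

```python
def resolve_transcriber(requested: str) -> str:
    """Map the deprecated machine_v2 value to the recommended machine value."""
    return "machine" if requested == "machine_v2" else requested


print(resolve_transcriber("machine_v2"))
print(resolve_transcriber("human"))
```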

Added

Added

  • Added support for asynchronous non-English Speech-to-Text transcription in the European Union deployment.

Added

  • Added a new priority option to the Streaming Speech-to-Text API. Possible values are speed and accuracy. Only available for English and Spanish languages with machine_v2 transcriber.
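
A sketch of how the priority option might be validated and encoded is shown below. Passing the options as query parameters and the "en"/"es" language codes are assumptions; the allowed priority values and the machine_v2 requirement come from the entry above.

```python
from urllib.parse import urlencode

VALID_PRIORITIES = {"speed", "accuracy"}
# Language codes are an assumption; the changelog names English and Spanish.
PRIORITY_LANGUAGES = {"en", "es"}


def build_stream_query(priority: str, language: str = "en") -> str:
    """Return a query string for a stream that sets the priority option."""
    if priority not in VALID_PRIORITIES:
        raise ValueError(f"priority must be one of {sorted(VALID_PRIORITIES)}")
    if language not in PRIORITY_LANGUAGES:
        raise ValueError("priority is only available for English and Spanish")
    params = {
        "language": language,
        "transcriber": "machine_v2",  # required by this changelog entry
        "priority": priority,
    }
    return urlencode(params)


print(build_stream_query("accuracy", "es"))
```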

Changed

  • The balance_seconds response value is deprecated and replaced with the free_balance, purchased_balance, total_balance and invoiced_balance values. The balance_seconds value will continue to be included in the response but will always have a value of 0.
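
Code that previously read balance_seconds can switch to the newer fields. The sketch below uses a hypothetical response body with arbitrary numbers; only the field names come from the entry above.

```python
BALANCE_FIELDS = (
    "free_balance",
    "purchased_balance",
    "total_balance",
    "invoiced_balance",
)


def read_balances(account: dict) -> dict:
    """Pick the newer balance fields out of an account response."""
    return {name: account.get(name) for name in BALANCE_FIELDS}


# Hypothetical response body; the numbers are arbitrary.
sample = {
    "balance_seconds": 0,  # deprecated: always 0 per the changelog
    "free_balance": 100,
    "purchased_balance": 200,
    "total_balance": 300,
    "invoiced_balance": 0,
}
print(read_balances(sample))
```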

2022-08-02 / Java SDK

Version 2.3.2

Added

  • skip_punctuation parameter to the Streaming Speech-to-Text API Client
  • skip_punctuation parameter to the Asynchronous Speech-to-Text API Client

2022-08-02 / Python SDK

Version 2.17.1

Added

  • skip_punctuation parameter to the Streaming Speech-to-Text API Client
  • skip_punctuation parameter to the Asynchronous Speech-to-Text API Client

Changed

  • Streaming Speech-to-Text API v2 moved from Open Beta to General Availability

Added

  • Added a new job submission skip_postprocessing option to the Asynchronous Speech-to-Text API. This option enables you to skip the post-processing steps (inverse text normalization or ITN, casing and punctuation) of a transcription job.

Changed

  • Human Transcription feature
    • segments_to_transcribe minimum segment length lowered from 2 minutes to 1 minute
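
The one-minute minimum can be checked before a job is submitted. In this sketch the segment shape (a dict with "start" and "end" in seconds) is an assumption made for illustration, and the helper name is hypothetical.

```python
MIN_SEGMENT_SECONDS = 60  # minimum segment length from the changelog entry


def short_segments(segments: list) -> list:
    """Return the segments that are shorter than the one-minute minimum."""
    return [s for s in segments if s["end"] - s["start"] < MIN_SEGMENT_SECONDS]


# Assumed segment shape: start and end offsets in seconds.
segments = [
    {"start": 0.0, "end": 90.0},     # 90 s, long enough
    {"start": 120.0, "end": 150.0},  # 30 s, too short
]
print(short_segments(segments))
```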

Changed

  • Multipart/form-data submission no longer requires options parameter

2022-06-08 / Java SDK

Version 2.3.1

Fixed

  • Error response when submitting forbidden parameters, such as media_url for HIPAA accounts. See https://docs.rev.ai/resources/tutorials/introduction-to-auth-options/ for proper usage.

2022-06-08 / Node SDK

Version 3.5.1

Fixed

  • Response when submitting forbidden parameters, such as media_url for HIPAA accounts. See https://docs.rev.ai/resources/tutorials/introduction-to-auth-options/ for proper usage.

2022-06-03 / Java SDK

Version 2.3.0

Added

Added

2022-06-01 / Node SDK

Version 3.5.0

Added

2022-05-26 / Node SDK

Version 3.4.0

Added

Added

  • Added machine_v2 transcriber. Routes to our new improved v2 model.
  • Added enable_speaker_switch option. Only available for v2 streams.

Added

  • Added a new skip_postprocessing option to the Streaming Speech-to-Text API. This option allows you to skip the post-processing steps (inverse text normalization or ITN, casing and punctuation) of a transcription job. Only available for English and Spanish languages.

Changed

  • Language Identification API moved from Open Beta to General Availability

2022-05-20 / Java SDK

Version 2.2.0

Added

  • language parameter to the Streaming Speech-to-Text API Client

Changed

  • The 8 languages for the Streaming Speech-to-Text API are out of Open Beta and in General Availability: French, German, Italian, Japanese, Korean, Mandarin, Portuguese, and Spanish.

2022-05-19 / Java SDK

Version 2.1.0

Added

Deprecated

2022-05-19 / Python SDK

Version 2.17.0

Added

  • language parameter to the Streaming Speech-to-Text API Client

2022-05-17 / Node SDK

Version 3.3.0

Added

  • language parameter to the Streaming Speech-to-Text API Client

2022-05-13 / Node SDK

Version 3.2.0

Added

Deprecated

2022-05-13 / Python SDK

Version 2.16.0

Added

Added

Deprecated

Added

  • source_config as a replacement for the deprecated media_url to provide a source URL for a job
  • Support for authorization headers when accessing URLs for source_config

Added

  • Added v1 route for Language Identification API: languageid/v1

2022-04-25 / Python SDK

Version 2.15.0

Added

Added

  • Added 8 new languages to the Streaming Speech-to-Text API in Open Beta: French, German, Italian, Japanese, Korean, Mandarin, Portuguese, and Spanish.

Added

  • RTMP audio streaming documentation

Added

Added

  • (Open Beta) Added Language Identification API documentation

Added

  • custom_vocabulary as a possible failure for a failed job

2022-02-07 / Node SDK

Version 3.0.0

Added

  • transcriber to asynchronous client
  • verbatim, rush, test_mode and segments_to_transcribe options to asynchronous client for human transcription
  • start_ts and transcriber to streaming client

Fixed

  • Fixed a bug where binary data containing bytes equivalent to the string "EOS" prematurely ended the streaming session

2022-02-07 / Node SDK

Version 3.1.0

Added

  • Support for Node 14, 16 and 17 (supported versions now include: 8, 10, 12, 14, 16, 17)
  • custom_vocabulary_id to asynchronous client
  • detailed_partials to streaming client

2022-02-01 / Java SDK

Version 1.14.0

Added

  • custom_vocabulary_id and transcriber to asynchronous client
  • verbatim, rush, test_mode and segments_to_transcribe options to asynchronous client for human transcription
  • detailed_partials, start_ts and transcriber to streaming client

2022-01-31 / Python SDK

Version 2.14.0

Added

  • transcriber to asynchronous client
  • verbatim, rush, test_mode and segments_to_transcribe options to asynchronous client for human transcription
  • start_ts and transcriber to streaming client

Added

  • (Open Beta) machine_v2 as an option for transcriber to run our Reverb ASR model for improved Word Error Rate.

Added

  • Human Transcription feature
    • (Open Beta) transcriber option to enable asynchronous transcription job submissions to be transcribed by a human.
    • (Open Beta) verbatim option to enable asynchronous transcription job submissions
    • (Open Beta) rush option to enable asynchronous transcription job submissions
    • (Open Beta) segments_to_transcribe option to enable asynchronous transcription job submissions

Added

  • (Open Beta) Added Sentiment Analysis API documentation

Added

  • (Open Beta) Added Topic Extraction API documentation

2021-10-12 / Python SDK

Version 2.13.0

Added

  • detailed_partials parameter to the streaming client
  • CI now runs on GitHub Actions. This replaces Travis CI.

2021-10-06 / Node SDK

Version 2.6.2

Fixed

  • Fixed a bug where the HTTP client library was artificially lowering the max file size for multipart upload to 10MB. The API limit is 2GB. More information in revai-node-sdk issue #72.

Added

  • Ability to rotate access tokens

Added

  • (Open Beta) Support for transcription for more languages

Added

  • Enable offsetting hypotheses timestamps by providing start_ts to streaming jobs
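
When start_ts is provided, the offset is applied to the hypotheses that are returned. The sketch below shows the equivalent arithmetic on the client side; the element shape with "ts" and "end_ts" keys is an assumption made for illustration.

```python
def offset_elements(elements: list, start_ts: float) -> list:
    """Return copies of the elements with their timestamps shifted by start_ts."""
    shifted = []
    for element in elements:
        moved = dict(element)  # copy so the input is left untouched
        for key in ("ts", "end_ts"):
            if key in moved:
                moved[key] = moved[key] + start_ts
        shifted.append(moved)
    return shifted


# Assumed element shape; punctuation elements carry no timestamps here.
elements = [
    {"type": "text", "value": "hello", "ts": 1.5, "end_ts": 2.0},
    {"type": "punct", "value": "."},
]
print(offset_elements(elements, 10.0))
```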

Changed

  • Max allowed stream duration increased from 2 to 3 hours

2021-03-09 / Python SDK

Version 2.12.0

Added

  • custom_vocabulary_id option to enable job submission with the id of a pre-submitted custom vocabulary

2021-02-10 / Node SDK

Version 2.6.2

Fixed

  • Bug fixes for streaming client

Changed

  • language job option is out of Open Beta and in General Availability.

Added

  • custom_vocabularies job option support for Rev AI's non-English languages. These are French, German, Portuguese and Spanish.

Added

  • Non-English transcription is now limited to audio of 12 hours or less.

2021-01-29 / Node SDK

Version 2.6.1

Fixed

  • Bug fix for streaming client crash on unsafeEnd

2021-01-17 / Java SDK

Version 1.3.0

Added

  • language job option to the Asynchronous Speech-to-Text API. Transcribe audio in languages other than English. See Asynchronous Speech-to-Text API docs for the full list of supported languages.

2021-01-17 / Python SDK

Version 2.11.0

Added

  • language job option to the Asynchronous Speech-to-Text API. Transcribe audio in languages other than English. See Asynchronous Speech-to-Text API docs for the full list of supported languages.
  • Relax dependency pinned version requirements.

2021-01-15 / Node SDK

Version 2.6.0

Added

Changed

  • Reverted minor breaking change introduced on November 9 involving job failure types. "duration_out_of_range" failure type has been reverted to "duration_exceeded" and a new failure type of "duration_too_short" was introduced to cover the minimum case. See get job endpoint documentation response schema for full enum of failures.
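
Client code that inspects job failure types may want to handle all three duration-related values, since "duration_out_of_range" was returned for a period before the revert. A minimal sketch, with hypothetical helper and description strings:

```python
from typing import Optional

# Failure type values named in this changelog; the descriptions are our own.
DURATION_FAILURES = {
    "duration_exceeded": "media is too long",
    "duration_too_short": "media is too short",
    "duration_out_of_range": "media is too short or too long (reverted value)",
}


def describe_duration_failure(failure: str) -> Optional[str]:
    """Return a description for duration-related failures, else None."""
    return DURATION_FAILURES.get(failure)


print(describe_duration_failure("duration_too_short"))
print(describe_duration_failure("custom_vocabulary"))
```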

Added

Changed

  • Changed "duration_exceeded" job failure type to "duration_out_of_range" to account for both too short and too long durations of files.

2020-09-01 / Node SDK

Version 2.5.0

Added

  • delete_after_seconds option for both Streaming and Asynchronous Speech-to-Text APIs

Fixed

  • Bug in Streaming Speech-to-Text API client where the client closed the WebSocket connection after 1 minute of not sending any data

Added

  • (Open Beta) custom_vocabulary_id option to enable job submission with the id of a pre-completed custom vocabulary

Added

  • (Open Beta) detailed_partials option to show timestamps and confidence scores in partial hypotheses

2020-07-22 / Java SDK

Version 1.1.0

Added

  • CustomVocabularyClient: Interact with the Custom Vocabulary API for pre-uploading custom vocabulary
  • remove_disfluencies option for both Asynchronous and Streaming Speech-to-Text API clients
  • filter_profanity option for streaming client

2020-07-22 / Python SDK

Version 2.9.0

Added

  • delete_custom_vocabulary(id): Delete your custom vocabulary by id
  • get_list_of_custom_vocabularies(): Get a list of recent custom vocabulary submissions' information
  • remove_disfluencies job option for the streaming client: Remove filler words (disfluencies) from the resulting transcript. This option was previously available for the Asynchronous Speech-to-Text API client.

Changed

  • Improved examples

Fixed

  • Bug fixes and improvements

2020-07-01 / Node SDK

Version 2.4.0

Added

  • deleteCustomVocabulary(id): Delete your custom vocabulary by id
  • getListOfCustomVocabularyInformations(): Get a list of recent custom vocabulary submissions' information
  • remove_disfluencies job option for Streaming Speech-to-Text API: Remove filler words (disfluencies) from the resulting transcript. This option was previously available for the Asynchronous Speech-to-Text API.

Changed

  • Improved examples

Fixed

  • Bug fixes and improvements

Added

  • (Closed Beta) Option to show timestamps and confidence scores in partial hypotheses. Email support@rev.ai for access.
  • (Closed Beta) Stream to Rev AI with RTMP. Email support@rev.ai for access.

2020-05-23 / Java SDK

Version 1.0.0

Added

  • Initial release of the Java SDK available on Maven Central Repository

Changes made before the oldest date in this document are not recorded in this changelog.