# Get Started with Speech Recognition in Python

**By Vikram Vaswani, Developer Advocate - September 27, 2022**

## Introduction

Rev AI's speech-to-text APIs power automatic speech recognition in thousands of applications and services. To make it easier for developers to integrate these APIs into their applications, Rev AI also offers [SDKs for many programming languages](/sdk), including the topic of this tutorial, Python.

In this tutorial, I'll introduce you to the basics of using Rev AI's [Asynchronous Speech-to-Text API](/api/asynchronous) using Python and the [Rev AI Python SDK](/sdk/python). If you've ever wondered how to integrate speech recognition capabilities with your Python application, this tutorial will give you all the information you need to get started.

## Assumptions

This tutorial assumes that:

- You have a Rev AI account and access token. If not, [sign up for a free account](https://www.rev.ai/auth/signup) and [generate an access token](/get-started#step-1-get-your-access-token).
- You have a properly configured Python development environment with Python 3.x. If not, [download and install Python](https://www.python.org/downloads/) for your operating system.
- You have installed pip, the Python dependency manager. If not, [download and install pip](https://pip.pypa.io/en/stable/installation/) for your operating system.
- You have an audio file to transcribe. If not, use this [example audio file from Rev AI](https://www.rev.ai/FTC_Sample_1.mp3).


## Step 1: Install the SDK

This tutorial will use the [Rev AI Python SDK](/sdk/python) to submit transcription requests to the Rev AI [Asynchronous Speech-to-Text API](/api/asynchronous).

Begin by installing the SDK with pip:


```bash
pip install --upgrade rev_ai
```

Within your application code, initialize the Rev AI API client as below. Replace the `<REVAI_ACCESS_TOKEN>` placeholder with your Rev AI access token:


```python
from rev_ai import apiclient

# configure access token
token = "<REVAI_ACCESS_TOKEN>"

# initialize Rev AI API client
client = apiclient.RevAiAPIClient(token)
```

Here, the Rev AI API client is automatically initialized with the base endpoint for the Asynchronous Speech-to-Text API, which is `https://api.rev.ai/speechtotext/v1/`.

Every request to the API must be in JSON format and must include an `Authorization` header containing the API access token. The Rev AI Python SDK automatically takes care of attaching this required header to all its client requests.
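To make the role of the `Authorization` header concrete, here is a minimal sketch of a request built by hand with Python's standard library instead of the SDK. The `build_request()` helper is a hypothetical name of my own, and the sketch assumes the access token is sent using the `Bearer` scheme; it only constructs the request object and does not send anything over the network.

```python
import urllib.request

# base endpoint for the Asynchronous Speech-to-Text API (from this tutorial)
BASE_URL = "https://api.rev.ai/speechtotext/v1/"

def build_request(token, path):
    """Build (but do not send) a GET request carrying the access token.

    Hypothetical helper for illustration; assumes the Bearer scheme.
    """
    return urllib.request.Request(
        BASE_URL + path,
        headers={"Authorization": "Bearer " + token},
        method="GET",
    )

# inspect the request that would be sent
request = build_request("<REVAI_ACCESS_TOKEN>", "jobs")
print(request.get_method(), request.full_url)
print(request.get_header("Authorization"))
```

With the SDK, none of this is necessary: the client attaches the header for you on every call.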

## Step 2: Submit a file for transcription

To generate a transcript from an audio file, you must submit an HTTP POST request to the API endpoint at `https://api.rev.ai/speechtotext/v1/jobs`. The Rev AI Python SDK simplifies this process with two methods: `submit_job_local_file()` and `submit_job_url()`, for local and remote files respectively.

The following example demonstrates how to submit a local audio file for transcription.

To use this example, replace the `<FILEPATH>` placeholder with the path to the file you wish to transcribe and the `<REVAI_ACCESS_TOKEN>` placeholder with your Rev AI account's access token.


```python
from rev_ai import apiclient

# configure access token and audio source
token = "<REVAI_ACCESS_TOKEN>"
filepath = "<FILEPATH>"

# initialize Rev AI API client
client = apiclient.RevAiAPIClient(token)

# submit a file for transcription
job = client.submit_job_local_file(filepath)

# get job id
job_id = job.id
print("Job submitted with id: " + job_id)
```

To run this example, save it as a file, such as `example.py`, and then execute `python example.py`.

In this example, the API client internally makes a POST request to the API, passing it the audio file to be transcribed. The response body is then received and converted into a Python object.

Here is an example of the API response, represented as a Python object:


```python
{'callback_url': (None,),
 'completed_on': None,
 'created_on': '2022-09-14T14:43:35.46Z',
 'custom_vocabulary_id': None,
 'delete_after_seconds': None,
 'duration_seconds': None,
 'failure': None,
 'failure_detail': None,
 'filter_profanity': None,
 'id': 'xsDRpD6ladtf',
 'language': 'en',
 'media_url': None,
 'metadata': None,
 'name': 'myfile.mp3',
 'remove_disfluencies': None,
 'rush': None,
 'segments_to_transcribe': None,
 'skip_diarization': None,
 'skip_punctuation': None,
 'speaker_channels_count': None,
 'status': <JobStatus.IN_PROGRESS: 1>,
 'transcriber': None,
 'verbatim': None}
```

The API response contains a job identifier (`id` field). This job identifier will be required to check the job status and obtain the job result.

It is also possible to use a remote audio file, as shown in the following example:


```python
from rev_ai import apiclient

# configure access token and audio source
token = "<REVAI_ACCESS_TOKEN>"
url = "<URL>"

# initialize Rev AI API client
client = apiclient.RevAiAPIClient(token)

# submit a file for transcription
job = client.submit_job_url(url)

# get job id
job_id = job.id
print("Job submitted with id: " + job_id)
```

[Learn more about submitting an asynchronous transcription job in the API reference guide](/api/asynchronous).
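If your application accepts both local paths and URLs, the two submission methods can be wrapped in one small function. The following is a sketch only: `submit_media()` is a hypothetical helper, not part of the SDK, and it assumes that any source beginning with `http://` or `https://` is a remote file and everything else is a local path. It expects a client object that provides the `submit_job_url()` and `submit_job_local_file()` methods shown above.

```python
def submit_media(client, source):
    """Submit a URL or a local file path using the matching client method.

    Hypothetical helper; `client` is expected to be an initialized
    Rev AI API client (or any object with the same two methods).
    """
    # assumption: sources starting with http(s):// are remote files
    if source.lower().startswith(("http://", "https://")):
        return client.submit_job_url(source)
    # anything else is treated as a path on the local filesystem
    return client.submit_job_local_file(source)
```

For example, `submit_media(client, "<URL>")` and `submit_media(client, "<FILEPATH>")` would both return a job object whose `id` can be used in the following steps.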

## Step 3: Check transcription status

To check the status of the transcription job, you must submit an HTTP GET request to the API endpoint at `https://api.rev.ai/speechtotext/v1/jobs/<ID>`, where `<ID>` is a placeholder for the job identifier. Again, the Rev AI Python SDK makes this easy with its `get_job_details()` method, which accepts a job identifier as input and returns the current status of the job as a Python object.

The following example demonstrates how to check the status of an asynchronous transcription job.

To use this example, replace the `<ID>` placeholder with the job identifier and the `<REVAI_ACCESS_TOKEN>` placeholder with your Rev AI account's access token.


```python
from rev_ai import apiclient

# configure access token and job identifier
token = "<REVAI_ACCESS_TOKEN>"
job_id = "<ID>"

# initialize Rev AI API client
client = apiclient.RevAiAPIClient(token)

# check job status
status = client.get_job_details(job_id)

# print response object
print(vars(status))
```

Here is an example of the response object received after the job has completed:


```python
{'callback_url': (None,),
 'completed_on': '2022-09-14T14:44:09.774Z',
 'created_on': '2022-09-14T14:43:35.46Z',
 'custom_vocabulary_id': None,
 'delete_after_seconds': None,
 'duration_seconds': 107.0,
 'failure': None,
 'failure_detail': None,
 'filter_profanity': None,
 'id': 'xsDRpD6ladtf',
 'language': 'en',
 'media_url': None,
 'metadata': None,
 'name': 'myfile.mp3',
 'remove_disfluencies': None,
 'rush': None,
 'segments_to_transcribe': None,
 'skip_diarization': None,
 'skip_punctuation': None,
 'speaker_channels_count': None,
 'status': <JobStatus.TRANSCRIBED: 2>,
 'transcriber': None,
 'verbatim': None}
```

[Learn more about retrieving the status of an asynchronous transcription job in the API reference guide](/api/asynchronous).
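For quick local experiments, the status check can be repeated until the job finishes. The sketch below is one possible way to do that with an upper time limit, so that a stuck job does not keep a script waiting forever. `wait_for_job()` and its `interval`, `timeout` and `sleep` parameters are my own hypothetical names, not SDK features; the function relies only on the `get_job_details()` method and the `status.name` attribute described in this tutorial.

```python
import time

def wait_for_job(client, job_id, interval=30, timeout=600, sleep=time.sleep):
    """Poll a job until it leaves IN_PROGRESS or the timeout is reached.

    Hypothetical helper; returns the final job details object.
    """
    waited = 0
    while True:
        details = client.get_job_details(job_id)
        # any status other than IN_PROGRESS means the job has finished
        if details.status.name != "IN_PROGRESS":
            return details
        # give up once the total waiting time reaches the limit
        if waited >= timeout:
            raise TimeoutError(
                "Job " + job_id + " still in progress after " + str(waited) + " seconds"
            )
        sleep(interval)
        waited += interval
```

The caller would then inspect `details.status.name` to see whether the job ended as `TRANSCRIBED` or `FAILED`. The `sleep` parameter exists only so that the waiting behavior can be replaced in tests.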

## Step 4: Retrieve the transcript

Once the job's `status` changes to `TRANSCRIBED`, you can retrieve the results by submitting an HTTP GET request to the API endpoint at `https://api.rev.ai/speechtotext/v1/jobs/<ID>/result`, where `<ID>` is a placeholder for the job identifier. The Rev AI Python SDK offers three methods for this: `get_transcript_text()`, `get_transcript_json()` and `get_transcript_object()`, which return the transcript as plaintext, JSON and a Python object respectively.

The following example demonstrates how to retrieve the results of an asynchronous transcription job.

To use this example, replace the `<ID>` placeholder with the job identifier and the `<REVAI_ACCESS_TOKEN>` placeholder with your Rev AI account's access token.


```python
from rev_ai import apiclient

# configure access token and job identifier
token = "<REVAI_ACCESS_TOKEN>"
job_id = "<ID>"

# initialize Rev AI API client
client = apiclient.RevAiAPIClient(token)

# get transcript
transcript = client.get_transcript_json(job_id)

# print transcript
print(transcript)
```

Here is an example of the transcript returned from a successful job, represented as JSON:


```json
{
  "monologues": [
    {
      "speaker": 0,
      "elements": [
        {
          "type": "text",
          "value": "Hi",
          "ts": 0.17,
          "end_ts": 0.52,
          "confidence": 1
        },
        {
          "type": "punct",
          "value": ","
        },
        {
          "type": "punct",
          "value": " "
        },
        {
          "type": "text",
          "value": "my",
          "ts": 0.52,
          "end_ts": 0.76,
          "confidence": 1
        },
        ...
      ]
    },
    ...
  ]
}
```

[Learn more about obtaining a transcript in the API reference guide](/api/asynchronous).
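The JSON transcript is organized as a list of monologues, each holding a speaker number and a list of elements. As a rough sketch of how this structure can be processed, the following hypothetical `transcript_to_text()` function joins the `value` of every element to produce one line of text per monologue. It assumes the transcript has already been loaded into a Python dictionary with the shape shown above; the sample data is a shortened copy of that example.

```python
def transcript_to_text(transcript):
    """Flatten a transcript dictionary into one line of text per monologue."""
    lines = []
    for monologue in transcript.get("monologues", []):
        # both "text" and "punct" elements carry their content in "value"
        text = "".join(element["value"] for element in monologue.get("elements", []))
        lines.append("Speaker " + str(monologue.get("speaker")) + ": " + text)
    return "\n".join(lines)

# shortened sample based on the transcript shown above
sample = {
    "monologues": [
        {
            "speaker": 0,
            "elements": [
                {"type": "text", "value": "Hi", "ts": 0.17, "end_ts": 0.52, "confidence": 1},
                {"type": "punct", "value": ","},
                {"type": "punct", "value": " "},
                {"type": "text", "value": "my", "ts": 0.52, "end_ts": 0.76, "confidence": 1},
            ],
        }
    ]
}

print(transcript_to_text(sample))  # prints "Speaker 0: Hi, my"
```

If you only need plaintext and no timestamps or confidence values, the SDK's `get_transcript_text()` method is the simpler choice.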

## Step 5: Create and test a simple application

Using the code samples shown previously, it's possible to create a simple application that accepts an audio file URL and returns a transcript, as shown below:


```python
from rev_ai import apiclient
from time import sleep

def main(token, url):
  # initialize Rev AI API client
  client = apiclient.RevAiAPIClient(token)

  # submit a file for transcription
  job = client.submit_job_url(url)

  # get job id
  job_id = job.id
  print("Job submitted with id: " + job_id)

  # poll job status until the job finishes
  while True:
    details = client.get_job_details(job_id)
    print("Job status: " + details.status.name)
    # if successful, print result
    if details.status.name == 'TRANSCRIBED':
      print(client.get_transcript_json(job_id))
      break
    # if unsuccessful, print error
    if details.status.name == 'FAILED':
      print("Job failed: " + str(details.failure_detail))
      break
    # wait before checking again
    sleep(30)

token = "<REVAI_ACCESS_TOKEN>"
url = "<URL>"
main(token, url)
```

This example application begins by initializing an instance of the `RevAiAPIClient` object, passing the Rev AI access token to the object constructor. It then submits a remote file for transcription using the object's `submit_job_url()` method. Next, it uses the `get_job_details()` method to poll the API every 30 seconds to obtain the status of the job. Once the job status changes to `TRANSCRIBED`, it uses the `get_transcript_json()` method to retrieve the transcript and prints it to the console. If the job status changes to `FAILED` instead, it prints the failure details and stops.

Here is an example of the output generated by the example application:


```bash
Job submitted with id: XyHxoqX5cH5A
Job status: IN_PROGRESS
Job status: IN_PROGRESS
Job status: TRANSCRIBED
{'monologues': [{'speaker': 0, 'elements': [{'type': 'text', 'value': 'Hi', 'ts': 0.17, 'end_ts': 0.52, 'confidence': 1.0}, {'type': 'punct', 'value': ','}, ...]}, ..., ]}
```

The example above polls the API repeatedly to check the status of the transcription job. This is presented only for illustrative purposes and is **strongly discouraged** in production scenarios. For production scenarios, use [webhooks](/api/asynchronous/webhooks) to asynchronously receive notifications once the job completes.
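To give a rough idea of what the receiving side of a webhook could look like, here is a minimal sketch using only Python's standard library. Treat the details as assumptions to verify against the webhooks documentation: the sketch assumes the notification arrives as an HTTP POST whose JSON body contains a `job` object with `id` and `status` fields, and the names `parse_job_notification()` and `WebhookHandler` are hypothetical.

```python
import json
from http.server import BaseHTTPRequestHandler

def parse_job_notification(body):
    """Extract the job id and status from a webhook request body.

    Assumes a JSON body shaped like {"job": {"id": ..., "status": ...}}.
    """
    payload = json.loads(body)
    job = payload.get("job", {})
    return job.get("id"), job.get("status")

class WebhookHandler(BaseHTTPRequestHandler):
    """Minimal handler that logs each notification and acknowledges it."""

    def do_POST(self):
        # read exactly as many bytes as the sender declared
        length = int(self.headers.get("Content-Length", 0))
        job_id, status = parse_job_notification(self.rfile.read(length))
        print("Job " + str(job_id) + " finished with status: " + str(status))
        # acknowledge receipt with an empty 200 response
        self.send_response(200)
        self.end_headers()
```

To try the handler locally, it could be served with `HTTPServer(("", 8000), WebhookHandler).serve_forever()` from the `http.server` module. The job would also need to be submitted with the publicly reachable URL of this server so that the API knows where to send the notification; refer to the webhooks documentation for how to supply it.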

## Next steps

Learn more about the topics discussed in this tutorial by visiting the following links:

- Documentation: [Asynchronous Speech-To-Text API job submission](/api/asynchronous)
- Documentation: [Python SDK](/sdk/python)
- Documentation: [Asynchronous Speech-To-Text API best practices](/api/asynchronous/best-practices)
- Code samples: [Asynchronous Speech-To-Text API](/api/asynchronous/code-samples) and [Python SDK](/sdk/python/code-samples)
- Tutorial: [Get Started with Rev AI API Webhooks](/resources/tutorials/get-started-api-webhooks)