Get Started with Speech Recognition in PHP
By Vikram Vaswani, Developer Advocate - May 31, 2022
Introduction
Rev AI offers a suite of speech-to-text APIs to help developers build automatic speech recognition (ASR) into their applications. These APIs cover a variety of use cases, including live and pre-recorded audio transcription, language identification, sentiment analysis and topic extraction.
To help developers integrate these APIs into their applications, Rev AI also offers SDKs for Node, Java and Python. However, because most of these APIs are REST APIs, it's also easy to use them with other languages, including one of my most frequently used ones: PHP.
In this tutorial, I'll introduce you to the basics of using Rev AI's Asynchronous Speech-to-Text API with PHP and Guzzle. If you've ever wondered whether you could add speech recognition to your PHP application but didn't know where to start, this tutorial will give you the information you need to start making requests to, and handling responses from, Rev AI's ASR APIs.
Assumptions
This tutorial assumes that:
- You have a Rev AI account and access token. If not, sign up for a free account and generate an access token.
- You have a properly configured PHP development environment with PHP 7.4.x or PHP 8.0.x. If not, download and install PHP for your operating system.
- You have installed Composer, the PHP dependency manager. If not, download and install Composer for your operating system.
- You have an audio file to transcribe. If not, use this example audio file from Rev AI.
Step 1: Install Guzzle
The Asynchronous Speech-to-Text API is a REST API and, as such, you will need an HTTP client to interact with it. This tutorial uses Guzzle 7.x, a popular PHP HTTP client.
Begin by installing Guzzle into your application directory with Composer:
composer require guzzlehttp/guzzle:^7.0
Within your application code, initialize Guzzle as below. Replace the <REVAI_ACCESS_TOKEN> placeholder with your Rev AI access token:
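<?php
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;

$token = '<REVAI_ACCESS_TOKEN>';

// the trailing slash matters: without it, a relative path such as 'jobs'
// would replace the final 'v1' segment instead of being appended to it
$client = new Client([
    'base_uri' => 'https://api.rev.ai/speechtotext/v1/',
    'headers' => ['Authorization' => "Bearer $token"]
]);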
Here, the Guzzle HTTP client is initialized with the base endpoint for the Asynchronous Speech-to-Text API, which is https://api.rev.ai/speechtotext/v1/.
Every request to the API must be in JSON format and must include an Authorization header containing your API access token. The code shown above also attaches this required header to the client.
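Hard-coding the access token is convenient in a tutorial, but it is easy to leak a token that way. As a minimal sketch, the options array passed to the Guzzle client could be built by a small helper that takes the token from an environment variable. The function name buildRevAiClientOptions() and the variable name REVAI_ACCESS_TOKEN are assumptions made for this illustration; neither comes from Rev AI or Guzzle.

```php
<?php
// Hypothetical helper: builds the options array for GuzzleHttp\Client.
// The function name and the environment variable name are illustrative only.
function buildRevAiClientOptions(string $token): array
{
    if (trim($token) === '') {
        throw new InvalidArgumentException('Access token missing');
    }

    return [
        // trailing slash so that relative paths such as 'jobs' are appended to /v1/
        'base_uri' => 'https://api.rev.ai/speechtotext/v1/',
        'headers'  => ['Authorization' => "Bearer $token"],
    ];
}

// read the token from the environment (assumed variable name),
// falling back to the tutorial's placeholder
$token = getenv('REVAI_ACCESS_TOKEN') ?: '<REVAI_ACCESS_TOKEN>';
$options = buildRevAiClientOptions($token);

echo $options['base_uri'] . PHP_EOL;
```

The resulting array can be passed directly to the Guzzle constructor, as in new Client($options).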
Step 2: Submit a file for transcription
To generate a transcript from an audio file, you must begin by submitting an HTTP POST request to the API endpoint at https://api.rev.ai/speechtotext/v1/jobs.
The following example demonstrates how to submit a remote audio file for transcription.
To use this example, replace the <URL> placeholder with the public URL to the file you wish to transcribe and the <REVAI_ACCESS_TOKEN> placeholder with your Rev AI account's access token.
<?php
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;

$token = '<REVAI_ACCESS_TOKEN>';
$fileUrl = '<URL>';

// create client
$client = new Client([
    'base_uri' => 'https://api.rev.ai/speechtotext/v1/',
    'headers' => ['Authorization' => "Bearer $token"]
]);

// send POST request and get response body
$response = $client->request(
    'POST',
    'jobs',
    ['json' => ['source_config' => ['url' => $fileUrl]]]
)
    ->getBody()
    ->getContents();

// decode response JSON and print
print_r(json_decode($response));
This example makes a POST request to the API, passing it the URL to the audio file to be transcribed as a JSON document. The response body is then received, parsed and decoded into a PHP object and printed to the console.
To run this example, save it as a file, such as example.php, and then execute php example.php.
Here is an example of the script output, representing the API response:
stdClass Object
(
    [id] => sTfRgVlLCYkt
    [created_on] => 2022-04-06T13:35:40.6Z
    [name] => FTC_Sample_1.mp3
    [media_url] => https://www.rev.ai/FTC_Sample_1.mp3
    [status] => in_progress
    [type] => async
    [language] => en
)
The API response contains a job identifier (id field). This job identifier will be required to check the job status and obtain the job result.
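Because every later step depends on the job identifier, it can be worth validating the response before using it. The following is a minimal sketch, not part of any Rev AI SDK: extractJobId() is a hypothetical helper, and the JSON string is reconstructed from the print_r output shown above.

```php
<?php
// Response body reconstructed from the sample output in this tutorial.
$responseBody = '{"id":"sTfRgVlLCYkt","created_on":"2022-04-06T13:35:40.6Z",'
    . '"name":"FTC_Sample_1.mp3","media_url":"https://www.rev.ai/FTC_Sample_1.mp3",'
    . '"status":"in_progress","type":"async","language":"en"}';

// Hypothetical helper: returns the job ID or throws if it is missing.
function extractJobId(string $responseBody): string
{
    // JSON_THROW_ON_ERROR (PHP 7.3+) raises a JsonException on malformed JSON
    $job = json_decode($responseBody, false, 512, JSON_THROW_ON_ERROR);

    if (!is_object($job) || !isset($job->id) || !is_string($job->id)) {
        throw new UnexpectedValueException('Response does not contain a job id');
    }

    return $job->id;
}

echo extractJobId($responseBody) . PHP_EOL; // prints "sTfRgVlLCYkt"
```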
It is also possible to use a local audio file and submit it to the API as multipart/form-data.
The following example demonstrates how to submit a local audio file for transcription.
To use this example, replace the <FILEPATH> placeholder with the path to the file you wish to transcribe and the <REVAI_ACCESS_TOKEN> placeholder with your Rev AI account's access token.
<?php
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;

$token = '<REVAI_ACCESS_TOKEN>';
$file = '<FILEPATH>';

// create client
$client = new Client([
    'base_uri' => 'https://api.rev.ai/speechtotext/v1/',
    'headers' => ['Authorization' => "Bearer $token"]
]);

// send POST request and get response body
$response = $client->request(
    'POST',
    'jobs',
    ['multipart' => [['name' => 'media', 'contents' => fopen($file, 'r')]]]
)
    ->getBody()
    ->getContents();

// decode response JSON and print
print_r(json_decode($response));
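In the example above, fopen() returns false and emits a warning if the path is wrong, and that false value would then be handed to Guzzle. One way to fail earlier with a clearer message is to check the file before building the request options. This is a sketch under stated assumptions: buildMediaUploadOptions() is a hypothetical helper, and the temporary file only stands in for real audio.

```php
<?php
// Hypothetical helper: builds the 'multipart' request options for a local file.
function buildMediaUploadOptions(string $path): array
{
    if (!is_file($path) || !is_readable($path)) {
        throw new InvalidArgumentException("Cannot read media file: $path");
    }

    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Cannot open media file: $path");
    }

    return [
        'multipart' => [
            [
                'name'     => 'media',         // form field name used in the example above
                'contents' => $handle,         // stream resource read by Guzzle when sending
                'filename' => basename($path), // optional Guzzle multipart key
            ],
        ],
    ];
}

// demonstrate with a temporary file standing in for real audio
$tmp = tempnam(sys_get_temp_dir(), 'revai');
file_put_contents($tmp, 'not real audio');

$options = buildMediaUploadOptions($tmp);
echo $options['multipart'][0]['name'] . PHP_EOL; // prints "media"

fclose($options['multipart'][0]['contents']);
unlink($tmp);
```

The returned array has the same shape as the third argument of the request() call shown above, so it could be passed in its place.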
Step 3: Check transcription status
To check the status of the transcription job, you must submit an HTTP GET request to the API endpoint at https://api.rev.ai/speechtotext/v1/jobs/<ID>, where <ID> is a placeholder for the job identifier.
The following example demonstrates how to check the status of an asynchronous transcription job.
To use this example, replace the <ID> placeholder with the job identifier and the <REVAI_ACCESS_TOKEN> placeholder with your Rev AI account's access token.
<?php
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;

$token = '<REVAI_ACCESS_TOKEN>';
$jobId = '<ID>';

// create client
$client = new Client([
    'base_uri' => 'https://api.rev.ai/speechtotext/v1/',
    'headers' => ['Authorization' => "Bearer $token"]
]);

// send GET request and get response body
$response = $client->request(
    'GET',
    "jobs/$jobId"
)
    ->getBody()
    ->getContents();

// decode response JSON and print
print_r(json_decode($response));
Here is an example of the script output after the job has completed:
stdClass Object
(
    [id] => sTfRgVlLCYkt
    [created_on] => 2022-04-06T13:35:40.6Z
    [completed_on] => 2022-04-06T13:36:16.275Z
    [name] => FTC_Sample_1.mp3
    [media_url] => https://www.rev.ai/FTC_Sample_1.mp3
    [status] => transcribed
    [duration_seconds] => 107
    [type] => async
    [language] => en
)
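The status field is what decides the next step. As a rough sketch, that decision could be isolated in a small function. The name nextActionForJob() and its return values are hypothetical. This tutorial only shows the in_progress and transcribed statuses, so the sketch treats every other value as a reason to stop, which is an assumption rather than documented behavior.

```php
<?php
// Hypothetical helper: decides what to do next based on a decoded job object.
function nextActionForJob(stdClass $job): string
{
    $status = $job->status ?? 'unknown';

    if ($status === 'in_progress') {
        return 'wait';             // not done yet: check again later
    }
    if ($status === 'transcribed') {
        return 'fetch_transcript'; // safe to request jobs/<ID>/transcript
    }

    // assumption: any other value means no transcript will be produced
    return 'stop';
}

$job = json_decode('{"id":"sTfRgVlLCYkt","status":"transcribed","duration_seconds":107}');
echo nextActionForJob($job) . PHP_EOL; // prints "fetch_transcript"
```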
Step 4: Retrieve the transcript
Once the job's status changes to transcribed, you can retrieve the results by submitting an HTTP GET request to the API endpoint at https://api.rev.ai/speechtotext/v1/jobs/<ID>/transcript, where <ID> is a placeholder for the job identifier.
The following example demonstrates how to retrieve the results of an asynchronous transcription job.
To use this example, replace the <ID> placeholder with the job identifier and the <REVAI_ACCESS_TOKEN> placeholder with your Rev AI account's access token.
<?php
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;

$token = '<REVAI_ACCESS_TOKEN>';
$jobId = '<ID>';

// create client
$client = new Client([
    'base_uri' => 'https://api.rev.ai/speechtotext/v1/',
    'headers' => ['Authorization' => "Bearer $token"]
]);

// send GET request and get response body
$response = $client->request(
    'GET',
    "jobs/$jobId/transcript",
    ['headers' => ['Accept' => 'application/vnd.rev.transcript.v1.0+json']]
)
    ->getBody()
    ->getContents();

// decode response JSON and print
print_r(json_decode($response));
Here is an example of the transcript returned from a successful job, represented as a PHP object:
stdClass Object
(
    [monologues] => Array
        (
            [0] => stdClass Object
                (
                    [speaker] => 0
                    [elements] => Array
                        (
                            [0] => stdClass Object
                                (
                                    [type] => text
                                    [value] => Hi
                                    [ts] => 0.27
                                    [end_ts] => 0.48
                                    [confidence] => 1
                                )
                            [1] => stdClass Object
                                (
                                    [type] => punct
                                    [value] => ,
                                )
                            [2] => stdClass Object
                                (
                                    [type] => punct
                                    [value] =>
                                )
                            ...
                        )
                )
        )
)
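The nested structure above is convenient for timing and confidence data, but many applications only need readable text. A minimal sketch of flattening it follows; transcriptToText() is a hypothetical helper, and the sample JSON contains only the three elements visible in the output above.

```php
<?php
// Hypothetical helper: joins the elements of every monologue into plain text.
function transcriptToText(stdClass $transcript): string
{
    $lines = [];
    foreach ($transcript->monologues ?? [] as $monologue) {
        $text = '';
        foreach ($monologue->elements ?? [] as $element) {
            // 'text' and 'punct' elements both carry their characters in 'value'
            $text .= $element->value ?? '';
        }
        $speaker = $monologue->speaker ?? '?';
        $lines[] = "Speaker $speaker: " . trim($text);
    }

    return implode(PHP_EOL, $lines);
}

// sample data limited to the elements shown in the output above
$json = '{"monologues":[{"speaker":0,"elements":['
    . '{"type":"text","value":"Hi","ts":0.27,"end_ts":0.48,"confidence":1},'
    . '{"type":"punct","value":","},'
    . '{"type":"punct","value":" "}'
    . ']}]}';

echo transcriptToText(json_decode($json)) . PHP_EOL; // prints "Speaker 0: Hi,"
```

Concatenating the value fields in order is enough here because punctuation and spacing arrive as their own elements.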
Step 5: Create and test a simple application
Using the code samples shown previously, it's possible to create a custom Rev AI API client class encapsulating these functions:
<?php
require __DIR__ . '/vendor/autoload.php';

use GuzzleHttp\Client;

// wraps a Guzzle client (composition) rather than extending it
class RevAiApiClient
{
    /**
     * @var Client GuzzleHttp client object
     */
    private $client;

    /**
     * Construct API client with default base path
     * and authorization
     *
     * @param string $token Rev AI access token
     */
    public function __construct($token)
    {
        if (empty($token)) {
            throw new Exception('Access token missing');
        }
        $this->client = new Client([
            'base_uri' => 'https://api.rev.ai/speechtotext/v1/',
            'headers' => ['Authorization' => "Bearer $token"],
        ]);
    }

    /**
     * Submit a remote audio file for transcription
     *
     * @param string $fileUrl URL to remote file
     *
     * @return stdClass Rev AI Jobs API endpoint response object
     */
    public function submitAsychronousJobRemote($fileUrl)
    {
        // same request body as the remote file example in Step 2
        return json_decode(
            $this->client->request(
                'POST',
                'jobs',
                ['json' => ['source_config' => ['url' => $fileUrl]]]
            )->getBody()->getContents()
        );
    }

    /**
     * Submit a local audio file for transcription
     *
     * @param string $file Path to local file
     *
     * @return stdClass Rev AI Jobs API endpoint response object
     */
    public function submitAsychronousJobLocal($file)
    {
        return json_decode(
            $this->client->request(
                'POST',
                'jobs',
                ['multipart' => [['name' => 'media', 'contents' => fopen($file, 'r')]]]
            )->getBody()->getContents()
        );
    }

    /**
     * Get transcription job status
     *
     * @param string $id Transcription job ID
     *
     * @return stdClass Rev AI Jobs API endpoint response object
     */
    public function getAsychronousJobStatus($id)
    {
        return json_decode(
            $this->client->request(
                'GET',
                "jobs/$id"
            )->getBody()->getContents()
        );
    }

    /**
     * Get transcription job result
     *
     * @param string $id Transcription job ID
     *
     * @return stdClass Rev AI Transcript API endpoint response object
     */
    public function getAsychronousJobResult($id)
    {
        return json_decode(
            $this->client->request(
                'GET',
                "jobs/$id/transcript",
                ['headers' => ['Accept' => 'application/vnd.rev.transcript.v1.0+json']]
            )->getBody()->getContents()
        );
    }
}
Save the above client as RevAiApiClient.php.
You can now use this client in a simple application that accepts a local audio file and returns its transcript, as shown below:
<?php
require __DIR__ . '/RevAiApiClient.php';

$token = '<REVAI_ACCESS_TOKEN>';
$file = '<FILEPATH>';

// initialize the Rev AI API client
$client = new RevAiApiClient($token);

// submit a local file for transcription
$jobSubmissionResponse = $client->submitAsychronousJobLocal($file);

// get the job ID and status
$jobId = $jobSubmissionResponse->id;
$jobStatus = $jobSubmissionResponse->status;
echo "Job submitted with id: $jobId" . PHP_EOL;

// check the job status periodically;
// wait before each check so the script does not sleep after the job is done
while ($jobStatus === 'in_progress') {
    sleep(30);
    $jobStatus = $client->getAsychronousJobStatus($jobId)->status;
    echo "Job status: $jobStatus" . PHP_EOL;
}

// retrieve and print the transcript
if ($jobStatus === 'transcribed') {
    print_r($client->getAsychronousJobResult($jobId));
}
This example application begins by initializing an instance of the RevAiApiClient class defined previously, passing the Rev AI access token to the constructor. It then submits a local file for transcription using the submitAsychronousJobLocal() method and uses the getAsychronousJobStatus() method to poll the API every 30 seconds for the status of the job. Once the status is no longer in_progress, polling stops; if the final status is transcribed, the application calls the getAsychronousJobResult() method to retrieve the transcript and prints it to the console.
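A loop like the one in this application runs for as long as the job stays in progress. As a sketch of one way to bound it, the polling logic could be moved into a function that gives up after a fixed number of attempts. The function name waitForJob() and its parameters are hypothetical, and the fake status source below stands in for real calls to getAsychronousJobStatus().

```php
<?php
// Hypothetical helper: polls until the status leaves 'in_progress'
// or the attempts run out. $fetchStatus is any callable that returns
// the current status string.
function waitForJob(callable $fetchStatus, int $maxAttempts, int $delaySeconds): string
{
    for ($attempt = 1; $attempt <= $maxAttempts; $attempt++) {
        $status = $fetchStatus();
        if ($status !== 'in_progress') {
            return $status;       // finished, successfully or not
        }
        if ($attempt < $maxAttempts) {
            sleep($delaySeconds); // pause only when another attempt follows
        }
    }

    return 'in_progress';         // gave up while the job was still running
}

// stand-in for the API: reports 'in_progress' twice, then 'transcribed'
$responses = ['in_progress', 'in_progress', 'transcribed'];
$fakeFetch = function () use (&$responses) {
    return array_shift($responses);
};

echo waitForJob($fakeFetch, 5, 0) . PHP_EOL; // prints "transcribed"
```

With the client class from this step, the callable could be a closure that returns $client->getAsychronousJobStatus($jobId)->status.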
Here is an example of the output generated by the example application:
Job submitted with id: RWviMy7nISeS
Job status: in_progress
Job status: transcribed
stdClass Object
(
    [monologues] => Array
        (
            [0] => stdClass Object
                (
                    [speaker] => 0
                    [elements] => Array
                        (
                            [0] => stdClass Object
                                (
                                    [type] => text
                                    [value] => 1, 2, 3
                                    [ts] => 0.03
                                    [end_ts] => 2.31
                                    [confidence] => 0.95
                                )
                            [1] => stdClass Object
                                (
                                    [type] => punct
                                    [value] => .
                                )
                        )
                )
        )
)
Warning: The example above polls the API repeatedly to check the status of the transcription job. This is shown for illustrative purposes only and is strongly discouraged in production. For production scenarios, use webhooks to receive a notification asynchronously once the job completes.
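To give a feel for the webhook approach, here is a rough sketch of the parsing a receiving script might do. It rests on an assumption that this tutorial does not confirm: that the notification body is JSON with the job details under a top-level job key. Check the Rev AI webhook documentation before relying on that shape. The function name parseJobNotification() is hypothetical.

```php
<?php
// Hypothetical webhook handler logic.
// ASSUMPTION: the notification body is JSON of the form {"job": {"id": ..., "status": ...}}.
function parseJobNotification(string $body): ?array
{
    $payload = json_decode($body, true);

    if (!is_array($payload) || !isset($payload['job']['id'], $payload['job']['status'])) {
        return null; // not a notification this sketch understands
    }

    return [
        'id'     => $payload['job']['id'],
        'status' => $payload['job']['status'],
    ];
}

// In a real endpoint the body would come from file_get_contents('php://input').
$sampleBody = '{"job":{"id":"RWviMy7nISeS","status":"transcribed"}}';
$job = parseJobNotification($sampleBody);

if ($job !== null && $job['status'] === 'transcribed') {
    echo "Job {$job['id']} is ready" . PHP_EOL; // prints "Job RWviMy7nISeS is ready"
}
```

Once the handler knows the job is transcribed, it can fetch the transcript with the same GET request used in Step 4.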
Next steps
Learn more about the topics discussed in this tutorial by visiting the following links:
- Documentation: Asynchronous Speech-To-Text API job submission
- Code samples: Asynchronous Speech-To-Text API
- Documentation: Asynchronous Speech-To-Text API best practices
- Tutorial: Get Started with Rev AI API Webhooks