Integrate Rev AI's Topic Extraction API with a Node.js Application

By Vikram Vaswani, Developer Advocate - Jun 13, 2022

Introduction

Topic extraction attempts to detect the topics or subjects of a document. It is useful in a number of scenarios, including:

  • Auto-generated agendas for meetings and phone calls
  • Automated classification or keyword indexing for digital media libraries
  • Automated tagging for Customer Service (CS) complaints or support tickets

Rev AI offers a Topic Extraction API that identifies important keywords and corresponding topics in transcribed speech. For application developers, it provides a fast and accurate way to retrieve and rank the core subjects in a transcribed conversation and then take further actions based on this information.

This tutorial explains how to integrate the Rev AI Topic Extraction API into your Node.js application.

Assumptions

This tutorial assumes that:

Step 1: Install Axios

The Topic Extraction API is a REST API and, as such, you will need an HTTP client to interact with it. This tutorial uses Axios, a popular Promise-based HTTP client for Node.js.

Begin by installing Axios into your application directory:

npm install axios

Within your application code, initialize Axios as below:

const axios = require('axios');
const token = '<REVAI_ACCESS_TOKEN>';

// create a client
const http = axios.create({
  baseURL: 'https://api.rev.ai/topic_extraction/v1/',
  headers: {
    'Authorization': `Bearer ${token}`,
    'Content-Type': 'application/json'
  }
});

Here, the Axios HTTP client is initialized with the base endpoint for the Topic Extraction API, which is https://api.rev.ai/topic_extraction/v1/.

Every request to the API must be in JSON format and must include an Authorization header containing your API access token. The code shown above also attaches these required headers to the client.
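
Hard-coding the access token is convenient in a tutorial but easy to leak from a real code base. As a minimal sketch, the same client options could be built from a token supplied at runtime. The helper name buildClientConfig() and the environment variable name REVAI_ACCESS_TOKEN are assumptions made for illustration; they are not part of the Rev AI API.

```javascript
// Hypothetical helper: builds the options object for axios.create()
// from an access token instead of hard-coding the token in the source.
const buildClientConfig = (token) => {
  if (!token) {
    throw new Error('Missing Rev AI access token');
  }
  return {
    baseURL: 'https://api.rev.ai/topic_extraction/v1/',
    headers: {
      'Authorization': `Bearer ${token}`,
      'Content-Type': 'application/json'
    }
  };
};

// Usage (assumes the token was exported as REVAI_ACCESS_TOKEN, a name chosen here):
// const http = axios.create(buildClientConfig(process.env.REVAI_ACCESS_TOKEN));
```

Failing fast when the token is missing avoids sending requests that the API would reject anyway.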

Step 2: Submit transcript for topic extraction

To perform topic extraction on a transcript, you must begin by submitting an HTTP POST request containing the transcript content, in either plaintext or JSON, to the API endpoint at https://api.rev.ai/topic_extraction/v1/jobs.

The code listings below perform this operation using the HTTP client initialized in Step 1, for both plaintext and JSON transcripts:

const submitTopicExtractionJobText = async (textData) => {
  return await http.post(`jobs`,
    JSON.stringify({
      text: textData
    }))
    .then(response => response.data)
    .catch(console.error);
};

const submitTopicExtractionJobJson = async (jsonData) => {
  return await http.post(`jobs`,
    JSON.stringify({
      json: jsonData
    }))
    .then(response => response.data)
    .catch(console.error);
};

Here is an example of the value returned by either of the functions shown above:

{
  id: 'W6DvsEjteqwV',
  created_on: '2022-04-13T09:16:07.033Z',
  status: 'in_progress',
  type: 'topic_extraction'
}

The API response contains a job identifier (id field). This job identifier will be required to check the job status and obtain the job result.
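
The two submit functions differ only in the field that carries the transcript (text or json). As a sketch, a single helper could pick the field from the input type. The name buildJobPayload() is hypothetical, and the sketch assumes that plaintext transcripts arrive as strings and JSON transcripts as already-parsed objects.

```javascript
// Hypothetical helper: chooses the request field based on the transcript type.
// Strings are sent as `text`; parsed JSON transcripts are sent as `json`.
const buildJobPayload = (transcript) => {
  if (typeof transcript === 'string') {
    return JSON.stringify({ text: transcript });
  }
  if (transcript !== null && typeof transcript === 'object') {
    return JSON.stringify({ json: transcript });
  }
  throw new TypeError('Transcript must be a string or a JSON object');
};

// Usage with the client from Step 1:
// http.post('jobs', buildJobPayload(transcript)).then(response => response.data);
```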

Step 3: Check job status

Topic extraction jobs usually complete within 10-20 seconds. To check the status of the job, you must submit an HTTP GET request to the API endpoint at https://api.rev.ai/topic_extraction/v1/jobs/<ID>, where <ID> is a placeholder for the job identifier.

The code listing below demonstrates this operation:

const getTopicExtractionJobStatus = async (jobId) => {
  return await http.get(`jobs/${jobId}`)
    .then(response => response.data)
    .catch(console.error);
};

Here is an example of the API response to the previous request after the job has completed:

{
  id: 'W6DvsEjteqwV',
  created_on: '2022-04-13T09:16:07.033Z',
  completed_on: '2022-04-13T09:16:07.17Z',
  word_count: 13,
  status: 'completed',
  type: 'topic_extraction'
}

Step 4: Retrieve topic extraction report

Once the topic extraction job's status changes to completed, you can retrieve the results by submitting an HTTP GET request to the API endpoint at https://api.rev.ai/topic_extraction/v1/jobs/<ID>/result, where <ID> is a placeholder for the job identifier.

The code listing below demonstrates this operation:

const getTopicExtractionJobResult = async (jobId) => {
  return await http.get(`jobs/${jobId}/result`,
    { headers: { 'Accept': 'application/vnd.rev.topic.v1.0+json' } })
    .then(response => response.data)
    .catch(console.error);
};

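If the job status is completed, the return value of the above function is the API's JSON response (parsed by Axios into an object), which contains a sentence-wise topic extraction report. If the job status is not completed, the request fails; because the function passes errors to console.error, it logs the error and returns undefined instead.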

Here is an example of the topic extraction report returned from a completed job:

{
  "topics": [
    {
      "topic_name": "incredible team",
      "score": 0.9,
      "informants": [
        {
          "content": "We have 17 folks and, uh, I think we have an incredible team and I just want to talk about some things that we've done that I think have helped us get there.",
          "ts": 71.4,
          "end_ts": 78.39
        },
        {
          "content": "Um, it's sort of the overall thesis for this one.",
          "ts": 78.96,
          "end_ts": 81.51
        },
        {
          "content": "One thing that's worth keeping in mind is that recruiting is a lot of work.",
          "ts": 81.51,
          "end_ts": 84
        },
        {
          "content": "Some people think that you can raise money and spend a few weeks building your team and then move on to more",
          "ts": 84.21,
          "end_ts": 88.47
        }
      ]
    },
    {
      ...
    }
  ]
}
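
A report like the one above can be condensed into one entry per topic, for example as the basis for an auto-generated agenda. The following is a minimal sketch; the helper name summarizeTopics() is hypothetical, and it assumes that ts and end_ts mark the start and end of each supporting sentence.

```javascript
// Hypothetical helper: reduces a topic extraction report to one summary per topic.
const summarizeTopics = (report) =>
  (report.topics || []).map(({ topic_name, score, informants = [] }) => ({
    topic: topic_name,
    score,
    // earliest start and latest end among the supporting sentences
    // (null when a topic has no informants)
    start: informants.length ? Math.min(...informants.map(i => i.ts)) : null,
    end: informants.length ? Math.max(...informants.map(i => i.end_ts)) : null,
    sentences: informants.length
  }));

// Usage:
// getTopicExtractionJobResult(jobId).then(report => console.table(summarizeTopics(report)));
```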

It’s also possible to filter the result set to return only topics which score above a certain value by adding a threshold query parameter to the request.
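
As a sketch, the threshold could be passed through the Axios params option, which Axios serializes into the query string. The function name below is hypothetical, and the example assumes that the threshold is expressed on the same scale as the score values in the report (for example, 0.8). Unlike the listings above, this version takes the client as an argument and lets errors propagate to the caller.

```javascript
// Sketch: same request as getTopicExtractionJobResult(), plus a threshold query parameter.
// `client` is expected to be the Axios instance created in Step 1.
const getTopicExtractionJobResultAbove = async (client, jobId, threshold) => {
  const response = await client.get(`jobs/${jobId}/result`, {
    headers: { 'Accept': 'application/vnd.rev.topic.v1.0+json' },
    params: { threshold } // sent as ?threshold=<value>
  });
  return response.data;
};

// Usage:
// getTopicExtractionJobResultAbove(http, jobId, 0.8).then(console.log).catch(console.error);
```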

Step 5: Create and test a simple application

Using the code samples shown previously, it's possible to create a simple application that accepts a JSON transcript and returns a list of topics detected in it, as shown below:

const main = async (jsonData) => {
  const job = await submitTopicExtractionJobJson(jsonData);
  console.log(`Job submitted with id: ${job.id}`);

  await new Promise((resolve, reject) => {
    const interval = setInterval(() => {
      getTopicExtractionJobStatus(job.id)
        .then(r => {
          console.log(`Job status: ${r.status}`);
          if (r.status !== 'in_progress') {
            clearInterval(interval);
            resolve(r);
          }
        })
        .catch(e => {
          clearInterval(interval);
          reject(e);
        });
    }, 15000);
  });

  const jobResult = await getTopicExtractionJobResult(job.id);
  console.log(jobResult);
};

// extract topics from example Rev AI JSON transcript
// (plain axios is used here so the API token is not sent to a non-API URL)
axios.get('https://www.rev.ai/FTC_Sample_1_Transcript.json')
  .then(response => main(response.data))
  .catch(console.error);

This example application begins by fetching Rev AI's example JSON transcript and passing it to the main() function as input to be analyzed. The main() function submits this data to the Topic Extraction API using the submitTopicExtractionJobJson() function. It then uses setInterval() to poll the API every 15 seconds for the status of the job. Once the job status is no longer in_progress, it calls getTopicExtractionJobResult() to retrieve the job result and prints it to the console.
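
The setInterval() approach keeps polling for as long as the job reports in_progress. One possible variation, sketched here with hypothetical helper names (sleep, waitForJob) and an arbitrary default limit of 20 checks, is an async loop that gives up after a fixed number of attempts. It also differs from the listing above in that it checks the status immediately instead of waiting for the first interval to elapse.

```javascript
const sleep = (ms) => new Promise(resolve => setTimeout(resolve, ms));

// Hypothetical helper: polls until the job leaves the in_progress state
// or the attempt limit is reached. `getStatus` is any function that takes
// a job id and resolves to the job object, such as getTopicExtractionJobStatus().
const waitForJob = async (getStatus, jobId, { intervalMs = 15000, maxAttempts = 20 } = {}) => {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const job = await getStatus(jobId);
    // an undefined result (e.g. a logged request error) is treated as "not done yet"
    if (job && job.status !== 'in_progress') {
      return job;
    }
    if (attempt < maxAttempts) {
      await sleep(intervalMs);
    }
  }
  throw new Error(`Job ${jobId} still in progress after ${maxAttempts} checks`);
};

// Usage inside an async function:
// const finished = await waitForJob(getTopicExtractionJobStatus, job.id);
```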

Here is an example of the output produced by the example application:

Job submitted with id: xgKIzeODYYba
Job status: completed
{
  topics: [
    { topic_name: 'quick overview', score: 0.9, informants: [Array] },
    { topic_name: 'concert tickets', score: 0.9, informants: [Array] },
    { topic_name: 'dividends', score: 0.9, informants: [Array] },
    { topic_name: 'quick background', score: 0.6, informants: [Array] }
  ]
}

Warning

The code listing above polls the API repeatedly to check the status of the topic extraction job. This is shown for illustrative purposes only and is strongly discouraged in production. In production scenarios, use webhooks to receive a notification asynchronously once the topic extraction job completes.
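
As a rough sketch of the receiving side of a webhook, the following uses Node's built-in http module. The payload shape (a JSON body with a job object carrying id and status) and the function names are assumptions made for illustration; consult the Rev AI webhook documentation for the actual request format and for how to register a callback URL when submitting a job.

```javascript
const nodeHttp = require('http');

// Assumed payload shape: { "job": { "id": "...", "status": "..." } }
const parseWebhookBody = (rawBody) => {
  const payload = JSON.parse(rawBody);
  if (!payload || !payload.job || !payload.job.id) {
    throw new Error('Unexpected webhook payload');
  }
  return { jobId: payload.job.id, status: payload.job.status };
};

// Creates (but does not start) a server that calls onJob() for each notification.
const createWebhookServer = (onJob) =>
  nodeHttp.createServer((req, res) => {
    if (req.method !== 'POST') {
      res.statusCode = 405;
      res.end();
      return;
    }
    let body = '';
    req.on('data', chunk => { body += chunk; });
    req.on('end', () => {
      try {
        onJob(parseWebhookBody(body));
        res.statusCode = 200;
      } catch (err) {
        // malformed JSON or an unexpected payload
        res.statusCode = 400;
      }
      res.end();
    });
  });

// Usage (port 3000 is arbitrary):
// createWebhookServer(({ jobId, status }) => {
//   if (status === 'completed') getTopicExtractionJobResult(jobId).then(console.log);
// }).listen(3000);
```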

Next steps

Learn more about the topics discussed in this tutorial by visiting the following links: