Voice Transcription Pipeline in PHP With Vonage

Published June 19, 2020 by Adam Culp

In this post, you’ll create a voice transcription pipeline. The objective is to use Amazon Transcribe to transcribe an entire recorded conversation, split into channels, and then insert the results into an RDS MySQL database instance. This takes two AWS Lambda functions: an HTTP application that retrieves the MP3 recording and submits it to Amazon Transcribe, and a callback function that stores the results in the MySQL database once the transcription completes.

Prerequisites

Vonage API Account

To complete this tutorial, you will need a Vonage API account. If you don’t have one already, you can sign up today and start building with free credit. Once you have an account, you can find your API Key and API Secret at the top of the Vonage API Dashboard.

Setup Instructions

Clone the nexmo-community/voice-channels-aws-transcribe-php repo from GitHub, and navigate into the newly created directory to proceed.

Use Composer to Install Dependencies

This example requires the use of Composer to install dependencies and set up the autoloader.

Assuming you have Composer installed globally, run composer install from the project directory.

AWS Setup

You will need to create AWS credentials as described in the Serverless documentation.

Also, create a new AWS S3 bucket and make note of the URL for later use.

Link the App to Vonage

Create a Vonage Application Using the Command Line Interface

Install the Nexmo CLI by following its installation instructions. You’ll use its nexmo app:create command to create a new Vonage Voice application, which also sets up an answer_url and event_url for the app running in AWS Lambda.

NOTE: You’ll be using <your_hostname> as a placeholder in this command. Later, after you know the URLs provided by deploying to AWS Lambda, you’ll need to update these pieces of the URLs via the Vonage API Dashboard settings for your application.

IMPORTANT: This will return an application ID and a private key. The application ID will be needed for the nexmo link:app command as well as the .env file later. By default, a file named private.key will be created in the directory where you run the command.

Obtain a New Virtual Number

If you don’t have a number already in place, obtain one from Vonage. This can also be done from the CLI with the nexmo number:buy command.

Link the Virtual Number to the Application

Finally, link the new number to the created application by running the nexmo link:app command with the number and the application ID.

Update Environment

Rename the provided .env.dist file to .env and update the values as needed.

NOTE: All placeholders noted by <> need to be updated.
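Before deploying, it can help to confirm that no placeholder was missed. The following is a minimal sketch, not part of the repo: findUnfilledPlaceholders() is a hypothetical helper, and the keys in the sample are made up for illustration (the real keys are whatever .env.dist ships with).

```php
<?php
// Hypothetical helper: lists the keys whose values still contain a <placeholder>.
function findUnfilledPlaceholders(string $envContents): array
{
    $unfilled = [];
    foreach (preg_split('/\R/', $envContents) as $line) {
        $line = trim($line);
        // Skip blank lines and comments.
        if ($line === '' || $line[0] === '#') {
            continue;
        }
        // Split KEY=VALUE on the first "=" only, so values may contain "=".
        $parts = explode('=', $line, 2);
        if (count($parts) !== 2) {
            continue;
        }
        [$key, $value] = $parts;
        // Strip surrounding whitespace and quotes before inspecting the value.
        $value = trim($value, " \t\"'");
        // Anything of the form <...> is treated as an unreplaced placeholder.
        if (preg_match('/<[^<>]*>/', $value) === 1) {
            $unfilled[] = trim($key);
        }
    }
    return $unfilled;
}

// Sample input with made-up keys, for illustration only.
$sample = "APP_ID=<application_id>\nLANG_CODE=en-US\n# a comment\nBUCKET=\"<bucket_name>\"\n";
print_r(findUnfilledPlaceholders($sample));
```

For a real check, pass file_get_contents('.env') instead of the sample string; an empty result means every placeholder has been replaced.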

Serverless Plugin

Install the serverless-dotenv-plugin from npm in the project directory, for example with npm install --save-dev serverless-dotenv-plugin.

Deploy to Lambda

With all of the above updated, you can now use Serverless to deploy the app to AWS Lambda by running serverless deploy.

NOTE: Make sure to visit the Vonage API Dashboard and update the answer and event URLs for your application with the ones provided by the deployment.

Migrate Transcription to a Database

If you only need the transcription, you’re done. To automatically migrate the transcription results to the database, however, you’ll need to deploy a second function. Clone the nexmo-community/aws-voice-transcription-rds-callback-php repo to another location and follow the instructions in its README to get it up and running. The instructions are identical to those followed above for the first function.

Create the Trigger

After adding the second function, navigate to CloudWatch in your AWS Console, select Events, and then select Get Started to create a new event rule.

Set the Rule as follows:

  • Event Pattern
  • Build event pattern to match events by service
  • Service Name = Transcribe
  • Event Type = Transcribe Job State Change
  • Specific status(es) = COMPLETED
  • As the Target, select the second Lambda function created above
  • Scroll down and click Configure Details.
  • Give the rule a meaningful name and description, and enable it.
  • Click Create rule to complete it.
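The console choices above correspond to an event pattern like the one sketched below. The pattern fields follow the event format Amazon Transcribe emits for job state changes; eventMatchesPattern() is a hypothetical helper that only approximates the rule's matching so the pattern can be tried locally, and the sample event is abbreviated.

```php
<?php
// Event pattern equivalent to the settings chosen in the console.
$eventPattern = [
    'source' => ['aws.transcribe'],
    'detail-type' => ['Transcribe Job State Change'],
    'detail' => [
        'TranscriptionJobStatus' => ['COMPLETED'],
    ],
];

// Hypothetical helper: a simplified local approximation of rule matching.
function eventMatchesPattern(array $event, array $pattern): bool
{
    foreach ($pattern as $key => $expected) {
        if (!array_key_exists($key, $event)) {
            return false;
        }
        if (isset($expected[0])) {
            // A list of allowed values: the event's field must equal one of them.
            if (!in_array($event[$key], $expected, true)) {
                return false;
            }
        } elseif (!is_array($event[$key]) || !eventMatchesPattern($event[$key], $expected)) {
            // A nested object: every nested field must match as well.
            return false;
        }
    }
    return true;
}

// Abbreviated sample event; the job name is made up.
$sampleEvent = [
    'source' => 'aws.transcribe',
    'detail-type' => 'Transcribe Job State Change',
    'detail' => [
        'TranscriptionJobName' => 'example-job',
        'TranscriptionJobStatus' => 'COMPLETED',
    ],
];

echo json_encode($eventPattern, JSON_PRETTY_PRINT), PHP_EOL;
var_dump(eventMatchesPattern($sampleEvent, $eventPattern));
```

Limiting the rule to COMPLETED means the callback function is not invoked for jobs that are still in progress or that failed.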

Now you’re ready to test.

Usage

With the deployment completed, you should be able to place a call to your virtual number from any phone. You will hear a message about being connected, and then the recipient number will be called.
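The call flow described above is driven by the NCCO returned from the answer_url. The following is a minimal sketch of what such an NCCO might look like, not the repo's actual code: buildAnswerNcco() is a hypothetical helper, the phone numbers and the recording event URL are made up, and the spoken text is only an example. Recording with split set to conversation is what places the two parties on separate channels.

```php
<?php
// Hypothetical helper: builds an NCCO that records the call in split
// (per-party) channels, plays a short message, then connects the recipient.
function buildAnswerNcco(string $fromNumber, string $recipientNumber, string $recordingEventUrl): array
{
    return [
        [
            'action' => 'record',
            'eventUrl' => [$recordingEventUrl], // Vonage posts the recording URL here
            'split' => 'conversation',          // one channel per side of the call
            'channels' => 2,
            'format' => 'mp3',
        ],
        [
            'action' => 'talk',
            'text' => 'Please wait while we connect you.',
        ],
        [
            'action' => 'connect',
            'from' => $fromNumber,
            'endpoint' => [
                ['type' => 'phone', 'number' => $recipientNumber],
            ],
        ],
    ];
}

// Made-up numbers and URL, for illustration only.
$ncco = buildAnswerNcco('15550100', '15550199', 'https://example.com/webhooks/recording');
echo json_encode($ncco, JSON_PRETTY_PRINT | JSON_UNESCAPED_SLASHES), PHP_EOL;
```

The answer webhook would return this JSON with a Content-Type of application/json.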

After you hang up, the MP3 file will be retrieved from Vonage and uploaded to AWS S3. A transcription job will then be started, which you can monitor in the AWS Console.
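To illustrate the step that starts the transcription job, here is a sketch of the request parameters for the AWS SDK for PHP. buildTranscriptionJobParams() is a hypothetical helper, and the job name and S3 URI are placeholders; the deployed function may build its request differently. Enabling ChannelIdentification asks Amazon Transcribe to transcribe each audio channel separately.

```php
<?php
// Hypothetical helper: builds the parameters for a channel-aware transcription job.
function buildTranscriptionJobParams(string $jobName, string $mediaUri, string $languageCode = 'en-US'): array
{
    return [
        'TranscriptionJobName' => $jobName,       // must be unique within the account
        'LanguageCode' => $languageCode,
        'MediaFormat' => 'mp3',
        'Media' => ['MediaFileUri' => $mediaUri], // S3 location of the recording
        'Settings' => ['ChannelIdentification' => true],
    ];
}

// Placeholder job name and bucket path, for illustration only.
$params = buildTranscriptionJobParams('example-job', 's3://<your_bucket>/recordings/example.mp3');
print_r($params);

// With the AWS SDK for PHP installed, the job could then be started with:
// $client = new Aws\TranscribeService\TranscribeServiceClient([
//     'region' => '<your_region>',
//     'version' => 'latest',
// ]);
// $client->startTranscriptionJob($params);
```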

Upon completion of the transcription, CloudWatch will trigger the second Lambda function, which parses the transcription and inserts it into the database.
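The parsing step can be sketched as follows. This is an illustration under assumptions rather than the callback repo's code: transcriptsByChannel() and storeTranscripts() are hypothetical helpers, the table and column names are invented, and the sample JSON is a heavily trimmed version of the channel-identification output that Amazon Transcribe produces.

```php
<?php
// Hypothetical helper: joins each channel's items into one string per channel.
function transcriptsByChannel(array $transcription): array
{
    $byChannel = [];
    foreach ($transcription['results']['channel_labels']['channels'] ?? [] as $channel) {
        $text = '';
        foreach ($channel['items'] ?? [] as $item) {
            $content = $item['alternatives'][0]['content'] ?? '';
            if ($content === '') {
                continue;
            }
            // Punctuation attaches to the previous word; words are space-separated.
            if (($item['type'] ?? '') === 'punctuation' || $text === '') {
                $text .= $content;
            } else {
                $text .= ' ' . $content;
            }
        }
        $byChannel[$channel['channel_label']] = $text;
    }
    return $byChannel;
}

// Hypothetical helper: the table and column names are assumptions.
function storeTranscripts(PDO $pdo, string $jobName, array $byChannel): void
{
    $stmt = $pdo->prepare(
        'INSERT INTO transcriptions (job_name, channel, content) VALUES (:job, :channel, :content)'
    );
    foreach ($byChannel as $channel => $content) {
        $stmt->execute([':job' => $jobName, ':channel' => $channel, ':content' => $content]);
    }
}

// Trimmed sample of a two-channel transcription result.
$json = '{"jobName":"example-job","results":{"channel_labels":{"channels":[
  {"channel_label":"ch_0","items":[
    {"type":"pronunciation","alternatives":[{"content":"Hello"}]},
    {"type":"punctuation","alternatives":[{"content":","}]},
    {"type":"pronunciation","alternatives":[{"content":"world"}]},
    {"type":"punctuation","alternatives":[{"content":"."}]}]},
  {"channel_label":"ch_1","items":[
    {"type":"pronunciation","alternatives":[{"content":"Hi"}]},
    {"type":"pronunciation","alternatives":[{"content":"there"}]}]}]}}}';

$transcription = json_decode($json, true);
$byChannel = transcriptsByChannel($transcription);
print_r($byChannel);

// Against a real database, the rows could then be stored with something like:
// $pdo = new PDO('mysql:host=<rds_endpoint>;dbname=<database>', '<user>', '<password>');
// storeTranscripts($pdo, $transcription['jobName'], $byChannel);
```

Using a prepared statement keeps the transcript text out of the SQL string itself, which matters because the content is arbitrary speech.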

Next Steps

If you have any questions or run into trouble, you can reach out to @VonageDev on Twitter or ask in the Vonage Developer Community Slack team. Good luck.
