Play an Audio File into a Voice Call with PHP

Published April 12, 2019 by Michael Heap

In addition to making text-to-speech calls, the Nexmo Voice API allows you to play prerecorded audio files into a call.

This can be used for good (to provide a more human-sounding prompt when building an IVR) and for evil (playing Never Gonna Give You Up).

In this post, we’ll be focusing on the good by building an application that welcomes the caller using the stream action in an NCCO and updates them about their position in the queue at a regular interval using the REST API.

All of the code for this post is available on GitHub.

Prerequisites

You’ll need PHP installed before working through this post. I’m running PHP 7.3, but the code here should work on PHP 5.6 and above. You’ll also need Composer available to install the Nexmo PHP client.

You’ll also need a Nexmo Account and the Nexmo CLI installed. We’ll be using the CLI to configure our Nexmo account and purchase a phone number.

Play an Audio File Into an Incoming Call

The first thing we need to do is install our dependencies and bootstrap a project. We’re using the Slim framework to handle the incoming request, so let’s install it now with Composer:

Nexmo will make a GET request to your application when an incoming call is received. Let’s create a new Slim application and register a handler that responds with an empty JSON array to any request made to /webhooks/answer. This is the path that we’ll provide Nexmo with when we configure our Nexmo application later in this post.

Create a file named index.php with the following contents:

This application will handle the incoming request and respond to Nexmo with an empty array, which will end the incoming call. We need to tell Nexmo to stream an audio file into the call by returning a Nexmo Call Control Object (NCCO) that contains a stream action. Replace $ncco in your code with the following:
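To make the shape of that response concrete, here is a minimal sketch of an NCCO with a single stream action, written as a plain PHP array so that it runs without Slim. The helper name buildWelcomeNcco and the example.com audio URL are placeholders of mine, not values from this post; swap in a publicly reachable audio file of your own.

```php
<?php
// Hypothetical helper: returns an NCCO containing a single stream action.
function buildWelcomeNcco($audioUrl)
{
    return [
        [
            'action'    => 'stream',
            // streamUrl is an array holding the URL of the audio file to play
            'streamUrl' => [$audioUrl],
        ],
    ];
}

// Placeholder URL (assumption): replace with your own audio file.
$ncco = buildWelcomeNcco('https://example.com/audio/welcome.mp3');

echo json_encode($ncco, JSON_UNESCAPED_SLASHES), PHP_EOL;
// prints [{"action":"stream","streamUrl":["https://example.com/audio/welcome.mp3"]}]
```

The array held in $ncco here is what the /webhooks/answer handler would return as JSON in place of the empty array.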

You can test your application by running php -t . -S localhost:8000 and visiting http://localhost:8000/webhooks/answer in your browser. You should see some JSON returned.

Exposing Your Application with Ngrok

Now that we have an application, it’s time to make it accessible from the internet so that Nexmo can make a request to it. To achieve this, we’ll be using ngrok. Run ngrok http 8000 and make a note of the URL generated (it’ll look something like http://abc123.ngrok.io). We’ll need this URL in the next step when we configure our Nexmo application.

Configure Your Nexmo Account

So far, we’ve built an application and exposed it to the internet, but we haven’t told Nexmo where our application lives. To do this, we need to create a Nexmo application and set the answer_url and event_url. Run the following in the same directory as index.php, replacing example.com with your ngrok URL:

This will create a file named private.key and return an application ID in the terminal. The private.key file holds the credentials used to authenticate requests to the Nexmo API (which we’ll use later), and the application ID is needed for both authentication and configuration.

Now that we have an application, we need a way for a user to connect to it. This is done by purchasing a phone number and linking it to the application. Purchase a number by running nexmo number:buy --country_code US --confirm (you can change US to any country) and make a note of the number purchased. Finally, link this number to your application by running nexmo link:app <number> <application_id>. Now, whenever someone calls the number you purchased, Nexmo will make a request to /webhooks/answer in your application.

Test Your Application

At this point your application will work! Call the number that you purchased earlier and it should stream the audio file in streamUrl to you before ending the call.

Placing a Call On Hold

Now that we can handle an incoming call, it’s time to finish building our application. After playing the introduction message, we want to place the user on hold and periodically update them on their position in the queue.

To place the user on hold, we can add them to a conference call with only them in it using the conversation action. This will keep the line open without connecting them to another user. Conference names must be unique within an application, so let’s use the caller’s phone number as the name.

We’ll also need to capture the ID of the call so that we can call the Nexmo API to play audio back into the call. This will appear in the terminal window in which you ran php -t . -S localhost:8000, in the following format:

Replace your /webhooks/answer endpoint with the following:
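As a rough guide to what that endpoint needs to produce, here is a minimal, framework-free sketch: a stream action followed by a conversation action named after the caller, plus a log line for the call ID. It assumes the answer webhook's query string carries the caller's number as from and the call ID as uuid. The function name buildHoldNcco, the sample values, the audio URL and the log wording are all illustrative choices of mine.

```php
<?php
// Hypothetical helper: $params stands in for the query-string parameters
// of the incoming GET request to /webhooks/answer.
function buildHoldNcco(array $params, $welcomeUrl)
{
    // Assumption: the caller's number arrives in the "from" parameter.
    $caller = isset($params['from']) ? $params['from'] : 'unknown';

    return [
        // Play the welcome message first
        ['action' => 'stream', 'streamUrl' => [$welcomeUrl]],
        // Then hold the caller in a conference named after their number
        ['action' => 'conversation', 'name' => $caller],
    ];
}

// Sample values for illustration only.
$params = [
    'from' => '14155550100',
    'uuid' => 'aaaaaaaa-bbbb-cccc-dddd-0123456789ab',
];

// Assumption: the call ID arrives in the "uuid" parameter. error_log()
// writes to the terminal when using PHP's built-in web server.
error_log('Call ID: ' . $params['uuid']);

$ncco = buildHoldNcco($params, 'https://example.com/audio/welcome.mp3');
echo json_encode($ncco, JSON_UNESCAPED_SLASHES), PHP_EOL;
```

In a Slim handler, the same array could be built from the request's query parameters and returned as the JSON response.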

If you call your Nexmo number again now, you’ll hear the introduction message followed by silence, and see the number of the phone you’re calling from logged in the terminal.

Stream a File Into an Active Nexmo Call

The last thing to do is update the user on their position in the queue. To do this, we’ll make a request to the Nexmo API’s /stream method. All requests to the API must be authenticated, so let’s install and configure the Nexmo PHP client library.

Next, add the following to index.php just before $app = new \Slim\App;, replacing NEXMO_APPLICATION_ID with the application ID you made a note of earlier:

There are lots of different ways to play the audio update into a call, but to keep it easy for this post let’s add another endpoint to our application that we can use to trigger it manually. We’ll create a GET endpoint for easy testing (though as it has a side effect it should be a POST endpoint in production).

This endpoint has a few responsibilities:

  • Collect the call ID and current position from the URL
  • Check that the position provided is valid (in this app it must be 1, 2 or 3)
  • Make a request to the Nexmo API with the URL of the audio to play into the call

Let’s give it a go! Add the following underneath your /webhooks/answer endpoint:
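The core of those three responsibilities can be sketched without Slim or the client library. The helper below validates the position and describes the request to send, rather than sending it. It assumes the Voice API accepts a PUT to /v1/calls/{uuid}/stream with a stream_url array in the body (worth checking against the API reference); authentication is left out, and the function name and position-N.mp3 file names are hypothetical.

```php
<?php
// Hypothetical helper: validates the input and returns a description of
// the request, so the logic can be checked without calling the API.
function buildStreamRequest($callId, $position, $audioBaseUrl)
{
    // Accept only the integers 1, 2 or 3 (filter_var returns false otherwise)
    $valid = filter_var($position, FILTER_VALIDATE_INT, [
        'options' => ['min_range' => 1, 'max_range' => 3],
    ]);
    if ($valid === false) {
        throw new InvalidArgumentException('Position must be 1, 2 or 3');
    }

    return [
        'method' => 'PUT',
        // Assumption: stream endpoint of the Voice API for an active call
        'url'    => 'https://api.nexmo.com/v1/calls/' . rawurlencode($callId) . '/stream',
        // Assumption: audio files are named position-1.mp3, position-2.mp3, ...
        'body'   => ['stream_url' => [$audioBaseUrl . '/position-' . $valid . '.mp3']],
    ];
}

// Sample call ID and base URL for illustration only.
$request = buildStreamRequest('abc-123', '3', 'https://example.com/audio');
echo json_encode($request, JSON_UNESCAPED_SLASHES | JSON_PRETTY_PRINT), PHP_EOL;
```

In the real endpoint, the call ID and position would come from the route's URL segments, and the request itself would be sent through the authenticated client rather than printed.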

Call your number to hear the welcome message and collect the call ID from your server logs. Once you have that, make a request to http://<subdomain>.ngrok.io/trigger/<call_id>/3 to tell the user that they are at position number 3 in the queue, then a request to http://<subdomain>.ngrok.io/trigger/<call_id>/2 to inform them that they’re second in the queue, and so on.

The updates can be automated by hooking in to other parts of your real-world application – you don’t need to make requests to this endpoint manually.

The final piece of the puzzle is to take the caller off hold and connect them to an agent. You can achieve this by making an API call to transfer the call to a new NCCO, and returning a connect action in that NCCO containing the phone number that you want the caller to be connected to. I’ll leave writing the code for that bit as an exercise for you.

Conclusion

In this post we’ve played an audio file into a call using both an NCCO and the Nexmo REST API. For most use cases, using an NCCO is the better option as you don’t need to keep track of the call ID. You may choose to use the REST API if you have a sensitive audio file to stream or you need to play audio at a specific point in the call.

If you have any questions about this post, feel free to get in touch by email or join the Nexmo community Slack channel, where we’re waiting and ready to help.
