Press 1 for Twitter

Building a Twitter IVR with text-to-speech and the Nexmo Voice API

Published June 26, 2018 by Aaron Bassett

I’m a bit of a Twitter addict. Like many other techies, I joined when Twitter exploded in popularity at SXSW in 2007, and it still really shines as a way to keep track of what’s going on at festivals or conferences.

But conference internet is usually not great. You could still tweet via SMS, but without a data connection you can’t follow the conference hashtag or search nearby tweets to find out where the best after-party is. Even if you don’t have data, though, you can probably still make calls. Let’s set up a Twitter bot we can control via a phone call and have it read out the tweets to us!

Before we get started

There are a few things you’ll need before we get started.

  1. A Nexmo account with a virtual number
  2. A Twitter account and a Twitter application; the python-twitter library has some good documentation on creating one
  3. If you’re running this locally, a way of making your server public. We suggest using ngrok.

Controlling our bot with DTMF

There might be several hashtags we want to follow, or perhaps we want to be able to check the latest tweets from a few different accounts. So we need some way of telling our bot which stream to read to us.

For this, we’re going to use dual-tone multi-frequency signalling (DTMF). When you press a key on your phone’s keypad, the tone you hear is DTMF. We prompt the caller to press a particular number, capture it with the Nexmo input action, and then vary the action we take based on the number entered. If you’ve ever called a customer support line, you’ve likely encountered an interactive voice response (IVR) system, which is what we’re about to create.

When a user calls, we’ll read out a list of the available inputs, prompt them to make a selection, and then use the Twitter API and text-to-speech to read the relevant tweets to them.
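One way to keep the spoken menu and the available options in sync is to derive both from a single table. The sketch below is only an illustration: the MENU entries and the build_prompt helper are assumptions, not taken from the example code.

```python
# Hypothetical menu: keypad digit -> (label read to the caller, Twitter search term).
# The entries are placeholders; swap in the hashtags or accounts you care about.
MENU = {
    "1": ("tweets from the conference hashtag", "#exampleconf"),
    "2": ("the latest tweets from Nexmo", "from:Nexmo"),
}


def build_prompt(menu):
    """Return the sentence read to the caller, listing one option per digit."""
    options = [
        f"Press {digit} for {label}."
        for digit, (label, _query) in sorted(menu.items())
    ]
    return " ".join(options)


print(build_prompt(MENU))
```

Because the prompt is generated from the table, adding or removing a stream updates the introductory message automatically.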

Creating an interactive voice response system with Python and Flask

When a user calls our virtual number, Nexmo will request our Nexmo Call Control Object (NCCO) from our answer URL. The NCCO is a JSON document containing a list of actions Nexmo should perform when someone calls our number. Let’s look at an example.
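As a minimal sketch, an NCCO of this kind can be written as a Python function that returns the list of actions; serialised to JSON, it is what Nexmo would receive. The menu wording and the example.com event URL are placeholders, and the field names (talk, bargeIn, input, maxDigits, eventUrl) follow the NCCO format as it stood when this post was written, so newer versions of the API may expect a different shape for the input action.

```python
import json

# Placeholder: replace with your own public (for example ngrok) URL plus /ivr/.
EVENT_URL = "https://example.com/ivr/"


def answer_ncco(event_url=EVENT_URL):
    """Return the actions Nexmo should run when a call is answered."""
    return [
        {
            "action": "talk",
            # Placeholder wording for the spoken menu.
            "text": "Press 1 for conference tweets. Press 2 for tweets from Nexmo.",
            # Let the caller press a key without waiting for the message to end.
            "bargeIn": True,
        },
        {
            "action": "input",
            # One key press selects one menu option.
            "maxDigits": 1,
            # Nexmo POSTs the captured digit to this URL.
            "eventUrl": [event_url],
        },
    ]


# Serialised, this is the JSON document returned to Nexmo.
print(json.dumps(answer_ncco(), indent=2))
```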

The NCCO above has two actions: First, we use text-to-speech to let the caller know what their options are; we set bargeIn to true so that the caller can press the number at any time, without listening to the entire message, if they already know which option they want. The next action captures the number the user presses and POSTs it to our eventUrl. This eventUrl will return another NCCO, which will tell Nexmo what to do next.

Converting tweets to speech

When our caller inputs a number, it is sent as part of a POST request to our /ivr/ endpoint. Using a few conditionals (come on Guido, just let us have a switch statement already!) we can match the user’s input against our list of options, fetch the matching tweets, and return each tweet as a talk action in our new NCCO. Text-to-speech then reads the tweets out to the user.

If the user has entered a value we don’t recognise, we play a short message telling them that their input was not understood, and then we start the process over again.
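The handler logic can be sketched without Flask or Twitter, which makes it easy to try on its own. This is an illustration under a few assumptions: the webhook body is taken to be JSON with the pressed digit in a dtmf field, the sources mapping and the helper names are hypothetical, and a dictionary lookup stands in for the chain of conditionals described above.

```python
def talk(text):
    """Build a single text-to-speech action."""
    return {"action": "talk", "text": text}


def ivr_ncco(payload, sources, menu_ncco):
    """Return the NCCO for one key press.

    payload   -- parsed webhook body (assumed to carry the digit under "dtmf")
    sources   -- hypothetical mapping of digit -> function returning tweet texts
    menu_ncco -- the initial actions, replayed when the input is not recognised
    """
    fetch = sources.get(payload.get("dtmf"))
    if fetch is None:
        # Unknown digit (or none at all): apologise, then start over.
        sorry = talk("Sorry, I did not understand your input.")
        return [sorry] + list(menu_ncco)
    # One talk action per tweet, read out in order.
    return [talk(text) for text in fetch()]


# Stand-in data source; a real one would call the Twitter API here.
sources = {"1": lambda: ["First tweet", "Second tweet"]}
menu = [talk("Press 1 for tweets."), {"action": "input", "maxDigits": 1}]

print(ivr_ncco({"dtmf": "1"}, sources, menu))
print(ivr_ncco({"dtmf": "9"}, sources, menu))
```

In a Flask view, payload would typically come from request.get_json() and the returned list would be passed to jsonify.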

Try it for yourself

View on GitHub

The example code uses Python and Flask, so I recommend creating a new Python virtual environment, then cloning the code and installing the dependencies.

There is a little configuration required: in app.py, set event_url to your ngrok URL followed by /ivr/, and make sure the required Twitter variables are set in your environment. You can find details about how to create a new Twitter application in the python-twitter documentation.
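A small check at start-up can catch missing credentials before the first call comes in. The sketch below is an assumption about how that might look: the variable names are hypothetical (use whichever names the example app actually reads), and load_twitter_config is not part of the example code.

```python
import os

# Hypothetical names. The four values correspond to the credentials that
# python-twitter's Api class takes: consumer key and secret, plus access
# token key and secret.
REQUIRED_VARS = (
    "TWITTER_CONSUMER_KEY",
    "TWITTER_CONSUMER_SECRET",
    "TWITTER_ACCESS_TOKEN_KEY",
    "TWITTER_ACCESS_TOKEN_SECRET",
)


def load_twitter_config(environ=None):
    """Return the Twitter settings, or raise naming every missing variable."""
    if environ is None:
        environ = os.environ
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Calling load_twitter_config() once when the application starts fails fast with a readable message, instead of surfacing an authentication error halfway through a phone call.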

Once you have all the required variables set, you can run the Flask application. Let’s start it in development mode; this way we’ll get some nice debug output if anything goes wrong. You’ll need to create a couple more environment variables for Flask.

And once they are set, we can run our application with the flask run command.

Try visiting http://127.0.0.1:5000 in your web browser. If everything is running correctly, you should see our NCCO. But to make this server reachable by the Nexmo Voice API, we’ll need it to be public, so make sure ngrok is running and pointing at the correct port (5000 in this case).

You will also need to configure your Nexmo voice application. The easiest way to do this is via the Voice application management section of the Nexmo Dashboard. The application’s Event URL does not matter in this example, as we won’t be handling any of the events sent to it, so set both it and your Answer URL to your ngrok URL.

[Screenshot of the Nexmo voice application screen]

Once you have created and configured your voice application, don’t forget to link a telephone number to it!

Give it a try

To try it out, simply call the Nexmo virtual number you linked to your new voice application. You should hear the introductory message with your different options. Try entering different numbers, or even a number which is not recognised, and listen for the different messages you get back.

Using your own data sources

Edit the example and change the Twitter accounts or hashtag you want to retrieve tweets from, but don’t forget to update your introductory message to reflect your changes.

Of course, you don’t have to pull your messages from Twitter. You can use any data source you like: find out how your stocks are performing, keep a handy supply of dad jokes, or even check what’s at the top of Hacker News.

In fact, you can execute any code you like. We’ve all heard of ChatOps but what about IVROps?

Press 1 to switch it off and back on again

[Image: The IT Crowd, “Have you tried switching it off and back on again?”]

Although you might want to build some authentication into that one…
