Scrape the Web for Updates and Send SMS Alerts with Ruby

Published March 27, 2020 by Ben Greenberg

We have all been there. You wake up and it is a new day, and you just need to know the answer to the most burning question. Is it the weekend yet? Your week has been so busy with professional and personal responsibilities. All you want to do is take a couple of days to sit back and relax.

To answer the question you could, of course, open up your calendar app on your phone or ask your favorite personal digital home assistant. But, why do those things when you could build yourself an app that sends you a text message instead?

We are going to create a Ruby on Rails application that does the following:

  • Allows for both new subscribers and for the ability to unsubscribe from the list
  • Scrapes the answer to our question from isittheweekend.com
  • Sends the answer from our scraped data daily to all the subscribed recipients

To wrap it all together, we will also be creating a Rake task that will run all these tasks at once, and designating its execution once every 24 hours.

If you prefer, you can also find a fully working version of this application on GitHub.

Let’s get started!

Prerequisites

Generate the Rails Application

The first thing we need to do is to create our new Rails application. From your command line execute the following:

This creates the necessary file structure for our Rails application and sets the default database to PostgreSQL.

Once that is done, cd into the directory that was created. Before we install our dependencies, we will add the additional gems our application will use.

Open up the code in your preferred code editor and navigate to the Gemfile. Inside the Gemfile add the following gems:

We are using the nexmo gem to send the SMS updates, the watir and webdrivers gems to make the HTTP request to a site with dynamic JavaScript content, the whenever gem to schedule the Rake task, and the dotenv-rails gem to manage the environment variables.

After you have saved the Gemfile, you are ready to run bundle install from the command line.

The next step is creating our database schema and models.

Create the Database Schema and Models

Now that our Rails application is created and has its dependencies installed, the next task is to create the correct database schema to house the data we will need to operate our application. We need to store the following types of information:

  • Recipients: The list of subscribers with their phone numbers
  • DiffStorage: Copies of the website data to compare against to determine if there was a change

We will use the Rails generator tool to create the migration files and then subsequently edit each one.

Those commands will create both model files in app/models and migration files in db/migrate. Before committing those changes to your application, once the generator actions are done, inspect the files created in both directories to make sure they are correct.

Specifically, in the migration files, you want to ensure that each migration includes t.timestamps, which adds a created_at and updated_at column to the table. You should also see the number and subscribed columns in the Recipient migration file, with types set to string and boolean, respectively. Similarly, you should see a column in the DiffStorage migration file for website_data with the type set to text.

The model files inside app/models should be empty besides the class declarations and their inheritance from ApplicationRecord.

When it looks satisfactory, it is time to run rake db:migrate from the command line. The command will output the results to your console, and if you inspect the db/schema.rb file you will be able to see the schema you created initialized inside the application.

Lastly, we also need to create Messenger and Scraper models, but we do not need a migration for them. To do so, we run the Rails generator again and append a --migration=false flag to it:

Now has come the time to define the logic inside the models.

Defining the Models

As mentioned above, we have four models that are responsible for unique areas of the application:

  • DiffStorage: Checks for any differences in the website data
  • Recipient: Manages adding and removing subscribers
  • Messenger: Manages the sending of SMS messages
  • Scraper: Responsible for scraping the website for data

Defining the DiffStorage Model

The DiffStorage model will contain two class models. One will contain the URL that we are scraping. The second will check for any changes since the last time the website was scraped and invoke the next steps in the application when the conditions are met.

First, let’s define the URL in its own method so that we create a single place where it exists and can be easily modified if we choose to do so later:

Next, the bulk of this model will live inside the #check_last_record class method:

The above method first calls the method in the Scraper class that will begin the website scraping to obtain the most recent snapshot and assigns that data to today_answer. It then wraps the next step inside an if statement asking if there are any records in DiffStorage. If there are previous records stored there, then the method grabs the most recent one and assigns it to yesterday_answer. If there were no previous records then an empty string is assigned to yesterday_answer. Lastly, it sends the recipients and the two variables to the Messenger model to process for sending the message.

Defining the Scraper Model

The Scraper model will be responsible for doing the work of gathering the data from isittheweekend.com to determine if it is indeed the weekend or not. The model will have four class methods and we will define each one here:

Each action within the act of scraping is defined into its own small method so as to keep our concerns separated. The #call method is the point of entry for the class. This is what gets invoked by other methods outside of itself. The #get_url method makes the HTTP request by simulating a Chome browser request using the Watir library and parses it with Nokogiri. The #check_text method checks if any data was obtained. The #store_text method saves that data to the database.

Defining the Messenger Model

Within the Messenger model will be all the code responsible for sending the daily SMS update to the subscribers. We will create a method that will send the message, a method that composes the weekend response text, a method that puts the whole message together, and a method that manages a confirmation message if a subscriber sends a removal request.

First, the method to send the update message:

The value for the text parameter above refers to a class method called #weekend_message. This method will compose the string for the weekend update by checking if today is the same as yesterday or not:

Next, the method containing the HEREDOC string with the body of the message:

Finally, the method to send a removal confirmation message:

The last model we need to define before we continue to the next step is the Recipient model.

Defining the Recipient Model

This model does not contain any of its class methods. The only addition we will make to this model is adding two validations to the data for recipients. These validations will act as a safeguard when adding new phone numbers to the database. We will check that a) a number is indeed being provided in the data and b) the number is not a duplicate of an already existing record. To do these validations we add two lines under the class definition:

Create the Controller and Routes

We are getting close to finishing the construction of our app! The next step is defining the controller actions that will dictate the flow of the application. First, let’s generate the controller using the Rails generator from the command line:

This will create a new empty controller file in app/controllers called weekend_checker_controller.rb and complementary view files in app/views/weekend_checker. We will add an index view shortly. At this point, we’ll focus on the controller.

The controller needs three actions to correspond to three routes: #index, #create and #event. The #index route will be the default and only view for our website. That will be the place where individuals can subscribe to the list. The #create route will be where new numbers get processed. Finally, the #event route will be where the application receives webhook data from the SMS API, including removal requests, and processes them.

These three controller actions need three corresponding routes defined in config/routes.rb:

The penultimate item for our application code setup is creating a basic view for the / route.

Defining the View

In order to subscribe to the SMS list, we will create a view accessible at the top-level of the URL that will contain a sign-up form.

Inside the app/views/weekend_checker folder add an index.html.erb file. It will contain the following code:

The final coding task we have to do is to set up our new Rake task that will run all this code and configure the whenever gem to execute the Rake task once a day.

Create the Rake Task and Schedule It

Once again, we will use a Rails generator from the command line to create the file for our Rake task. From the command line run the following:

The above task will create a file in lib/tasks called scraper.rake. When we open it inside our code editor it will look like this:

Let’s redefine the desc with a short string of what this task will do: desc "Check Website for Any Updates". Next, inside the task block add the DiffStorage#check_last_record class method, which is the entry point for all the work we created previously:

Now that our Rake task is defined, we lastly need to initialize the whenever gem and let it know that we want this task run once a day. To do that, first, we run the initializer command for the gem from the command line:

The above command creates a schedule.rb file inside the config/ folder. Add the following code to that file to run the scraper:check_site_update task daily:

Now that the schedule is created, we need to update the crontab file on our machine to know about this new job. We do that by running bundle exec whenever --update-crontab from the command line. Once that is done, the task is fully initialized and configured to run once a day on our machine.

The code for our application is all set. The only thing that is missing now is creating our Nexmo account, obtaining our Nexmo API credentials and provisioning a virtual phone number to send the daily text messages with. Once we have this information we will add it to our application as environment variables.

Nexmo API Credentials and Phone Number

To create an account navigate to the Nexmo Dashboard and complete the registration steps. Once you have finished registering, you will enter your Dashboard, which will look similar to this:

If you have not done so previously, create a .env file in the top-level directory of your application and add your NEXMO_API_KEY and NEXMO_API_SECRET to it. The values for those are found at the top of the dashboard page under the Your API credentials header.

The next task we need to do inside the dashboard is to provision a phone number. After you click on the Numbers link on the sidebar navigation, a drop-down menu will appear. Once you select the Buy numbers option and then click the Search button you will see a list of possible numbers to acquire:

When searching for numbers by feature, country, and type, it is recommended to select the country that your users will be based in, SMS for features and Mobile for type.

After clicking the orange Buy button for the number you wish to purchase, you can add that number to your .env file as a new variable called FROM_NUMBER:

The last item we need to do inside our dashboard is to provide an externally accessible URL as the event webhook for the phone number. For development purposes, ngrok is a good tool to use, and you can follow this guide on how to get up and running with it.

From the dashboard Numbers sidebar navigation drop-down, once you select Your numbers you will see your newly provisioned phone number in a list presentation. After clicking on the gear icon to manage its properties, a settings dialog menu will be shown to you:

In the above screenshot example, you would replace the Inbound Webhook URL text field with your own URL ending with /webhooks/event.

That’s it! Our code is all finalized and our Nexmo credentials are all set. At this point, you face a choice for running your application. You can either run it locally or you could deploy it to an external hosting provider, like Heroku. In the final step, we will discuss how to run it locally.

If you are interested in deploying it for a more long-term solution, you can visit the GitHub repository and click on the Deploy to Heroku button at the top of the README to start that process.

Running the Application

We are now ready to run our brand new application! In order to run it locally, the Rails event webhook needs to be accessible to the outside world outside of your local environment. For example, if you are using ngrok after following this guide then both the Rails application and ngrok need to be running simultaneously.

To start the Rails application execute the following from your command line:

Then you can navigate in your browser of choice to localhost:3000. You will see the sign-up form you created. Go ahead and fill it out with your phone number and submit it. Now, once the Rake task is run, you should expect to receive an SMS letting you know if it is the weekend and whether today is different than yesterday!

Next Steps

The application we built while whimsical demonstrates the potential for leveraging web scraping and SMS to create an application that delivers updates to subscribers. There are countless potential use cases for an application like this. Whether you are interested in replicating this exact scenario or in porting the code for your own use case, there is, even more, to explore on this topic.

For further exploration of other SMS possibilities check out the following resources:

Leave a Reply

Your email address will not be published.

Get the latest posts from Nexmo’s next-generation communications blog delivered to your inbox.

By signing up to our communications blog, you accept our privacy policy , which sets out how we use your data and the rights you have in respect of your data. You can opt out of receiving our updates by clicking the unsubscribe link in the email or by emailing us at [email protected].