{DEV IN LOOP


Automating boring tasks

First fact: Robe is back on tour, and rumor says he will sing Extremoduro's songs again. Tour tickets disappear in an instant, and this time due to COVID restrictions, they will open each concert's sale approximately 15 days before the date. So the idea of checking this out automatically to buy on time was at the center of the discussion at a table with some nuts and olives and three more or less bored people looking at a playground.

Second fact: it's Friday, and the plan I had has vanished into thin air. Now I have plenty of time, and this is a short project of something I've not yet tried, so I go back to the computer.

If you are looking for a quick recipe to make a Telegram bot tell you when something on a website has changed, so you can take the next step, here it is: call BotFather, create a new bot, add the bot to the group you want to inform, and find the bot's token and the group's id. Write a script that looks for changes in the website you are interested in, and run that script as often as you need. When it meets the conditions, post back to Telegram. Tadaaá. Need more details? Keep reading.

Setting up a Telegram bot

This step is pretty straightforward. I searched for BotFather and added it to the contacts.

When it starts, it tells you what you can do. Use the command /newbot to create a bot. BotFather asks for the new bot's name, and I call it ConciertoRobeBot (fast and easy). It then asks for a username, and when that's done, I get a message with the token. This token controls the bot, so keep it safe, they warn you.

I want the messages to reach more people, so I use a group, but you can message yourself with this same workflow. Add the bot to the group and write a message to it. Anything will do; just drop some data there.

To make the request, I use PhpStorm, but a browser should work fine too. We are looking for the chat id that this request returns.

https://api.telegram.org/bot{TOKEN}/getUpdates
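The response is JSON, and the chat id sits a few levels deep. Here is a minimal sketch of how to fish it out, assuming the response has already been parsed into an object. The function name extractChats and the sample data (ids, titles) are made up for illustration; only the overall shape (result, message, chat.id) follows what the Bot API returns.

```javascript
// Hypothetical helper: collects the distinct chats found in a getUpdates response.
function extractChats(updatesResponse) {
    const chats = new Map();
    for (const update of updatesResponse.result || []) {
        // Normal messages arrive under "message";
        // the bot being added to a group arrives under "my_chat_member".
        const chat = (update.message || update.my_chat_member || {}).chat;
        if (chat) {
            chats.set(chat.id, chat.title || chat.username || chat.first_name || '');
        }
    }
    return [...chats].map(([id, name]) => ({ id, name }));
}

// Invented sample shaped like a real response (the ids are not real).
const sample = {
    ok: true,
    result: [
        { update_id: 1, message: { message_id: 5, chat: { id: -1001234, title: 'Robe tickets', type: 'group' }, text: 'hola' } },
        { update_id: 2, message: { message_id: 6, chat: { id: -1001234, title: 'Robe tickets', type: 'group' }, text: 'test' } }
    ]
};

console.log(extractChats(sample));
```

Group ids are negative numbers, so don't be surprised by the minus sign; it is part of the id and goes into the .env file too.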

And now, let's retrieve the information.

Some web scraping

I'm looking for a change in a button that will go from "PRÓXIMAMENTE" to something else. I don't know which word they will use. Anyways, the moment that button changes, I want to know what it says. We have decided to buy from one of 2 more or less nearby concerts, so the search is more restricted.

I will use JavaScript just because I'm used to it, and the structure is as simple as it can be while still working. I assume you have Node on your system.

~ mkdir concertRobeScrapper
~ cd concertRobeScrapper
~ npm init -y
~ npm install axios cheerio dotenv

I use Axios to handle the requests, Cheerio to parse the page and select by class quickly, and Dotenv because I want to keep the token and chat_id inside a .env file. No linting for this one.

I created a .gitignore, so the .env stays out of the repo, and put the project under Git.

The script is simple enough that I explain it with comments in the code itself.

//require the packages
const axios = require('axios');
const cheerio = require('cheerio');
const dotenv = require('dotenv');

//dotenv retrieves the info from the .env file
dotenv.config();

//this is all the data I use
const TOKEN = process.env.TOKEN;
const CHAT_ID = process.env.CHAT_ID;
const SENDTO = 
    `https://api.telegram.org/bot${TOKEN}/sendMessage`;
const url = 'https://robe.es/gira/';
const conciertos = [
    "https://robe.es/murcia/",
    "https://robe.es/alicante/"
]
let data = {
    'chat_id': CHAT_ID,
    'text': '',
    'parse_mode': 'Markdown'
}
let textTocompare = "PRÓXIMAMENTE";

// axios handles the get request to the url
axios(url).then(response => {
    //here we have the whole page and we need to parse it 
    //and look for the button
    const html = response.data;
    const $ = cheerio.load(html)
    let shouldSend = false;
    //this is the class for all buttons, but I need just two
    $('.us_custom_6675df38').each( (index, value) => {
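        const link = $(value).attr('href');
        if(conciertos.includes(link)){
            //declare texto locally and trim stray whitespace around the label
            const texto = $(value).find('span').first().text().trim();
            if(texto !== textTocompare){
                //the word has changed so I send a message
                shouldSend = true;
                //one line per concert, so two changes don't run together
                data.text += `${link} cambió a ${texto}\n`;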
            }
        }
    });
    if(shouldSend){
        //here I'm posting the updated data to the bot;
        //returning the promise lets the catch below report a failed post
        return axios.post(SENDTO, data);
    }
}).catch(console.error);
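The decision at the heart of the script (is this one of my concerts, and has the label changed?) can also be tried without touching the network. This is a sketch of that idea as a pure function; describeChanges is a hypothetical name, and the button states and the Madrid link below are invented just to exercise it.

```javascript
// Hypothetical refactor: the decision logic without axios or cheerio.
// buttons: [{ href, label }] as they would be scraped from the page.
function describeChanges(buttons, watchedLinks, placeholder) {
    return buttons
        .filter(({ href }) => watchedLinks.includes(href))
        .filter(({ label }) => label.trim() !== placeholder)
        .map(({ href, label }) => `${href} cambió a ${label.trim()}`)
        .join('\n');
}

const watched = ['https://robe.es/murcia/', 'https://robe.es/alicante/'];
// Invented button states, only for trying the function out.
const scraped = [
    { href: 'https://robe.es/murcia/', label: 'PRÓXIMAMENTE' },
    { href: 'https://robe.es/alicante/', label: 'ENTRADAS' },
    { href: 'https://robe.es/madrid/', label: 'ENTRADAS' }
];

const text = describeChanges(scraped, watched, 'PRÓXIMAMENTE');
console.log(text); // https://robe.es/alicante/ cambió a ENTRADAS
```

An empty string means nothing changed, so the caller only needs to check whether the text is empty before posting.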

I need to test along the way, so that nothing fails when the day comes.

~ node scraperapi.js

And, of course, it fails. The request returns the error "Sucuri website firewall access denied." So the search begins, and I find that the Axios request needs to send headers with the proper cookies and the rest of the info for it to work. In the Network tab of the browser's inspector, right-click the request and copy it as cURL; the headers I need are there. I've also found that the IDE somehow overrides the data, and that running the script from the terminal does the trick. Now I get the page and, because the button says what it says (I tampered with the condition a bit so it is always true while testing), the message goes to the Telegram group as expected. Green! Now just one more step.
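Copying the headers by hand gets tedious, so here is a rough sketch that pulls them out of the copied command. It assumes the bash flavour of the copy, where each header appears as -H 'Name: value'; browsers differ in the details (some put cookies under a separate flag), so treat headersFromCurl as a hypothetical starting point. The sample command is invented.

```javascript
// Hypothetical helper: pulls every -H 'Name: value' pair out of a
// copied cURL command and returns them as a plain object.
function headersFromCurl(curlCommand) {
    const headers = {};
    const pattern = /-H\s+'([^']*)'/g;
    let match;
    while ((match = pattern.exec(curlCommand)) !== null) {
        const separator = match[1].indexOf(':');
        if (separator === -1) continue; // not a "Name: value" header
        const name = match[1].slice(0, separator).trim().toLowerCase();
        headers[name] = match[1].slice(separator + 1).trim();
    }
    return headers;
}

// Invented example; a real copy from the browser has many more headers.
const copied = `curl 'https://robe.es/gira/' \\
  -H 'User-Agent: Mozilla/5.0' \\
  -H 'Cookie: example_cookie=abc123'`;

const headers = headersFromCurl(copied);
console.log(headers);
```

The resulting object can be passed as the second argument of the request, as in axios(url, { headers }). Cookies expire, so the copied values may need refreshing from time to time.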

Run the script on schedule

I see two options, and I'm sure there are more. Either I do it locally with a crontab, or I use GitHub Actions. I will explore both.

~ crontab -l

will show the crontabs in place, and to add a new one:

~ env EDITOR=nano crontab -e

Inside it, the syntax is like this.

* * * * * command

The five stars stand for, in order: minute, hour, day of month, month, day of week.

So if I want to log to a file every 10 minutes, I would write:

*/10 * * * * /path/to/script.sh >> /path/to/logs.txt 2>&1

The last bit (2>&1) means that it will log the errors there too.
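To make the */10 in the minute field concrete, this small sketch expands a star or star-with-step field into the values it matches. expandStepField is a hypothetical helper written for this post, not part of cron, and it ignores the lists and ranges that cron also accepts.

```javascript
// Hypothetical helper: expands "*" or "*/step" for a field with the given range.
// It does not handle lists (1,2) or ranges (1-5), which cron also accepts.
function expandStepField(field, min, max) {
    const match = /^\*(?:\/(\d+))?$/.exec(field);
    if (!match) throw new Error(`Unsupported field: ${field}`);
    const step = match[1] ? Number(match[1]) : 1;
    if (step < 1) throw new Error(`Invalid step: ${field}`);
    const values = [];
    for (let value = min; value <= max; value += step) values.push(value);
    return values;
}

console.log(expandStepField('*/10', 0, 59)); // [ 0, 10, 20, 30, 40, 50 ]
console.log(expandStepField('*', 0, 23).length); // 24
```

So */10 in the first position fires at minutes 0, 10, 20, 30, 40 and 50 of every hour, six runs per hour.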

And don't forget the permissions of the file you want to write, just in case ;)

chmod 755 logs.txt

To run my script every 10 minutes, I can leave the log part out. I've been told that running it more often would get me blocked.

But for the crontab to run, the computer must be on and awake, and one never knows. So, second option: GitHub Actions.

GitHub allows you to schedule actions

If you don't know where to start, they give you a template under the Actions tab in the repo. The thing to mention is that you can use workflow_dispatch to run and test the workflow manually.

But what to do with the .env that was, of course, not committed? Well, that's easy when you know it. Under the repo settings, there is a secrets tab, where you can add your variables. To retrieve them during the action's workflow, I created a .env file on the fly and copied them there.

name: Scheduled action
on:
    # Allows you to run this workflow manually from the Actions tab
    workflow_dispatch:
    schedule:
        - cron: '*/10 * * * *'
jobs:
    build:
        runs-on: ubuntu-latest
        steps:
            - uses: actions/checkout@v2
            - name: install dependencies
              run: npm install
            - name: create env file
              # assumes the repo secrets are named TOKEN and CHAT_ID
              run: |
                  touch .env
                  echo "TOKEN=${{ secrets.TOKEN }}" >> .env
                  echo "CHAT_ID=${{ secrets.CHAT_ID }}" >> .env
            - name: run the script
              run: node scraperapi.js
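Since the .env is built on the fly, a secret that was never added ends up, as far as I know, as an empty value, and the script would only complain much later with an unhelpful Telegram error. A small guard at the top of the script could catch that early. This is only a sketch; readSettings is a hypothetical name and the values in the example are placeholders.

```javascript
// Hypothetical guard: returns the settings or throws naming what is missing.
function readSettings(env) {
    const required = ['TOKEN', 'CHAT_ID'];
    const missing = required.filter((name) => !env[name] || env[name].trim() === '');
    if (missing.length > 0) {
        throw new Error(`Missing environment variables: ${missing.join(', ')}`);
    }
    return { token: env.TOKEN.trim(), chatId: env.CHAT_ID.trim() };
}

// In the script this would be called as readSettings(process.env)
// right after dotenv.config(). The values below are placeholders.
console.log(readSettings({ TOKEN: 'abc', CHAT_ID: '-100' }));
try {
    readSettings({ TOKEN: 'abc' });
} catch (error) {
    console.log(error.message); // Missing environment variables: CHAT_ID
}
```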

And that's it—three steps for a concert ticket. Now, what can I say about this experience? Let's bring the mic!

The ShortCast

These are the links that Alexas gave me:

Credits

Róbert Mészáros was on voice and the production part.

Thanks also to Guido, Alexas, Luis, Pau, Judy, Juan, Giuseppe, Tilo and Kyr for the voice and comments.

  • Carmen Maymó