Web_scraper

Read web pages that don't provide RSS feed or APIs (by scraping html)

Neuron documentation

Video

Synapse

            
- name: "Programme-tv"
  signals:
    - order: "qu'est-ce qu'il y a à la télé ce soir"
  neurons:
    - say:
        message: "Probablement rien, mais je regarde"
    - web_scraper:
        url: "http://tvmag.lefigaro.fr/programme-tv/ce_soir_la_tv.html"
        main_selector: "div.tvm-grid-channel__prog"
        title_selector: "div.tvm-grid-channel__logo i"
        description_selector: "div.tvm-grid-channel__part1 h3"
        file_template: "templates/fr_programme_tv.j2"

- name: "Cinema-program"
  signals:
    - order: "qu'est-ce qu'il y a au cinéma en ce moment"
  neurons:
    - say:
        message: "Je regarde"
    - web_scraper:
        url: "http://www.allocine.fr/film/aucinema/"
        main_selector: "section#content-start div.card-entity-list"
        title_selector: "h2.meta-title"
        description_selector: "div.meta-body div.meta-body-actor"
        file_template: "templates/fr_cinema_program.j2"
    - web_scraper:
        url: "http://www.allocine.fr/film/aucinema/?page=2"
        main_selector: "section#content-start div.card-entity-list"
        title_selector: "h2.meta-title"
        description_selector: "div.meta-body div.meta-body-actor"
        file_template: "templates/fr_cinema_program.j2"

        

Template

            
# fr_cinema_program.j2
{% if returncode != 200 %}
    Erreur lors de la récupération des programmes
{% else %}
    Monsieur, en ce moment au cinéma
    {% for g in news: %}
        {{ g['title'] }} : {{ g['content'] }}
    {% endfor %}
{% endif %}


# fr_programme_tv.j2
{% if returncode != 200 %}
    Erreur lors de la récupération des programmes
{% else %}
    Monsieur, ce soir à la télévision:
    {% for g in news: %}
        {{ g['title'] }} : {{ g['content'] }}
    {% endfor %}
{% endif %}