Question 2

…continued: Let’s add some debugging

import requests
from bs4 import BeautifulSoup

url = 'https://vkatsikaros.github.io/dataharvest24-www.github.io/'
response = requests.get(url)

if response.status_code == 200:
    print('Retrieving the webpage. Status code:', response.status_code)
    soup = BeautifulSoup(response.content, 'html.parser')
    titles = soup.find_all('h2')
    
    for title in titles:
        print(title.get_text())
else:
    print('Failed to retrieve the webpage. Status code:', response.status_code)

The diff, adding “poor man’s” debugging (a single print statement):

if response.status_code == 200:
+    print('Retrieving the webpage. Status code:', response.status_code)
    soup = BeautifulSoup(response.content, 'html.parser')
    titles = soup.find_all('h2')

Output:

Retrieving the webpage. Status code: 200

OK, so it’s downloading the page, but the loop still prints no titles. Maybe there is no <h2> in the page? Let’s find out how!
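
One more "poor man's" check, sketched here under assumptions: count which tags the downloaded HTML actually contains, so an empty result for <h2> becomes visible instead of silent. This sketch uses only Python's standard library html.parser rather than BeautifulSoup, and the names TagCounter, count_tags and sample_html are hypothetical. The sample markup is made up for illustration and is not the content of the real page.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts every opening tag seen while parsing an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag names in lowercase
        self.counts[tag] += 1


def count_tags(html):
    """Return a Counter mapping tag name -> number of occurrences."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return parser.counts


# Hypothetical sample markup, NOT the real page
sample_html = "<html><body><h1>Talks</h1><h3>First</h3><h3>Second</h3></body></html>"

counts = count_tags(sample_html)
headings = {tag: n for tag, n in counts.items() if tag in ('h1', 'h2', 'h3', 'h4', 'h5', 'h6')}

print('Number of <h2> tags found:', counts['h2'])
print('Heading tags present:', headings)
```

In the script above, count_tags(response.text) could be called right after the status check. If the count for h2 is 0, the loop has nothing to print, and the headings dictionary hints at which tag to look for instead.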

