Question 2
…continued: Let’s add some debugging
import requests
from bs4 import BeautifulSoup

url = 'https://vkatsikaros.github.io/dataharvest24-www.github.io/'
response = requests.get(url)

if response.status_code == 200:
    print('Retrieving the webpage. Status code:', response.status_code)
    soup = BeautifulSoup(response.content, 'html.parser')
    titles = soup.find_all('h2')
    for title in titles:
        print(title.get_text())
else:
    print('Failed to retrieve the webpage. Status code:', response.status_code)
The diff. Adding “poor man’s” debugging:
  if response.status_code == 200:
+     print('Retrieving the webpage. Status code:', response.status_code)
      soup = BeautifulSoup(response.content, 'html.parser')
      titles = soup.find_all('h2')
Output:
Retrieving the webpage. Status code: 200
OK, so the page is being downloaded, yet no titles are printed. Maybe there is no <h2> on the page? Let’s find out how!
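One quick way to test that hunch from code is to count how many headings of each level a page contains. The sketch below is only an illustration: it uses Python's standard-library `html.parser` instead of BeautifulSoup so that it runs without a network connection, and `TagCounter`, `count_headings` and `sample_html` are made-up names and data, not part of the original script or the real page.

```python
from collections import Counter
from html.parser import HTMLParser


class TagCounter(HTMLParser):
    """Counts how often each opening tag appears in an HTML document."""

    def __init__(self):
        super().__init__()
        self.counts = Counter()

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag names before calling this method.
        self.counts[tag] += 1


def count_headings(html):
    """Return a dict mapping 'h1'..'h6' to how many times each tag appears."""
    parser = TagCounter()
    parser.feed(html)
    parser.close()
    return {f'h{level}': parser.counts[f'h{level}'] for level in range(1, 7)}


# Hypothetical sample markup, NOT the real page from the script above.
sample_html = """
<html><body>
  <h1>Sessions</h1>
  <h3>First talk</h3>
  <h3>Second talk</h3>
</body></html>
"""

heading_counts = count_headings(sample_html)
print(heading_counts)
```

For the sample markup, the `h2` count is zero while `h1` and `h3` are not, which is the kind of result that would explain an empty list from `find_all('h2')`. To check the real page, `response.text` from the script above could be passed to `count_headings` instead of `sample_html`.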