"SSL: certificate_verify_failed" error when scraping https://www.thenewboston.com/

Question 1

python ssl web-scraping ssl-certificate

Bill Jenkins · Dec 29, 2015 · Viewed 87.2k times · Source

Answer

You can tell requests not to verify the SSL certificate:

>>> import requests
>>> url = "https://www.thenewboston.com/forum/category.php?id=15&orderby=recent&page=1"
>>> response = requests.get(url, verify=False)
>>> response.status_code
200

See the requests documentation on SSL certificate verification for more details.
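
If you want to switch verification on and off in one place, a small wrapper like the following might help. This is only a sketch: `fetch` and its `timeout` default are hypothetical names introduced here, not part of the original code. The `verify` argument is passed straight through to `requests.get`, which accepts `True` (the default), `False`, or a path to a CA bundle file.

```python
import requests


def fetch(url, verify=True, timeout=10):
    """GET a URL, passing `verify` straight through to requests.get.

    verify=True    -> validate against the default CA bundle (requests' default)
    verify=False   -> skip certificate validation (insecure; testing only)
    verify="path"  -> validate against the CA bundle file at that path

    `fetch` and the 10-second timeout are illustrative assumptions.
    """
    response = requests.get(url, verify=verify, timeout=timeout)
    # Raise requests.exceptions.HTTPError for 4xx/5xx responses.
    response.raise_for_status()
    return response


# Example call (performs a real network request):
# page = fetch("https://www.thenewboston.com/forum/category.php?id=15&orderby=recent&page=1", verify=False)
```

Calling `fetch(url, verify=False)` reproduces the workaround above. Skipping verification removes the protection against man-in-the-middle attacks, so pointing `verify` at a trusted CA bundle is generally the safer long-term fix.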

Question 2

I started learning Python recently using The New Boston's videos on YouTube, and everything was going great until I got to his tutorial on making a simple web crawler. I understood it with no problem, but when I run the code I get errors that all seem to come down to "SSL: CERTIFICATE_VERIFY_FAILED". I've been searching for an answer since last night. No one else in the comments on the video or on his website seems to be having the same problem, and even when I use someone else's code from his website I get the same results. I'll post the code I got from the website, since it gives me the same error and the version I wrote myself is a mess right now.

import requests
from bs4 import BeautifulSoup

def trade_spider(max_pages):
    page = 1
    while page <= max_pages:
        # forum listing page to crawl
        url = "https://www.thenewboston.com/forum/category.php?id=15&orderby=recent&page=" + str(page)
        source_code = requests.get(url)
        # just get the code, no headers or anything
        plain_text = source_code.text
        # BeautifulSoup objects can be sorted through easily
        soup = BeautifulSoup(plain_text, "html.parser")
        # all links whose class is 'index_singleListingTitles'
        for link in soup.findAll('a', {'class': 'index_singleListingTitles'}):
            href = "https://www.thenewboston.com/" + link.get('href')
            title = link.string  # just the text, not the HTML
            print(href)
            print(title)
            # get_single_item_data(href)
        page += 1  # advance inside the loop so it terminates

trade_spider(1)

The full error is: ssl.SSLError: [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:645)

I apologize if this is a dumb question. I'm still new to programming, but I seriously can't figure this out. I was thinking about just skipping this tutorial, but it's bothering me that I can't fix this. Thanks!
