How to Take a Website Screenshot With Python and Bypass Cloudflare in 3 ways (Free Source Code)

Website Screenshot with Python

This project, taking a website screenshot with Python, is for anyone at any skill level who wants to have some fun with Python. We’ll be using Python and Selenium.

So, whether you’re a beginner looking to expand your Python knowledge with a hands-on project, or a professional who is interested in exploring possibilities and trying new things, this Python project is for you! Let’s get into it…

First, let’s answer a few questions and discuss the basic flow of this project.

How to take a Website Screenshot with Python?

A popular way to capture a website screenshot with Python is to use Selenium. Selenium has many features, and one of the most useful is the ability to take screenshots. To take a screenshot of a webpage, the save_screenshot() method is the one most commonly used.

How to Bypass Cloudflare while taking a Website screenshot with Python and Selenium?

We can bypass Cloudflare by setting a user agent through a Chrome argument, but this only works sometimes. To be on the safe side and make our project work far more reliably, we use selenium-stealth. The details are explained in this blog post.

As a beginner to this project, you may also have these questions in your head.

How do I bypass Captcha in Cloudflare?

We discuss 3 simple ways to take website screenshots with Python and get past the captcha in Cloudflare: first the plain save_screenshot method, then the user-agent argument, and finally selenium-stealth.

How do I bypass Cloudflare bot protection in selenium?

This can be done using selenium-stealth, a project that aims to pass the public bot tests and to maintain a normal reCAPTCHA v3 score. The details are discussed in this blog post.

Have more questions?

Don't worry! We'll answer all your questions in detail.

Here’s what we’ll be covering in detail Step-by-Step:

1. Downloading & Setting up (Python + VS Code)

2. Selenium Project for Website Screenshot with Python, and Python code to take screenshots of that webpage

If you have prior knowledge of Python, or have already set up Python with a suitable code editor, you can move directly to step 2.

Everyone else, follow me to step 1.

Setting up Python and Visual Studio Code

To take a website screenshot with Python, we need everything set up properly, most importantly Python and a code editor.

Here is a tutorial on How To Download And Install Python With Visual Studio Code (2022)

Follow it and come back to this post.

I hope you've followed the above tutorial and have everything set up. Let's move on!

Before I show you how to take a webpage screenshot with Python and Selenium, let us confirm that everything is working fine on your side by creating a new file and printing Hello World on the screen.

Create a New Python File to Print “Hello World”

Let’s check one final time that everything is working properly.

Go to the Visual Studio Code Welcome page.

Click on ‘New File’.

Then go ahead and save it in your desired location (the desktop or any folder).

You should now have a Python file ready to run, just like this.

Let’s print Hello World in the output.

#
print("Hello World")
#

Like this:

Printing hello world with python

Go to the menu bar at the top left and click on File, then select Python as the programming language. Press Ctrl+S or Cmd+S to save the file.

Now select the directory where you want to save the file, give it a suitable name, and it will be saved.

Here I wrote a simple line that prints “Hello World” on the screen using Python’s print() function.

After writing it, find the play button in the top right corner, open the drop-down right next to it, and select Run Python File.

As you can see, the output has successfully been displayed in the terminal.

Now that all the prerequisites are out of the way, let’s move on to the actual coding part.

How To Make a Website Screenshot With Python and Bypass Cloudflare

The process takes Python, Selenium, and some patience.

If you prefer a quick video tutorial, feel free to watch this one:

How to take a Website Screenshot with Python

Below is the final source code for taking a website screenshot with Python and Selenium while also bypassing Cloudflare and most major bot tests.

You can Download it Here or Check it out on Github.

website screenshot with python

Let’s discuss everything step-by-step

I hope you still have the file in which we printed Hello World open.

We’ll start with that same file.

Method 1. Website Screenshot with Selenium using the save_screenshot method

In this section, let's write a Python script that captures screenshots of any website without using any third-party services or APIs.

We will use Selenium with some extra tweaks to try to get past captchas and Cloudflare security measures.

As mentioned above, save_screenshot() is the method used to take a screenshot of a webpage. It saves a PNG file of whatever page is currently open in the web driver.

Basic syntax:

#
driver.save_screenshot("screenshot.png")
#

So, how is it possible to capture screenshots of any given website?

We’ll use Selenium WebDriver to open the web page, capture it, and then save the result.
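Those three steps map onto just two WebDriver calls. Here is a minimal sketch, assuming a driver object has already been created (driver creation is covered further down in this post); the function name capture is made up for illustration and is not part of Selenium.

```python
def capture(driver, url, path):
    """Open url in the given driver, then save a PNG screenshot to path.

    Returns whatever save_screenshot() returns (Selenium returns True
    when the file was written).
    """
    driver.get(url)                      # open the web page
    return driver.save_screenshot(path)  # capture it and save it to disk
```

With a real driver this would be called as capture(driver, "https://learnwithhasan.com", "test.png").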

Code Explanation

To use Selenium WebDriver, we first need to import it:

#
from selenium import webdriver
#

If you don’t have Selenium installed…

Go to the top menu bar and click on Terminal, then select New Terminal from the dropdown. Now type pip install selenium in the terminal and press Enter.

Selenium will be installed in a few moments.
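If you are not sure whether the installation worked, you can ask Python directly. This is a small sketch using only the standard library; the helper names is_importable and installed_version are my own and purely illustrative.

```python
import importlib.util
from importlib import metadata


def is_importable(module_name):
    """Return True if Python can locate the module (without importing it)."""
    return importlib.util.find_spec(module_name) is not None


def installed_version(distribution_name):
    """Return the installed version string, or None if it is not installed."""
    try:
        return metadata.version(distribution_name)
    except metadata.PackageNotFoundError:
        return None


print("selenium importable:", is_importable("selenium"))
print("selenium version:", installed_version("selenium"))
```

If the first line prints False, run the pip command again in the same terminal that VS Code uses to run your file.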

Now we are going to include something known as Chrome options, so we can pass arguments and relevant commands to the WebDriver.

#
from selenium.webdriver.chrome.options import Options
#

To measure how many seconds our code takes to capture a screenshot of a particular website, we’ll be using the time module:

#
import time
#

Let’s declare a new variable to keep track of the time, using the time function like this.

#
start_time = time.time()
#

Now let's add some options to our Chrome browser.

So, define a new Options object.

#
options = Options()
options.add_argument('--headless')
#

We also added an argument called --headless.

Headless means that Chrome runs without a visible window: the website is loaded and rendered in the background by the code, and nothing opens on the screen.
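Because a headless browser has no visible window, the size of the screenshot depends on the window size Chrome is started with. The sketch below collects the switches in a plain list first and only then hands them to Selenium. The --window-size switch is a regular Chrome command-line switch, while the helper names and the 1920x1080 default are assumptions chosen for illustration.

```python
def build_chrome_arguments(headless=True, window_size=(1920, 1080)):
    """Return Chrome command-line switches as a list of strings."""
    arguments = []
    if headless:
        arguments.append("--headless")
    if window_size is not None:
        width, height = window_size
        arguments.append(f"--window-size={width},{height}")
    return arguments


def make_options(arguments):
    """Create a Selenium Options object carrying the given switches."""
    # Imported here so the helper above stays usable without Selenium installed.
    from selenium.webdriver.chrome.options import Options

    options = Options()
    for argument in arguments:
        options.add_argument(argument)
    return options


print(build_chrome_arguments())
```

Calling make_options(build_chrome_arguments()) gives an options object that can be used in place of the one defined above.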

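Now define the web driver that will be launched to take the screenshot.

#
from selenium.webdriver.chrome.service import Service

# Raw string (r"...") so the backslashes in the Windows path are kept literally
service = Service(r"E:\TechVideos\Python\chromedriver.exe")
driver = webdriver.Chrome(service=service, options=options)
#

Above, we created a new driver with the Chrome class from the webdriver module. The path of the Chrome driver goes into a Service object, and our options are passed with the options parameter. (Older Selenium releases took the path as the first parameter and the options as chrome_options; recent Selenium 4 releases no longer accept those.)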
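Every driver you start also launches a Chrome process, and that process keeps running until quit() is called on the driver. One way to make sure this always happens, even when the script fails halfway, is a small context manager. This is a sketch only; managed_driver is a hypothetical helper name, and create_driver stands for any function that returns a driver.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(create_driver):
    """Yield a driver from create_driver() and always quit it afterwards."""
    driver = create_driver()
    try:
        yield driver
    finally:
        # Runs on success and on errors, so no browser process is left behind.
        driver.quit()
```

In this project it could be used as: with managed_driver(lambda: webdriver.Chrome(service=service, options=options)) as driver: followed by the indented screenshot code.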

Download Chrome Driver

You also have to download the ChromeDriver executable for this to work.

To download ChromeDriver, go here:

https://www.chromedriver.chromium.org/downloads

From there, select the driver version that matches the version of the Chrome browser you’re currently using.

For example, if my Chrome browser version is 100, then I’ll download ChromeDriver version 100.

Below is my current version of Chrome.

And I’ll download this version of the Driver
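The matching rule comes down to comparing the first number of the two version strings. A tiny sketch of that check, with made-up helper names and example version strings used only for illustration:

```python
def major_version(version):
    """Return the leading number of a dotted version string."""
    return int(version.strip().split(".")[0])


def versions_match(browser_version, driver_version):
    """True when browser and driver share the same major version."""
    return major_version(browser_version) == major_version(driver_version)


print(versions_match("100.0.4896.60", "100.0.4896.20"))  # True
print(versions_match("100.0.4896.60", "99.0.4844.51"))   # False
```

You can read the browser version from Chrome's About page and the driver version from the name of the download you picked.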

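After downloading the Chrome driver, save it in a location where it is easy to access. Then copy that path and pass it in when creating the driver, like this:

#
from selenium.webdriver.chrome.service import Service

# Raw string (r"...") so the backslashes in the Windows path are kept literally
service = Service(r"E:\TechVideos\Python\chromedriver.exe")
driver = webdriver.Chrome(service=service, options=options)
#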

Chrome Driver Done. Let’s Move back to the Code Explanation.

Code Explanation

Now let’s set up the URL from which we want to take a screenshot.

#
url = "https://learnwithhasan.com"
#

Now open the URL:

#
driver.get(url)
#

After opening the page, we’ll take the screenshot with the following method.

Remember to pass the file name under which you want the screenshot to be saved as the argument.

#
driver.save_screenshot("test.png")  # save the screenshot as test.png
#
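If you take screenshots of several websites, a fixed name like test.png gets overwritten each time. A possible approach is to derive the file name from the URL. The sketch below uses only the standard library; screenshot_filename is a hypothetical helper, not something Selenium provides.

```python
import re
from urllib.parse import urlparse


def screenshot_filename(url):
    """Build a file-system-friendly PNG name from a URL's host and path."""
    parsed = urlparse(url)
    raw = parsed.netloc + parsed.path
    # Replace anything that is not a letter, digit, dot, or hyphen with "_".
    safe = re.sub(r"[^A-Za-z0-9.-]+", "_", raw).strip("_")
    return (safe or "screenshot") + ".png"


print(screenshot_filename("https://learnwithhasan.com"))       # learnwithhasan.com.png
print(screenshot_filename("https://www.neilpatel.com/blog/"))  # www.neilpatel.com_blog.png
```

The result can be passed straight to the method above: driver.save_screenshot(screenshot_filename(url)).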

Then we want to calculate the time elapsed for the overall process of taking the screenshot.

#
elapsed = "%s seconds" % (time.time() - start_time) # the time now minus the start time
#

Now let’s print the elapsed time on the screen

#
print("Done in " + elapsed)
#
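As an alternative sketch of the same timing idea, the start and stop steps can be wrapped in a context manager so they cannot get out of sync. It uses time.perf_counter(), a clock intended for measuring intervals. The names stopwatch and format_elapsed are my own, and the short sleep only stands in for the screenshot code.

```python
import time
from contextlib import contextmanager


def format_elapsed(seconds):
    """Format a duration with two decimal places."""
    return f"{seconds:.2f} seconds"


@contextmanager
def stopwatch(label):
    """Print how long the enclosed block took."""
    start = time.perf_counter()
    try:
        yield
    finally:
        print(f"{label} done in {format_elapsed(time.perf_counter() - start)}")


with stopwatch("Screenshot"):
    time.sleep(0.1)  # placeholder for driver.get(...) and save_screenshot(...)
```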

Source Code for Python & Selenium Screenshot with save_screenshot method

#start
import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

start_time = time.time()

# Run Chrome without opening a visible window
options = Options()
options.add_argument('--headless')

# Path to the Chrome driver (raw string keeps the backslashes literal)
service = Service(r"E:\TechVideos\Python\chromedriver.exe")
driver = webdriver.Chrome(service=service, options=options)

url = "https://learnwithhasan.com"

# Open the URL
driver.get(url)

# Take the screenshot and save it as test.png
driver.save_screenshot("test.png")

# Close the browser now that the screenshot is saved
driver.quit()

# Time elapsed for the whole process: the time now minus the start time
elapsed = "%s seconds" % (time.time() - start_time)

# Print the elapsed time on the screen
print("Done in " + elapsed)
#end

The output of ScreenShot with save_screenshot

So, we successfully took a website screenshot. But this website was not protected by Cloudflare or any other protective measures.

What if a website has Cloudflare protection? Let’s discuss a workaround for that.

Method 2. Bypass Cloudflare for Website Screenshot with Selenium using the user-agent argument

Let’s take another example of a website screenshot, but this time on a Cloudflare-protected website.

So replace https://learnwithhasan.com with https://www.neilpatel.com

#
url = "https://www.neilpatel.com"
#

And study the output.

If the screenshot shows a security check instead of the page itself, the website is protected by Cloudflare, reCAPTCHA, or a similar service.
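The script can also make a rough guess at this without you opening the image. The sketch below is only a heuristic: the marker phrases are my assumptions about wording that such interstitial pages have commonly shown, they are not an official Cloudflare API, and they may change at any time. The helper name looks_like_challenge is hypothetical.

```python
# Assumed marker phrases; adjust them to whatever the blocked page really shows.
CHALLENGE_MARKERS = (
    "just a moment",
    "attention required",
    "checking your browser",
)


def looks_like_challenge(title, page_source=""):
    """Guess whether the loaded page is a bot check rather than real content."""
    text = (title + " " + page_source).lower()
    return any(marker in text for marker in CHALLENGE_MARKERS)


print(looks_like_challenge("Just a moment..."))  # True
print(looks_like_challenge("Example Domain"))    # False
```

After driver.get(url), you could call looks_like_challenge(driver.title) and print a warning before saving the screenshot.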

So, how do we solve this problem?

Let’s see…

Now we will add an argument that sets the user agent. Note the = between the switch name and its value.

Like this:

#
options.add_argument(
    '--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'
)
#

This user agent tells the website that the request comes from a normal user's browser and not from a bot.

Let's see if it works.

And it worked!

We successfully bypassed Cloudflare with the user agent argument.
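To double-check that the argument really took effect, you can ask the browser which user agent it reports to the page. A short sketch, assuming driver is the Selenium driver created with the options above; both function names are made up for illustration.

```python
def user_agent_argument(user_agent):
    """Format a user-agent string as a Chrome command-line switch."""
    return f"--user-agent={user_agent}"


def reported_user_agent(driver):
    """Ask the loaded page which user agent the browser reports."""
    return driver.execute_script("return navigator.userAgent")
```

If reported_user_agent(driver) does not return the string you passed in, the switch was most likely misspelled and Chrome ignored it.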

Source Code for Python & Selenium Screenshot with the user-agent argument

#start
import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service

start_time = time.time()

options = Options()
options.add_argument('--headless')
# Present the user agent of a regular desktop Chrome browser
options.add_argument(
    '--user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/74.0.3729.169 Safari/537.36'
)

# Path to the Chrome driver (raw string keeps the backslashes literal)
service = Service(r"E:\TechVideos\Python\chromedriver.exe")
driver = webdriver.Chrome(service=service, options=options)

url = "https://www.neilpatel.com"

# Open the URL
driver.get(url)

# Take the screenshot and save it as test.png
driver.save_screenshot("test.png")

# Close the browser now that the screenshot is saved
driver.quit()

# Time elapsed for the whole process: the time now minus the start time
elapsed = "%s seconds" % (time.time() - start_time)

# Print the elapsed time on the screen
print("Done in " + elapsed)
#end

The output of ScreenShot with user_agent

But in some cases, even if you add this, Cloudflare or the website’s firewall will still detect you as a bot.

So, how can we be on the safe side?

What can we use to get past the vast majority of these security measures?

Let’s use our final and most effective method.

Method 3. Bypass Cloudflare and Major bots for Python Website Screenshot using Selenium Stealth

While doing my research, I found a project on GitHub known as selenium-stealth. According to the project:

  •  It passes all the public bot tests
  •  It can perform Google account logins if required
  •  It can help with maintaining a normal reCAPTCHA v3 score

So, essentially, it looks like a normal person is browsing the website.

So how do we use it and integrate it into our Python script?

Simply install it by running ‘pip install selenium-stealth’ in the terminal.

It should install in a moment.

After a few tweaks to the previous code, here are the changes.

The WebDriver is created as before, with headless mode plus a few extra arguments and options.

#
from selenium.webdriver.chrome.service import Service

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument("--headless")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
# Path to the Chrome driver (raw string keeps the backslashes literal)
service = Service(r"E:\TechVideos\Python\chromedriver.exe")
driver = webdriver.Chrome(service=service, options=options)
#

And most importantly, the stealth function. Arguments like languages, vendor, platform, webgl_vendor, and renderer make it seem like a real person is browsing the website.

#
from selenium_stealth import stealth

stealth(driver,
        languages=["en-US", "en"],
        vendor="Google Inc.",
        platform="Win32",
        webgl_vendor="Intel Inc.",
        renderer="Intel Iris OpenGL Engine",
        fix_hairline=True,
        )
#
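Several of these arguments correspond to properties that a page can read through JavaScript, such as navigator.languages, navigator.vendor, and navigator.platform, alongside navigator.webdriver, the standard flag that marks an automated browser. As a rough way to see what a page would see, the sketch below reads those properties from the driver and compares them with the values you expect. The function names are hypothetical, and exactly which properties selenium-stealth changes is an assumption on my side, so treat the output as a hint only.

```python
FINGERPRINT_SCRIPT = """
return {
    webdriver: navigator.webdriver,
    languages: navigator.languages,
    vendor: navigator.vendor,
    platform: navigator.platform
};
"""


def read_fingerprint(driver):
    """Return a dict of browser properties as seen by scripts on the page."""
    return driver.execute_script(FINGERPRINT_SCRIPT)


def mismatches(fingerprint, expected):
    """Return the sorted keys whose reported value differs from the expected one."""
    return sorted(
        key for key, value in expected.items() if fingerprint.get(key) != value
    )
```

For example, after driver.get(url) you could call mismatches(read_fingerprint(driver), {"vendor": "Google Inc.", "platform": "Win32"}) and expect an empty list.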

Source Code for Python Screenshot of a webpage using selenium-stealth

#start
import time

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium_stealth import stealth

start_time = time.time()

options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
options.add_argument("--headless")
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)

# Path to the Chrome driver (raw string keeps the backslashes literal)
service = Service(r"E:\TechVideos\Python\chromedriver.exe")
driver = webdriver.Chrome(service=service, options=options)

# Make the browser report the values a regular Chrome installation would
stealth(driver,
        languages=["en-US", "en"],
        vendor="Google Inc.",
        platform="Win32",
        webgl_vendor="Intel Inc.",
        renderer="Intel Iris OpenGL Engine",
        fix_hairline=True,
        )

url = "https://www.neilpatel.com"

# Open the URL
driver.get(url)

# Take the screenshot and save it as test.png
driver.save_screenshot("test.png")

# Close the browser now that the screenshot is saved
driver.quit()

# Time elapsed for the whole process: the time now minus the start time
elapsed = "%s seconds" % (time.time() - start_time)

# Print the elapsed time on the screen
print("Done in " + elapsed)
#end

Output of ScreenShot with selenium_stealth

Here we are!!!

Now you will look like a normal person browsing the web. 

Let's test it again, and it works perfectly fine.✌

Conclusion

So there you go! Now you have the power to perform one of the most basic web automation tasks (taking a screenshot of a webpage), all thanks to Python.

I hope you learned a lot, and I urge you to explore Selenium further if you’re really into web automation.

Also, check out the other helpful Python guides on the blog.

If you have any queries, you can post them in the comments below, and we’ll be happy to help.

Take care and Happy Coding!

