Web Scraping Prompts

AI prompts for web scraping from the LearnWithHasan AI Coding Building Blocks (Data).

#1 Coding Assistant

Create a Basic Web Scraper

Start here: the simplest scraping pattern. From the Web Scraping AI Coding Building Block.

Prompt
Create a simple web scraper that collects data from a website.

Language: [Python, JavaScript, etc.]
Library: [BeautifulSoup, Cheerio, Puppeteer, or suggest one]

I want to scrape: [describe the website and what data you need, e.g., "product prices from an e-commerce category page"]

Requirements:
1. Load the webpage
2. Find the elements containing [product name, price, rating, etc.]
3. Extract the text/values from those elements
4. Save the results to a simple format (list, CSV, or JSON)

Keep it simple. I want to understand the basic pattern first. Show me:
- How to fetch the page
- How to find elements using CSS selectors
- How to extract the text I want
- How to handle multiple items on one page

I'm learning, so explain each step simply. What is a CSS selector and how do I find one?
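To show the basic pattern this prompt asks for, here is a minimal sketch using only Python's standard-library html.parser on an inline HTML sample. The markup, class names, and SAMPLE_HTML below are hypothetical stand-ins for a fetched page; a real scraper would fetch the page first (e.g. with requests) and would typically use BeautifulSoup's CSS selectors instead of a hand-written parser:

```python
from html.parser import HTMLParser

# Hypothetical markup standing in for a fetched e-commerce listing page.
SAMPLE_HTML = """
<div class="product"><span class="name">Widget</span><span class="price">$9.99</span></div>
<div class="product"><span class="name">Gadget</span><span class="price">$19.50</span></div>
"""

class ProductParser(HTMLParser):
    """Collects {"name": ..., "price": ...} dicts from the sample markup."""
    def __init__(self):
        super().__init__()
        self.current = None   # which field we are inside, if any
        self.items = []       # extracted products

    def handle_starttag(self, tag, attrs):
        cls = dict(attrs).get("class", "")
        if tag == "div" and cls == "product":
            self.items.append({})          # start a new product record
        elif tag == "span" and cls in ("name", "price"):
            self.current = cls             # remember which field comes next

    def handle_data(self, data):
        if self.current and self.items:
            self.items[-1][self.current] = data.strip()
            self.current = None

parser = ProductParser()
parser.feed(SAMPLE_HTML)
print(parser.items)
# In a real scraper you would fetch the HTML first, e.g.:
#   import requests
#   html = requests.get(url, timeout=10).text
```

The same loop-over-repeated-elements idea is what a CSS selector like `div.product span.price` expresses in BeautifulSoup: match every element with those classes, then read its text.
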
#2 Coding Assistant

Scrape Multiple Pages

For scraping across many pages. From the Web Scraping AI Coding Building Block.

Prompt
Extend my scraper to collect data from multiple pages (pagination).

Language: [Python, JavaScript, etc.]
Current code: [paste your basic scraper or describe it]

The website has: [pagination like "page 1, 2, 3..." OR "next page" buttons OR infinite scroll]

Requirements:
1. Start from page 1 and collect the data
2. Find the link to the next page
3. Repeat until there are no more pages (or stop after [X] pages)
4. Combine all results into one file
5. Add a small delay between requests (don't overwhelm the server)

Also show me:
- How to detect when I've reached the last page
- How to handle pages that fail to load
- How to save progress so I can resume if interrupted

I'm learning, so explain the pagination patterns and why delays matter.
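The follow-the-next-link loop this prompt describes can be sketched without any network access by simulating the site as a dict. FAKE_SITE, fetch, and scrape_all are illustrative names, not a real API; in practice fetch would make an HTTP request and parse out the "next page" link:

```python
import time

# Simulated site: each "page" has items plus a link to the next page (None = last page).
FAKE_SITE = {
    "/page/1": {"items": ["a", "b"], "next": "/page/2"},
    "/page/2": {"items": ["c"],      "next": "/page/3"},
    "/page/3": {"items": ["d", "e"], "next": None},
}

def fetch(url):
    """Stand-in for an HTTP request; returns already-parsed page data."""
    return FAKE_SITE[url]

def scrape_all(start_url, max_pages=10, delay=0.0):
    results, url, pages = [], start_url, 0
    while url and pages < max_pages:     # stop at the last page or a page cap
        page = fetch(url)
        results.extend(page["items"])
        url = page["next"]               # None signals we reached the last page
        pages += 1
        if url:
            time.sleep(delay)            # be polite: pause between requests
    return results

print(scrape_all("/page/1", delay=0.0))  # ['a', 'b', 'c', 'd', 'e']
```

The two stop conditions (no next link, or a page cap) cover the "detect the last page" and "stop after X pages" requirements above; resuming after an interruption would mean persisting `results` and the current `url` to disk each iteration.
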
#3 Coding Assistant

Build a Robust Scraper with Fallbacks

For handling real-world scraping challenges. From the Web Scraping AI Coding Building Block.

Prompt
Make my web scraper more reliable and handle common problems.

Language: [Python, JavaScript, etc.]
Current code: [paste your scraper or describe it]

Problems I want to handle:
1. Websites that block my requests (403 errors)
2. Elements that sometimes don't exist on a page
3. Rate limiting (too many requests too fast)
4. Network timeouts and connection errors
5. Data that needs cleaning (extra whitespace, weird characters)

Requirements:
1. Add retry logic for failed requests (try 3 times before giving up)
2. Use realistic browser headers so I don't look like a bot
3. Add random delays between requests (1-3 seconds)
4. Gracefully skip items with missing data instead of crashing
5. Log what worked and what failed for debugging

Optional: Add proxy support for when my IP gets blocked.

I'm learning, so explain why sites block scrapers and how these techniques help.
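The retry-with-delay requirement above can be sketched as a small wrapper. fetch_with_retry and the flaky stand-in below are hypothetical; a real version would wrap requests.get and also send realistic headers (e.g. a browser-like "User-Agent") on each attempt:

```python
import random
import time

def fetch_with_retry(fetch, url, retries=3, base_delay=0.0):
    """Try fetch(url) up to `retries` times, pausing between attempts."""
    for attempt in range(1, retries + 1):
        try:
            return fetch(url)
        except Exception as exc:
            print(f"attempt {attempt} for {url} failed: {exc}")   # log for debugging
            if attempt == retries:
                return None              # give up gracefully instead of crashing
            # Grow the delay each attempt and add jitter so requests don't look robotic.
            time.sleep(base_delay * attempt + random.uniform(0, 0.01))

# Flaky stand-in for a real request: fails twice, then succeeds.
calls = {"n": 0}
def flaky(url):
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("timed out")
    return "<html>ok</html>"

print(fetch_with_retry(flaky, "https://example.com"))
```

Skipping items with missing data follows the same shape: wrap the per-item extraction in try/except (or check for None selectors) and log-and-continue rather than letting one bad item kill the run.
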
#4 Coding Assistant

Explain My Scraping Code

Understand existing scraping code. From the Web Scraping AI Coding Building Block.

Prompt
I have some web scraping code but I don't fully understand what it's doing. Please explain it to me.

Here's my scraping code:
[paste your scraping code here]

Please explain:
1. What website/data is this scraper targeting?
2. Walk through it line by line. What happens at each step?
3. How does it find the data on the page (what selectors)?
4. What happens if the page structure changes?
5. Are there any risks or improvements I should consider?

Also check for:
- Missing error handling
- No delays (might get blocked)
- Hardcoded values that should be configurable
- Missing data validation

I'm learning, so explain like I'm new to web scraping.