Content Type
Static (Markdown)
Trigger Keywords
scrape, web, extract, html, parse, crawl, data extraction, website
Required Packages
beautifulsoup4, requests, lxml, pandas
Instructions Preview
# Web Scraping Skill
Extract structured data from websites using Python. Use the helper script or write custom code.
## Using the Helper Script
```bash
# Scrape a single page and extract all links
python /home/daytona/skills/web-scraping/scraper.py https://example.com --links
# Extract text content
python /home/daytona/skills/web-scraping/scraper.py https://example.com --text
# Extract data using CSS selectors
python /home/daytona/skills/web-scraping/scraper.py https://example.com --selector "h1,h2,h3" --output /home/daytona/out/headings.json
# Extract table data to CSV
python /home/daytona/skills/web-scraping/scraper.py https://example.com --tables --output /home/daytona/out/tables.csv
# Full page analysis
python /home/daytona/skills/web-scraping/scraper.py https://example.com --analyze
```
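When the helper script is not available, selector-based extraction similar to the `--selector` option can be approximated with BeautifulSoup's `select()`. The following is a minimal sketch, not the helper script's actual implementation: the function name `extract_by_selector`, the output shape, and the inline sample HTML are assumptions made for illustration.

```python
import json

from bs4 import BeautifulSoup


def extract_by_selector(html, selector):
    """Return the tag name and text of every element matching a CSS selector.

    Hypothetical helper for illustration; the output format of scraper.py
    may differ.
    """
    # html.parser ships with Python, so no extra parser is needed here.
    soup = BeautifulSoup(html, "html.parser")
    return [
        {"tag": element.name, "text": element.get_text(strip=True)}
        for element in soup.select(selector)
    ]


# Inline sample document so the sketch runs without network access.
sample_html = (
    "<html><body><h1>Title</h1><p>Intro</p><h2>Section</h2></body></html>"
)

print(json.dumps(extract_by_selector(sample_html, "h1,h2,h3"), indent=2))
```

Passing a comma-separated selector such as `"h1,h2,h3"` matches any of the listed tags, and results come back in document order.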
## Custom Scraping Code
### Basic Page Fetching
```python
import requests
from bs4 import BeautifulSoup
def fetch_page(url):
    headers = {
        'User-Agent': 'Mozilla/5.0 (compatibl...