SEO Research with Python

Discover effective SEO research and optimization with Python, no premium tools required! 💰🤑

Why SEO Research with Python?

Python is a versatile, powerful programming language that offers several benefits for SEO professionals:

  • Open-source and freely available
  • Extensive ecosystem of libraries for data analysis, web scraping, and automation
  • Easy to learn and write, even for beginners
  • Integrates well with various APIs and data sources

Target Audience

This guide is tailored for TheITApprentice volunteers who have a basic understanding of Python and want to leverage it for SEO research and optimization. No prior SEO experience is necessary!

Essential Python Tools for SEO

Python Setup

Before we dive into the SEO techniques, let's set up our Python environment:

  1. Install Python from the official website:
  2. Create a virtual environment to keep your projects isolated:
python -m venv myenv

Key Libraries

We'll be using the following Python libraries for our SEO tasks:

  • requests: For making HTTP requests and retrieving web pages
  • BeautifulSoup: For parsing and extracting data from HTML and XML documents
  • pandas: For data manipulation and analysis
  • pytrends: For accessing Google Trends data
  • Google APIs: For integrating with various Google services like Analytics and Search Console

Install these libraries using pip:

pip install requests beautifulsoup4 pandas pytrends google-api-python-client

Python-Powered SEO Techniques

1. Keyword Research Using Python

Discover trending keywords with pytrends:

from pytrends.request import TrendReq

pytrends = TrendReq(hl='en-US', tz=360)

keywords = ['IT apprenticeship', 'tech apprenticeship', 'IT career', 'IT training', 'technology skills']
pytrends.build_payload(kw_list=keywords, timeframe='today 12-m')

interest_over_time = pytrends.interest_over_time()
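
As a rough sketch of what to do with the result: interest_over_time() returns a pandas DataFrame with one column per keyword plus an isPartial flag column. The snippet below uses a small hand-made DataFrame with that shape (the numbers are invented for illustration, not real Google Trends data) and ranks the keywords by average interest. The helper name rank_keywords is hypothetical.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by pytrends.interest_over_time().
# The values are made up for illustration only.
sample = pd.DataFrame(
    {
        "IT apprenticeship": [10, 20, 30],
        "IT career": [50, 60, 70],
        "IT training": [40, 40, 40],
        "isPartial": [False, False, True],
    },
    index=pd.to_datetime(["2024-01-07", "2024-01-14", "2024-01-21"]),
)


def rank_keywords(df):
    """Return average interest per keyword, highest first."""
    # Drop the isPartial flag column if present so only keyword columns remain.
    scores = df.drop(columns=["isPartial"], errors="ignore")
    return scores.mean().sort_values(ascending=False)


ranking = rank_keywords(sample)
print(ranking)
```

With real data, the same function could be called on the interest_over_time DataFrame directly.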

2. Competitor Website Analysis

Scrape competitor websites with BeautifulSoup and requests:

import requests
from bs4 import BeautifulSoup

url = ''  # set this to the competitor page you want to analyze
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

# Extract meta tags
meta_tags = soup.find_all('meta')
for tag in meta_tags:
    print(tag.get('name'), tag.get('content'))

# Extract headings
headings = soup.find_all(['h1', 'h2', 'h3'])
for heading in headings:
    print(heading.name, heading.text.strip())
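
Once tags like these are extracted, a common next step is to compare them against simple on-page guidelines. The sketch below is one way to do that in plain Python. The function name audit_page and the length limits (60 characters for titles, 160 for meta descriptions) are illustrative assumptions, not fixed rules from any search engine, and the sample values are hypothetical.

```python
# Illustrative limits; commonly quoted rules of thumb, not official values.
MAX_TITLE_LENGTH = 60
MAX_DESCRIPTION_LENGTH = 160


def audit_page(title, description, h1_headings):
    """Return a list of human-readable issues for one page's extracted tags."""
    issues = []
    if not title:
        issues.append("missing title")
    elif len(title) > MAX_TITLE_LENGTH:
        issues.append(f"title too long ({len(title)} characters)")
    if not description:
        issues.append("missing meta description")
    elif len(description) > MAX_DESCRIPTION_LENGTH:
        issues.append(f"meta description too long ({len(description)} characters)")
    if len(h1_headings) != 1:
        issues.append(f"expected one h1, found {len(h1_headings)}")
    return issues


# Hypothetical values, as they might come out of a scraper like the one above.
issues = audit_page(
    title="IT Apprenticeships for Beginners",
    description="",
    h1_headings=["Start your IT career", "Apply today"],
)
print(issues)
```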

3. Content Optimization Analysis

Analyze text content using the nltk library (install it with pip install nltk):

import nltk
from nltk.corpus import stopwords
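from nltk.tokenize import word_tokenize

# Download the tokenizer model and the stop-word list (only needed once).
# Newer NLTK releases may also need nltk.download('punkt_tab').
nltk.download('punkt')
nltk.download('stopwords')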

text = "Your content goes here..."

# Tokenize and remove stop words
tokens = word_tokenize(text)
stop_words = set(stopwords.words('english'))
filtered_tokens = [word for word in tokens if word.casefold() not in stop_words]

# Calculate keyword density
keyword_density = {}
for word in filtered_tokens:
    if word not in keyword_density:
        keyword_density[word] = 1
    else:
        keyword_density[word] += 1

total_words = len(filtered_tokens)
for word, count in keyword_density.items():
    keyword_density[word] = (count / total_words) * 100
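
The same density calculation can be written more compactly with collections.Counter. The sketch below is self-contained: it swaps NLTK's tokenizer for a simple regular expression and uses a tiny hand-picked stop-word set, both assumptions made so the example runs without downloads. The helper name keyword_density and the sample text are hypothetical.

```python
import re
from collections import Counter

# Tiny illustrative stop-word set; NLTK's English list is much longer.
STOP_WORDS = {"a", "an", "and", "for", "in", "is", "of", "the", "to"}


def keyword_density(text, top_n=3):
    """Return the top_n (word, percentage) pairs among non-stop words."""
    # Lowercase the text and keep alphabetic runs only (drops punctuation and digits).
    words = re.findall(r"[a-z]+", text.lower())
    filtered = [word for word in words if word not in STOP_WORDS]
    if not filtered:
        return []
    counts = Counter(filtered)
    total = len(filtered)
    return [(word, count / total * 100) for word, count in counts.most_common(top_n)]


sample_text = "Python is great for SEO. SEO research in Python is fun, and Python is free."
for word, density in keyword_density(sample_text):
    print(f"{word}: {density:.1f}%")
```

Lowercasing before counting means "SEO" and "seo" are treated as the same keyword, which is usually what a density check wants.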


4. SEO Audit with Python

Check site health with requests:

import requests

urls = ['', '', '']  # fill in the URLs you want to check

for url in urls:
    response = requests.get(url)
    status_code = response.status_code
    if status_code == 200:
        print(f"{url} is accessible (status code: {status_code})")
    else:
        print(f"{url} is not accessible (status code: {status_code})")
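
A status of 200 is only one of several outcomes worth distinguishing in an audit: redirects, client errors, and server errors usually call for different follow-up. The helper below is a minimal sketch of such a classification. The function name, the labels, and the sample paths are assumptions chosen for illustration.

```python
def classify_status(status_code):
    """Map an HTTP status code to a coarse audit label."""
    if 200 <= status_code < 300:
        return "ok"
    if 300 <= status_code < 400:
        return "redirect"
    if 400 <= status_code < 500:
        return "client error"
    if 500 <= status_code < 600:
        return "server error"
    return "unknown"


# Hypothetical results, e.g. collected from response.status_code in the loop above.
sample_results = {"/": 200, "/old-page": 301, "/missing": 404, "/api": 503}
report = {path: classify_status(code) for path, code in sample_results.items()}
print(report)
```

Note that requests follows redirects by default, so 3xx codes only show up if you call requests.get(url, allow_redirects=False).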

Find backlinks with a custom web scraper:

import requests
from bs4 import BeautifulSoup

url = ''  # set this to the page whose links you want to collect
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

backlinks = []
for link in soup.find_all('a'):
    href = link.get('href')
    if href and '' not in href:  # put your own domain in the quotes to skip internal links
        backlinks.append(href)

print(backlinks)
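
A substring test on the href can misclassify relative links and look-alike domains. One alternative, sketched below with the standard library's urllib.parse, resolves each href against the page URL and compares hostnames. The function name split_links, the example.com addresses, and the sample hrefs are placeholders for illustration.

```python
from urllib.parse import urljoin, urlparse


def split_links(page_url, hrefs):
    """Split hrefs into (internal, external) lists of absolute URLs."""
    page_host = urlparse(page_url).netloc.lower()
    internal, external = [], []
    for href in hrefs:
        # Resolve relative links such as "/about" against the page URL.
        absolute = urljoin(page_url, href)
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, javascript:, tel: and similar
        if parsed.netloc.lower() == page_host:
            internal.append(absolute)
        else:
            external.append(absolute)
    return internal, external


# Placeholder data; in practice hrefs would come from link.get('href').
internal, external = split_links(
    "https://example.com/blog/",
    ["/about", "post-1", "https://example.org/ref", "mailto:hi@example.com"],
)
print(internal)
print(external)
```

This comparison treats subdomains (for example www.example.com and example.com) as different hosts, so you may want to normalize them first.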


6. Tracking SEO Performance

Fetch data from Google Analytics API:

from google.oauth2 import service_account
from googleapiclient.discovery import build

SERVICE_ACCOUNT_FILE = 'path/to/your/service_account.json'
VIEW_ID = 'your_view_id'

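# Read-only Analytics scope for the service account
credentials = service_account.Credentials.from_service_account_file(
    SERVICE_ACCOUNT_FILE,
    scopes=['https://www.googleapis.com/auth/analytics.readonly'],
)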
analytics = build('analyticsreporting', 'v4', credentials=credentials)

# Request sessions per page path for the last seven days
response = analytics.reports().batchGet(
    body={
        'reportRequests': [{
            'viewId': VIEW_ID,
            'dateRanges': [{'startDate': '7daysAgo', 'endDate': 'today'}],
            'metrics': [{'expression': 'ga:sessions'}],
            'dimensions': [{'name': 'ga:pagePath'}]
        }]
    }
).execute()

for report in response.get('reports', []):
    for row in report.get('data', {}).get('rows', []):
        print(row['dimensions'][0], row['metrics'][0]['values'][0])
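
To work with the report beyond printing it, the rows can be flattened into (page, sessions) pairs and sorted. The sketch below runs on a hand-written dictionary that mimics the shape the loop above reads (reports, then data, then rows, each with dimensions and metrics). The page paths and session counts are invented for illustration, and top_pages is a hypothetical helper.

```python
def top_pages(response, limit=10):
    """Return (page_path, sessions) pairs sorted by sessions, highest first."""
    pages = []
    for report in response.get("reports", []):
        for row in report.get("data", {}).get("rows", []):
            path = row["dimensions"][0]
            # The API returns metric values as strings, so convert before sorting.
            sessions = int(row["metrics"][0]["values"][0])
            pages.append((path, sessions))
    pages.sort(key=lambda item: item[1], reverse=True)
    return pages[:limit]


# Invented sample data shaped like a batchGet response.
sample_response = {
    "reports": [
        {
            "data": {
                "rows": [
                    {"dimensions": ["/blog"], "metrics": [{"values": ["42"]}]},
                    {"dimensions": ["/"], "metrics": [{"values": ["120"]}]},
                    {"dimensions": ["/contact"], "metrics": [{"values": ["7"]}]},
                ]
            }
        }
    ]
}

print(top_pages(sample_response, limit=2))
```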

Congratulations on learning the basics of SEO research with Python!

We hope you can now do effective SEO research and optimization, no premium tools required!

