Introduction to SEO Research with Python

Discover effective SEO research and optimization with Python, no premium tools required! 💰🤑

Why SEO Research with Python?

Python is a versatile, powerful programming language that offers several benefits for SEO professionals:

  • Open-source and freely available
  • Extensive ecosystem of libraries for data analysis, web scraping, and automation
  • Easy to learn and write, even for beginners
  • Integrates well with various APIs and data sources

Target Audience

This guide is tailored for TheITApprentice volunteers who have a basic understanding of Python and want to leverage it for SEO research and optimization. No prior SEO experience is necessary!

Using Python and open-source tools to conduct effective SEO research and optimization

Essential Python Tools for SEO

Python Setup

Before we dive into the SEO techniques, let's set up our Python environment:

  1. Install Python from the official website:
  2. Create a virtual environment to keep your projects isolated:
python -m venv myenv

Key Libraries

We'll be using the following Python libraries for our SEO tasks:

  • requests: For making HTTP requests and retrieving web pages
  • BeautifulSoup: For parsing and extracting data from HTML and XML documents
  • pandas: For data manipulation and analysis
  • pytrends: For accessing Google Trends data
  • Google APIs: For integrating with various Google services like Analytics and Search Console

Install these libraries using pip:

pip install requests beautifulsoup4 pandas pytrends google-api-python-client

Python-Powered SEO Techniques

1. Keyword Research Using Python

Discover trending keywords with pytrends:

from pytrends.request import TrendReq

pytrends = TrendReq(hl='en-US', tz=360)

keywords = ['IT apprenticeship', 'tech apprenticeship', 'IT career', 'IT training', 'technology skills']
pytrends.build_payload(kw_list=keywords, timeframe='today 12-m')

interest_over_time = pytrends.interest_over_time()
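
Once the data is in a DataFrame, ranking the keywords is a short pandas step. The sketch below assumes the result has one column per keyword plus an isPartial flag column. The sample DataFrame, its numbers, and the rank_keywords helper are hypothetical stand-ins so the example runs without calling Google Trends.

```python
import pandas as pd

# Hypothetical stand-in for the DataFrame returned by pytrends.interest_over_time();
# the numbers are made up purely for illustration.
sample = pd.DataFrame(
    {
        "IT apprenticeship": [40, 55, 60],
        "IT career": [70, 65, 75],
        "IT training": [50, 50, 50],
        "isPartial": [False, False, True],
    },
    index=pd.to_datetime(["2024-01-07", "2024-01-14", "2024-01-21"]),
)


def rank_keywords(trends: pd.DataFrame) -> pd.Series:
    """Return the mean interest per keyword, highest first."""
    # Drop the bookkeeping column if present so only keyword columns remain.
    scores = trends.drop(columns=["isPartial"], errors="ignore")
    return scores.mean().sort_values(ascending=False)


ranking = rank_keywords(sample)
print(ranking)
```

With real data, the same helper could be called as rank_keywords(interest_over_time).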

2. Competitor Website Analysis

Scrape competitor websites with BeautifulSoup and requests:

import requests
from bs4 import BeautifulSoup

url = ''
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

# Extract meta tags
meta_tags = soup.find_all('meta')
for tag in meta_tags:
    print(tag.get('name'), tag.get('content'))

# Extract headings
headings = soup.find_all(['h1', 'h2', 'h3'])
for heading in headings:
    print(heading.name, heading.text.strip())
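
To try the extraction logic without fetching a live page, the same idea can be run against a small HTML string. This sketch swaps BeautifulSoup for the standard library's html.parser so it has no dependencies. The OnPageParser class and the sample HTML are hypothetical and only illustrate the approach.

```python
from html.parser import HTMLParser


class OnPageParser(HTMLParser):
    """Collect the <title>, meta description, and h1-h3 headings from HTML."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.meta_description = None
        self.headings = []    # list of (tag, text) tuples
        self._current = None  # tag whose text is currently being captured
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "meta" and (attrs.get("name") or "").lower() == "description":
            self.meta_description = attrs.get("content")
        elif tag in ("title", "h1", "h2", "h3"):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            text = "".join(self._buffer).strip()
            if tag == "title":
                self.title = text
            else:
                self.headings.append((tag, text))
            self._current = None


# Invented sample page used only for this illustration.
SAMPLE_HTML = """
<html><head>
<title>IT Apprenticeships Explained</title>
<meta name="description" content="A short guide to IT apprenticeships.">
</head><body>
<h1>IT Apprenticeships</h1>
<h2>Who can apply?</h2>
</body></html>
"""

parser = OnPageParser()
parser.feed(SAMPLE_HTML)
print(parser.title)
print(parser.meta_description)
print(parser.headings)
```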

3. Content Optimization Analysis

Analyze text content using the nltk library (install it with pip install nltk):

import nltk
from nltk.corpus import stopwords
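from nltk.tokenize import word_tokenize

# Download the tokenizer model and stop-word list (needed once per environment).
# Newer NLTK releases may also need nltk.download('punkt_tab').
nltk.download('punkt')
nltk.download('stopwords')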

text = "Your content goes here..."

# Tokenize and remove stop words
tokens = word_tokenize(text)
stop_words = set(stopwords.words('english'))
filtered_tokens = [word for word in tokens if word.casefold() not in stop_words]

# Calculate keyword density
keyword_density = {}
for word in filtered_tokens:
    if word not in keyword_density:
        keyword_density[word] = 1
    else:
        keyword_density[word] += 1

total_words = len(filtered_tokens)
for word, count in keyword_density.items():
    keyword_density[word] = (count / total_words) * 100
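
The same calculation can be written more compactly with collections.Counter. This is a minimal sketch that avoids the NLTK downloads. The keyword_density function, the tiny STOP_WORDS set, and the sample sentence are hypothetical, and the regex tokenizer is a simplification that also drops punctuation.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; NLTK's English list is far longer.
STOP_WORDS = {"a", "an", "and", "the", "is", "in", "of", "to", "for"}


def keyword_density(text: str, top_n: int = 3) -> list[tuple[str, float]]:
    """Return the top_n words with their share of all non-stop words, in percent."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    words = [w for w in words if w not in STOP_WORDS]
    if not words:
        return []
    counts = Counter(words)
    total = len(words)
    return [
        (word, round(count / total * 100, 1))
        for word, count in counts.most_common(top_n)
    ]


sample = "Python is great for SEO. SEO research in Python is free, and Python is fun."
print(keyword_density(sample))
# → [('python', 33.3), ('seo', 22.2), ('great', 11.1)]
```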


4. SEO Audit with Python

Check site health with requests:

import requests

urls = ['', '', '']

for url in urls:
    response = requests.get(url)
    status_code = response.status_code
    if status_code == 200:
        print(f"{url} is accessible (status code: {status_code})")
    else:
        print(f"{url} is not accessible (status code: {status_code})")
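
A plain accessible/not accessible split hides useful detail, since a 404 and a 503 call for different fixes. The sketch below groups URLs by a coarse label. The label names, the classify_status and summarize helpers, and the example.com URLs are hypothetical. Note that requests follows redirects by default, so 3xx codes would only show up if allow_redirects=False were passed.

```python
def classify_status(status_code: int) -> str:
    """Map an HTTP status code to a coarse audit label."""
    if 200 <= status_code < 300:
        return "ok"
    if 300 <= status_code < 400:
        return "redirect"
    if status_code in (404, 410):
        return "missing"
    if 400 <= status_code < 500:
        return "client error"
    if 500 <= status_code < 600:
        return "server error"
    return "unknown"


def summarize(results: dict[str, int]) -> dict[str, list[str]]:
    """Group URLs by audit label so problem pages are easy to spot."""
    grouped: dict[str, list[str]] = {}
    for url, code in results.items():
        grouped.setdefault(classify_status(code), []).append(url)
    return grouped


# Invented status codes standing in for the results of real requests.
sample_results = {
    "https://example.com/": 200,
    "https://example.com/old-page": 404,
    "https://example.com/api": 503,
}
report = summarize(sample_results)
print(report)
```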

5. Backlink Analysis

Collect the external links on a page with a custom web scraper, as a starting point for backlink research:

import requests
from bs4 import BeautifulSoup

url = ''
response = requests.get(url)
soup = BeautifulSoup(response.text, 'html.parser')

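backlinks = []
for link in soup.find_all('a'):
    href = link.get('href')
    # The domain to exclude is blank here; set it before running,
    # because an empty string matches every href and nothing would be collected.
    if href and '' not in href:
        backlinks.append(href)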
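
A substring check can misfire on relative links and on mailto: or anchor links. One alternative, sketched here with the standard library's urllib.parse, is to resolve each href against the page URL and compare hostnames. The external_links helper and the example URLs are hypothetical, and in this sketch a subdomain counts as a different host.

```python
from urllib.parse import urljoin, urlparse


def external_links(page_url: str, hrefs: list[str]) -> list[str]:
    """Return absolute http(s) links that point to a different host than page_url."""
    page_host = urlparse(page_url).netloc.lower()
    found = []
    for href in hrefs:
        absolute = urljoin(page_url, href)  # resolves relative links like "/about"
        parsed = urlparse(absolute)
        if parsed.scheme not in ("http", "https"):
            continue  # skip mailto:, tel:, javascript:, and similar schemes
        if parsed.netloc.lower() != page_host and absolute not in found:
            found.append(absolute)
    return found


# Invented href values, as they might come from link.get('href').
hrefs = [
    "/about",
    "https://example.com/blog",
    "https://other.example.org/post",
    "mailto:hi@example.com",
    "#top",
    "https://other.example.org/post",
]
print(external_links("https://example.com/", hrefs))
# → ['https://other.example.org/post']
```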


6. Tracking SEO Performance

Fetch data from Google Analytics API:

from google.oauth2 import service_account
from googleapiclient.discovery import build

SERVICE_ACCOUNT_FILE = 'path/to/your/service_account.json'
VIEW_ID = 'your_view_id'

credentials = service_account.Credentials.from_service_account_file(SERVICE_ACCOUNT_FILE)
analytics = build('analyticsreporting', 'v4', credentials=credentials)

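# Note: Reporting API v4 reads Universal Analytics views;
# GA4 properties use the Analytics Data API instead.
response = analytics.reports().batchGet(
    body={
        'reportRequests': [
            {
                'viewId': VIEW_ID,
                'dateRanges': [{'startDate': '7daysAgo', 'endDate': 'today'}],
                'metrics': [{'expression': 'ga:sessions'}],
                'dimensions': [{'name': 'ga:pagePath'}]
            }
        ]
    }
).execute()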
for report in response.get('reports', []):
    for row in report.get('data', {}).get('rows', []):
        print(row['dimensions'][0], row['metrics'][0]['values'][0])
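
Printing rows is fine for a quick look, but a sorted list is easier to act on. The sketch below walks the same response structure as the loop above and returns the pages with the most sessions. The sample_response dictionary, its paths and counts, and the top_pages helper are hypothetical and trimmed to the fields that loop reads.

```python
# Hypothetical response containing only the fields used here;
# the paths and session counts are invented for illustration.
sample_response = {
    "reports": [
        {
            "data": {
                "rows": [
                    {"dimensions": ["/"], "metrics": [{"values": ["120"]}]},
                    {"dimensions": ["/blog"], "metrics": [{"values": ["340"]}]},
                    {"dimensions": ["/contact"], "metrics": [{"values": ["15"]}]},
                ]
            }
        }
    ]
}


def top_pages(response: dict, limit: int = 10) -> list[tuple[str, int]]:
    """Return (page path, sessions) pairs sorted by sessions, highest first."""
    pages = []
    for report in response.get("reports", []):
        for row in report.get("data", {}).get("rows", []):
            path = row["dimensions"][0]
            # Metric values arrive as strings, so convert before sorting.
            sessions = int(row["metrics"][0]["values"][0])
            pages.append((path, sessions))
    return sorted(pages, key=lambda item: item[1], reverse=True)[:limit]


print(top_pages(sample_response, limit=2))
# → [('/blog', 340), ('/', 120)]
```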

Congratulations on learning the basics of SEO Research with Python!

We hope you now enjoy effective SEO research and optimization, no premium tools required!

