Cannibalization Analyzer

Detect competing URLs with Quick metrics or Deep AI content similarity

Data Input & Analysis Mode

Choose how to provide data
Choose analysis type
CSV must contain: query, page, clicks, impressions

Drop your GSC CSV file here or click to browse

Must include: Query, Page, Clicks, Impressions, CTR, Position

Quick Mode Settings

Minimum number of URLs per query to flag as cannibalization
Maximum % of clicks for top URL (lower = more dispersed = more cannibalization)
Minimum impressions per query to analyze
Minimum clicks per query to analyze

Analysis Modes

  • Quick: Fast metrics-based detection using GSC data
  • Deep: AI content similarity using OpenAI embeddings

Pro Tip

Start with Quick Mode for initial audit, then use Deep Mode to analyze specific problem areas with AI content similarity.

Cannibalization Analyzer Tutorial

What This Tool Does:

The Cannibalization Analyzer detects when multiple URLs on your site compete for the same keyword. It provides:

  • Quick Mode (Metrics-Based): Fast detection using GSC click patterns and ranking overlaps
  • Deep Mode (AI Content): Advanced semantic similarity analysis using OpenAI embeddings to find topically overlapping pages
  • Action Planning: Generates prioritized recommendations for consolidating or differentiating competing URLs
  • GSC Integration: Direct OAuth connection to pull up-to-date data from the GSC API without manual CSV exports

How to Use:

  1. Choose Input Mode: Select CSV upload or OAuth (GSC API)
  2. Select Analysis Mode: Quick for fast metrics-based detection, Deep for AI content similarity
  3. Upload Data or Connect:
    • CSV: Upload GSC export with Query, Page, Clicks, Impressions, CTR, Position columns
    • OAuth: Click "Connect to GSC", select property, choose date range
  4. Configure Thresholds: Adjust detection sensitivity in sidebar settings
  5. Run Analysis: Click "Analyze Cannibalization"
  6. Review Results: Check stats, cannibalization instances, and action plan
  7. Download Reports: Export full data for implementation tracking

Analysis Modes Compared:

Feature | Quick Mode | Deep Mode
Detection Method | GSC metrics (clicks, impressions, rankings) | AI content similarity (OpenAI embeddings)
Speed | Very fast (<1 minute) | Slower (depends on URL count)
Cost | Free | ~$0.01-$0.10 per 100 URLs (OpenAI API)
Accuracy | Good for obvious cases | Excellent for subtle content overlaps
Best For | Initial audits, large sites, budget-conscious | Deep analysis, content strategy, AI-powered insights
Requirements | GSC data only | GSC data + OpenAI API key
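
The Deep Mode cost figure can be turned into a rough budget check before choosing how many URLs to analyze. This is a minimal sketch: the function name is hypothetical, and it assumes the ~$0.01-$0.10 per 100 URLs range quoted above scales linearly with URL count, which real API billing (based on tokens) may not do.

```python
def estimate_deep_mode_cost(url_count, low_per_100=0.01, high_per_100=0.10):
    """Return a (low, high) USD estimate for embedding `url_count` URLs.

    Assumption: cost scales linearly with the number of URLs, using the
    per-100-URL range quoted in the comparison above.
    """
    if url_count < 0:
        raise ValueError("url_count must be non-negative")
    scale = url_count / 100
    return round(low_per_100 * scale, 2), round(high_per_100 * scale, 2)


low, high = estimate_deep_mode_cost(500)
print(f"500 URLs: roughly ${low:.2f} to ${high:.2f}")
```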

Quick Mode Thresholds:

  • URL Overlap Threshold (2-5): Minimum number of competing URLs to flag a query. Lower = more sensitive.
  • Click Concentration (40-70%): Max % of a query's clicks the top URL can have for the query to still be flagged; above this, the top URL is considered dominant (not cannibalized). Higher = more queries flagged (more sensitive).
  • Min Impressions (50-200): Minimum impressions to consider a query significant enough to analyze.
  • Min Clicks (5-20): Minimum clicks to filter out low-traffic noise.
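
To make the interaction of the four thresholds concrete, here is a minimal sketch of a metrics-based check. It is not the tool's actual implementation: the function name, the parameter names, the row format (dicts with query, page, clicks, impressions) and the sample rows are all assumptions chosen for illustration.

```python
from collections import defaultdict


def find_cannibalized_queries(rows, min_urls=2, max_top_share=0.60,
                              min_impressions=100, min_clicks=10):
    """Flag queries whose clicks are spread across several URLs."""
    # Sum [clicks, impressions] per page within each query.
    per_query = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for row in rows:
        totals = per_query[row["query"]][row["page"]]
        totals[0] += row["clicks"]
        totals[1] += row["impressions"]

    flagged = []
    for query, pages in per_query.items():
        total_clicks = sum(clicks for clicks, _ in pages.values())
        total_impressions = sum(impr for _, impr in pages.values())
        # Skip low-traffic queries (Min Impressions / Min Clicks).
        if total_impressions < min_impressions or total_clicks < min_clicks:
            continue
        # Skip queries with too few competing URLs (URL Overlap Threshold).
        if len(pages) < min_urls or total_clicks == 0:
            continue
        # Flag only if no single URL dominates (Click Concentration).
        top_share = max(clicks for clicks, _ in pages.values()) / total_clicks
        if top_share <= max_top_share:
            flagged.append({
                "query": query,
                "urls": sorted(pages, key=lambda p: pages[p][0], reverse=True),
                "top_share": round(top_share, 3),
            })
    return flagged


# Made-up sample rows for illustration only.
rows = [
    {"query": "blue widgets", "page": "/widgets/blue", "clicks": 30, "impressions": 400},
    {"query": "blue widgets", "page": "/blog/blue-widgets", "clicks": 25, "impressions": 350},
    {"query": "red widgets", "page": "/widgets/red", "clicks": 90, "impressions": 900},
    {"query": "red widgets", "page": "/blog/red-widgets", "clicks": 5, "impressions": 200},
]
flagged = find_cannibalized_queries(rows)
for item in flagged:
    print(item["query"], item["urls"], item["top_share"])
```

In this sample, "blue widgets" is flagged because its top URL holds about 55% of clicks, while "red widgets" is not because one URL holds about 95%.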

Deep Mode Settings:

  • Similarity Threshold (0.80-0.95): Minimum cosine similarity score to flag pages as similar. Higher = stricter (only very similar pages).
  • Max URLs (50-500): Maximum number of URLs to analyze (to control API costs).
  • OpenAI API Key: Required for embeddings generation. Get key at platform.openai.com/api-keys.
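
The similarity threshold and URL cap can be illustrated with a small sketch. It assumes the page embeddings have already been generated and are available as a dict of URL to vector; the function names and the toy three-dimensional vectors are hypothetical, and no OpenAI API call is made here.

```python
import math
from itertools import combinations


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_pairs(embeddings, threshold=0.85, max_urls=200):
    """Return (url_a, url_b, score) for pairs scoring at or above threshold."""
    # Cap the URL count first, mirroring the Max URLs setting.
    urls = list(embeddings)[:max_urls]
    pairs = []
    for url_a, url_b in combinations(urls, 2):
        score = cosine_similarity(embeddings[url_a], embeddings[url_b])
        if score >= threshold:
            pairs.append((url_a, url_b, score))
    # Most similar pairs first.
    return sorted(pairs, key=lambda pair: pair[2], reverse=True)


# Toy vectors standing in for real embeddings (illustration only).
embeddings = {
    "/a": [1.0, 0.0, 0.2],
    "/b": [0.9, 0.1, 0.25],
    "/c": [0.0, 1.0, 0.0],
}
pairs = similar_pairs(embeddings, threshold=0.85)
for url_a, url_b, score in pairs:
    print(f"{url_a} ~ {url_b}: {score:.3f}")
```

Because every pair is compared, the work grows quadratically with the number of URLs, which is one reason to keep Max URLs modest on large sites.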

Common Cannibalization Fixes:

Issue Type | Recommended Action | Implementation
Duplicate/near-duplicate content | Consolidate pages with 301 redirect | Merge content into strongest URL, redirect others
Similar but distinct content | Differentiate with unique angles | Rewrite to target different subtopics or intents
Internal linking issues | Canonicalize primary URL | Update internal links to consistently point to main URL
Product/category overlap | Use canonical tags | Set rel="canonical" from similar pages to primary version
Blog post series overlap | Create hub/pillar page | Build comprehensive pillar page, link individual posts as spokes
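
For the consolidation case, the redirect plan can be drafted programmatically. The sketch below is illustrative only: the function name is hypothetical, and it assumes "strongest URL" means the URL with the most clicks, which is one possible definition (backlinks or conversions may matter more in practice).

```python
def build_redirect_map(url_clicks):
    """Pick the URL with the most clicks as primary and map the rest to it.

    Assumption: clicks are used as the measure of the strongest URL.
    Ties resolve to whichever URL appears first in the input.
    """
    if not url_clicks:
        return None, {}
    primary = max(url_clicks, key=url_clicks.get)
    redirects = {url: primary for url in url_clicks if url != primary}
    return primary, redirects


# Made-up click counts for one cannibalized query.
primary, redirects = build_redirect_map({
    "/blog/blue-widgets": 25,
    "/widgets/blue": 30,
    "/guides/blue-widget-faq": 4,
})
for source, target in redirects.items():
    print(f"301: {source} -> {target}")
```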

Best Practices:

  • Start with Quick Mode: Run initial audit to identify obvious issues before investing in Deep Mode
  • Use 3-6 Month Date Range: Enough data to show stable patterns rather than short-term fluctuations
  • Filter Brand Queries: Exclude branded terms that intentionally have multiple landing pages (products, categories)
  • Prioritize High-Traffic Queries: Focus on queries with significant impressions/clicks for maximum impact
  • Test Before Consolidating: Use Google Search Console's URL inspection tool to verify canonical relationships
  • Monitor After Changes: Re-run analysis monthly to track improvements and catch new cannibalization
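
Brand filtering is easy to apply before analysis. The following is a minimal sketch, assuming a hand-maintained list of brand terms; the function name and the "acme" brand are hypothetical. It matches whole words only, so unrelated queries that merely contain the brand as a substring are kept.

```python
import re


def drop_brand_queries(queries, brand_terms):
    """Return the queries that do not contain any brand term as a whole word."""
    if not brand_terms:
        return list(queries)
    # Escape each term so punctuation in brand names is matched literally.
    alternatives = "|".join(re.escape(term) for term in brand_terms)
    pattern = re.compile(rf"\b(?:{alternatives})\b", re.IGNORECASE)
    return [query for query in queries if not pattern.search(query)]


queries = ["acme widgets", "blue widgets", "Acme login", "acmeology basics"]
kept = drop_brand_queries(queries, ["acme"])
print(kept)
```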

CSV Format Requirements (CSV Mode):

Your GSC export must include these columns (case-insensitive):

  • Query - The search keyword
  • Page - The landing page URL
  • Clicks - Number of clicks
  • Impressions - Number of impressions
  • CTR - Click-through rate (percentage)
  • Position - Average ranking position
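
Before uploading, the header row can be checked against this list. This is a minimal sketch using only the Python standard library; the function name is hypothetical, and the matching is case-insensitive as described above.

```python
import csv
import io

REQUIRED_COLUMNS = ("query", "page", "clicks", "impressions", "ctr", "position")


def missing_columns(csv_text):
    """Return the required column names absent from the CSV header row."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    # Normalize: drop a leading byte-order mark, trim spaces, lowercase.
    present = {name.lstrip("\ufeff").strip().lower() for name in header}
    return [column for column in REQUIRED_COLUMNS if column not in present]


# Made-up sample data for illustration.
good = "Query,Page,Clicks,Impressions,CTR,Position\nblue widgets,/widgets/blue,30,400,7.5%,3.2\n"
bad = "query,page,clicks\nblue widgets,/widgets/blue,30\n"

print(missing_columns(good))
print(missing_columns(bad))
```

An empty list means the file has every required column; otherwise the list names what to add before re-exporting.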