
seo(robots): disallow tag-filtered blog URLs #4247

Merged
emir-karabeg merged 1 commit into staging from robots-blog-tag on Apr 21, 2026

Conversation

@emir-karabeg
Collaborator

Summary

Adds Disallow: /blog*tag= to the generated robots.txt to prevent crawlers from indexing tag-filtered blog views (both /blog?tag=X and paginated /blog?page=N&tag=X). Per SEO team request to reduce duplicate content surfaced from tag filters.
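For illustration, a minimal sketch of what the change might look like in the touched file (apps/sim/app/robots.ts). This is not the actual file contents: the sibling disallow entries are taken from the review flowchart, the sitemap URL is a placeholder, and a local `Robots` type stands in for `MetadataRoute.Robots` from `next` so the sketch is self-contained.

```typescript
// Stand-in for Next.js's MetadataRoute.Robots type (imported from 'next'
// in the real file); defined locally here so the sketch runs on its own.
type Robots = {
  rules: { userAgent: string; disallow: string[] }
  sitemap?: string
}

// Sketch of apps/sim/app/robots.ts after this PR.
export default function robots(): Robots {
  return {
    rules: {
      userAgent: '*',
      disallow: [
        '/api/',       // illustrative existing entry (from the review flowchart)
        '/workspace/', // illustrative existing entry (from the review flowchart)
        '/blog*tag=',  // new: blocks tag-filtered blog views, incl. paginated ones
      ],
    },
    sitemap: 'https://example.com/sitemap.xml', // placeholder URL
  }
}
```

Next.js writes each `disallow` string verbatim into the generated /robots.txt, so the `*` wildcard survives into the output file.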

Type of Change

  • Other: SEO / robots.txt

Testing

Visit /robots.txt after deploy and confirm the new Disallow: /blog*tag= line is present.

Checklist

  • Code follows project style guidelines
  • Self-reviewed my changes
  • Tests added/updated and passing
  • No new warnings introduced
  • I confirm that I have read and agree to the terms outlined in the Contributor License Agreement (CLA)

@vercel

vercel Bot commented Apr 21, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment
Project Deployment Actions Updated (UTC)
docs Skipped Skipped Apr 21, 2026 10:08pm


@cursor

cursor Bot commented Apr 21, 2026

PR Summary

Low Risk
SEO-only change that adjusts robots.txt crawling rules; it could affect blog discoverability if the pattern is overly broad.

Overview
Updates the generated robots.txt to disallow crawling of tag-filtered blog URLs by adding '/blog*tag=' to the disallow list, preventing indexing of /blog?tag=... (including paginated variants).

Reviewed by Cursor Bugbot for commit eb3281b.

@greptile-apps
Contributor

greptile-apps Bot commented Apr 21, 2026

Greptile Summary

Adds Disallow: /blog*tag= to the generated robots.txt via Next.js MetadataRoute.Robots, preventing crawlers from indexing tag-filtered blog pages (e.g. /blog?tag=X, /blog?page=N&tag=X). The wildcard pattern is a widely-supported Google/Bing extension and Next.js renders it verbatim in the output file.

Confidence Score: 5/5

Safe to merge — single-line addition to an SEO config file with no logic changes.

The change is minimal and correct: the wildcard pattern /blog*tag= is a widely-supported robots.txt extension that matches all tag-filtered blog URLs as intended. Next.js renders disallow strings verbatim so the * is preserved. No logic, auth, or data paths are affected.
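To make the wildcard semantics concrete, here is a small sketch of how a crawler evaluates such a rule, assuming the Google/Bing extension where `*` matches any run of characters and a rule applies when the URL's path-plus-query begins with the pattern (the `$` end anchor is omitted for brevity). The `robotsMatch` helper is hypothetical, not part of any library.

```typescript
// Simplified robots.txt rule matching: '*' matches any run of characters,
// and the pattern is anchored at the start of the URL path + query string.
function robotsMatch(pattern: string, url: string): boolean {
  // Escape regex metacharacters (except '*'), then turn '*' into '.*'.
  const escaped = pattern
    .replace(/[.+?^${}()|[\]\\]/g, '\\$&')
    .replace(/\*/g, '.*')
  return new RegExp('^' + escaped).test(url)
}

// '/blog*tag=' blocks tag-filtered views, including paginated variants:
robotsMatch('/blog*tag=', '/blog?tag=ai')        // true
robotsMatch('/blog*tag=', '/blog?page=2&tag=ai') // true
robotsMatch('/blog*tag=', '/blog')               // false
robotsMatch('/blog*tag=', '/blog/my-post')       // false
```

One caveat worth noting: because the rule is a prefix-with-wildcard, it would also cover any path that merely starts with `/blog` and contains `tag=` later (e.g. a hypothetical `/blog-archive?tag=x`), which is the kind of breadth the risk note above alludes to.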

No files require special attention.

Important Files Changed

Filename: apps/sim/app/robots.ts
Overview: Adds /blog*tag= to the disallow list to prevent crawlers from indexing tag-filtered blog URLs; single-line, low-risk SEO change.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Crawler requests /robots.txt] --> B[Next.js robots function]
    B --> C{URL pattern match}
    C -->|matches /blog*tag=| D[Disallowed - not crawled]
    C -->|matches /api/ or /workspace/ etc| D
    C -->|no disallow match| E[Allowed - crawled and indexed]
    D --> F[Crawler skips URL]
    E --> G[Crawler indexes page]

Reviews (1): Last reviewed commit: "seo(robots): disallow tag-filtered blog ..."

@emir-karabeg emir-karabeg merged commit 3a0e7b8 into staging Apr 21, 2026
10 checks passed
@emir-karabeg emir-karabeg deleted the robots-blog-tag branch April 21, 2026 22:13
