How to use wildcards in robots.txt

By default, a Disallow entry in robots.txt matches by URL prefix. To block every URL beginning with /foo/, the following robots.txt can be used:

User-agent: *
Disallow: /foo/
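The prefix rule above amounts to a simple "starts with" test. Here is a minimal sketch of that behavior in Python; the helper name and the sample paths are illustrative, not part of any robots.txt library:

```python
def blocked(path, rule='/foo/'):
    # Default robots.txt matching: a path is disallowed when it
    # begins with the Disallow value.
    return path.startswith(rule)

print(blocked('/foo/bar.html'))  # True: begins with /foo/
print(blocked('/bar/foo/'))      # False: /foo/ is not a prefix here
```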

One can also use wildcards in robots.txt for more flexible rules. Here are some robots.txt examples using wildcards in URLs.

Using wildcards in robots.txt

To support more complex Disallow rules, we can use a wildcard in the middle of the path. To disallow every URL that contains /foo/ after at least one leading path segment, the following robots.txt can be used (URLs starting with /foo/ itself are already covered by the plain prefix rule shown earlier):

User-agent: *
Disallow: /*/foo/
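Under Google-style wildcard semantics, * matches any run of characters and the rule is still applied as a prefix. A minimal sketch of that matching, assuming a regex translation (this is illustrative, not a full robots.txt parser):

```python
import re

def blocked(rule, path):
    # Sketch: translate a wildcard rule to a regex.
    # '*' becomes '.*'; the rule is then matched as a path prefix.
    pattern = re.escape(rule).replace(r'\*', '.*')
    return re.match(pattern, path) is not None

print(blocked('/*/foo/', '/bar/foo/page.html'))  # True
print(blocked('/*/foo/', '/foo/page.html'))      # False: no segment before /foo/
```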

Block second and higher tag pages in WordPress

To block the second and higher tag pages in WordPress, one can use the following robots.txt:

User-agent: *
Disallow: /tag/*/page/
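The same wildcard-to-regex sketch can check this rule against a couple of made-up WordPress paths (the helper and paths are assumptions for illustration only):

```python
import re

def blocked(rule, path):
    # Sketch: '*' becomes '.*'; the rule is matched as a path prefix.
    pattern = re.escape(rule).replace(r'\*', '.*')
    return re.match(pattern, path) is not None

print(blocked('/tag/*/page/', '/tag/news/page/2'))  # True: paginated tag page
print(blocked('/tag/*/page/', '/tag/news/'))        # False: first page stays crawlable
```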

Block comment feed URLs in WordPress

User-agent: *
Disallow: /*/feed/

Note that this will not block /feed/ itself, since the rule requires at least one path segment before /feed/.
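That distinction can be verified with the same illustrative matcher (a sketch assuming Google-style wildcard semantics, with hypothetical paths):

```python
import re

def blocked(rule, path):
    # Sketch: '*' becomes '.*'; the rule is matched as a path prefix.
    pattern = re.escape(rule).replace(r'\*', '.*')
    return re.match(pattern, path) is not None

print(blocked('/*/feed/', '/2024/01/feed/'))  # True: feed under a post path
print(blocked('/*/feed/', '/feed/'))          # False: the site-wide feed is untouched
```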

