apache:use_.htaccess_to_hard-block_spiders_and_crawlers
Differences
apache:use_.htaccess_to_hard-block_spiders_and_crawlers [2016/10/09 12:52] – created peter | apache:use_.htaccess_to_hard-block_spiders_and_crawlers [2023/07/17 11:20] (current) – removed peter | ||
---|---|---|---|
Line 1: | Line 1: | ||
- | ====== Apache - Use .htaccess to hard-block spiders and crawlers ====== | ||
- | |||
- | The .htaccess is a (hidden) file which can be found in any directory. | ||
- | |||
- | <color red> | ||
- | |||
- | One of the things you can do, with **.htaccess**, | ||
- | |||
- | This blocks excessively active crawlers/ | ||
- | |||
- | Add the following lines to a website' | ||
- | |||
- | <file bash .htaccess> | ||
- | #redirect bad bots to one page | ||
- | RewriteEngine on | ||
- | RewriteCond %{HTTP_USER_AGENT} facebookexternalhit [NC,OR] | ||
- | RewriteCond %{HTTP_USER_AGENT} Twitterbot [NC,OR] | ||
- | RewriteCond %{HTTP_USER_AGENT} Baiduspider [NC,OR] | ||
- | RewriteCond %{HTTP_USER_AGENT} MetaURI [NC,OR] | ||
- | RewriteCond %{HTTP_USER_AGENT} mediawords [NC,OR] | ||
- | RewriteCond %{HTTP_USER_AGENT} FlipboardProxy [NC] | ||
- | RewriteCond %{REQUEST_URI} !\/ | ||
- | RewriteRule .* http:// | ||
- | </file> | ||
- | |||
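The removed revision chains one RewriteCond per crawler, ORs them together, and then excludes a single page from the redirect so that the redirect cannot loop. A more compact variant of the same idea might look like the sketch below. The diff view truncates both the excluded path and the redirect target, so `/blocked.html` and `www.example.com` are hypothetical placeholders, not the original values, and the `<IfModule>` guard and `[R=302,L]` flags are assumptions as well.

```apache
# Sketch only: the page path and host are hypothetical placeholders,
# not the values used in the original rule set.
<IfModule mod_rewrite.c>
    RewriteEngine On

    # Match any of the listed crawlers in the User-Agent header.
    # [NC] makes the comparison case-insensitive.
    RewriteCond %{HTTP_USER_AGENT} (facebookexternalhit|Twitterbot|Baiduspider|MetaURI|mediawords|FlipboardProxy) [NC]

    # Skip the landing page itself so the redirect cannot loop.
    RewriteCond %{REQUEST_URI} !^/blocked\.html$

    # Send every other request from those agents to the one landing page.
    RewriteRule .* http://www.example.com/blocked.html [R=302,L]
</IfModule>
```

Folding the user agents into one alternation removes the need for the `OR` flag on each line, which is easy to forget when adding a new crawler. If a redirect is not wanted at all, replacing the final rule with `RewriteRule .* - [F]` would answer with 403 Forbidden instead; that is a different technique from the one in the removed page.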
apache/use_.htaccess_to_hard-block_spiders_and_crawlers.1476017554.txt.gz · Last modified: 2020/07/15 09:30 (external edit)