About Rejecting Particular Bots

The robots.txt file relies on the goodwill of the bots: nothing forces a crawler to obey it. If undesirable bots still access your website, you can block their requests based on the User-Agent header. This is how to do that in the nginx configuration:

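# The map block must be placed in the http context, outside any server block.
# Each "~" pattern is a case-sensitive regular expression matched against the User-Agent.
map $http_user_agent $is_undesirable_bot {
    default 0;
    ~evil-spider 1;
    ~evil-crawler 1;
    ~evil-bot 1;
}

server {
    # ...
    # Reject flagged bots with 403 Forbidden
    if ($is_undesirable_bot) {
        return 403;
    }
}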
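
The patterns above are case-sensitive, so a bot that identifies itself as "Evil-Bot" would slip through. As a minimal sketch, the map can be written with the `~*` prefix instead, which makes the regular expression case-insensitive. The bot names are still the placeholder names from the example, and this block is meant to replace the map above, not to sit next to it:

```nginx
# Goes in the http context, in place of the map shown earlier.
map $http_user_agent $is_undesirable_bot {
    default 0;
    # "~*" = case-insensitive regular expression; one alternation covers all three names
    "~*(evil-spider|evil-crawler|evil-bot)" 1;
}
```

The `server` block stays the same, since it only checks the value of `$is_undesirable_bot`.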

And here's how you can block requests from undesirable bots in an .htaccess file with Apache:

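RewriteEngine On
RewriteBase /

# Match the User-Agent against the listed bot names; [NC] makes the match case-insensitive
RewriteCond %{HTTP_USER_AGENT} (evil-spider|evil-crawler|evil-bot) [NC]
# Leave the URL unchanged ("-") and answer with 403 Forbidden ([F])
RewriteRule (.*) - [F,L]

# ...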
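
If you would rather not involve mod_rewrite, the same check can probably be expressed with an environment variable and an access rule. This is only a sketch, assuming Apache 2.4 with mod_setenvif and mod_authz_core enabled and an `AllowOverride` setting that permits these directives in .htaccess. The variable name `undesirable_bot` is made up for illustration:

```apache
# Flag the request when the User-Agent matches any listed name (case-insensitive)
SetEnvIfNoCase User-Agent "(evil-spider|evil-crawler|evil-bot)" undesirable_bot

# Allow everyone except requests carrying the flag; flagged requests get 403 Forbidden
<RequireAll>
    Require all granted
    Require not env undesirable_bot
</RequireAll>
```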
