The correct way to deal with annoying bots is to block them in `robots.txt`. But your comments indicate they're ignoring that directive. Blocking by user-agent will ultimately be a cat-and-mouse game, but if you want to do it, you want the following.

First, enable the apache-badbots jail, which reads the Apache access log, if you haven't already. The main portion of the apache-badbots jail is defined in `/etc/fail2ban/nf`, so all you have to do is enable it: create the file `/etc/fail2ban/jail.d/apache-badbots.local` with contents that enable the jail.

Next, modify the apache-badbots filter to include your bots. In it there is a particular line for custom bots:

    badbotscustom = EmailCollector|WebEMailExtrac|TrackBack/1\.02|sogou music spider

The bots are specified using a regular expression, so either replace those or tack yours on the end, separated with `|`s:

    badbotscustom = EmailCollector|WebEMailExtrac|TrackBack/1\.02|sogou music spider|BLEXBot|ltx71|DotBot|Barkrowler

or simply:

    badbotscustom = BLEXBot|ltx71|DotBot|Barkrowler

Next, you'll want to modify the `failregex` line so that the regular expression matches any part of the user agent, not just the whole thing:

    failregex = ^<HOST> -.*"(GET|POST).*HTTP.*".*(?:%(badbots)s|%(badbotscustom)s).*"$

Finally, reload the fail2ban configuration (for example, with `sudo fail2ban-client reload`).

The following information may be helpful for reference.

Looking at `/etc/fail2ban/filter.d/nf` on an up-to-date Ubuntu 16.04 server I have, it looks outdated. In particular there's this comment:

    # DEV Notes:
    # Regexp to catch known spambots and software alike.

I generated a new one from the fail2ban git repository, but it still didn't include those bots (maybe the source is outdated or incomplete). If you're curious, you can generate a new one with the `files/gen_badbots` script; the new file will be available at `config/filter.d/nf`. If you want to use it, replace `/etc/fail2ban/filter.d/nf` with it. Here is the header of the file I generated (without modifications):

    # Generated on Thu Nov 7 14:23: by files/gen_badbots

For reference, the apache-badbots jail is defined in `/etc/fail2ban/nf`. The `%(apache_access_log)s` variable it uses also comes from `/etc/fail2ban/nf` and is defined as `/var/log/apache2/*access.log`.
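A minimal drop-in to enable the jail could look like the sketch below. The section name matches the stock jail name; that `enabled = true` is the only setting you need is an assumption that the rest of the jail's configuration is inherited from the stock definition:

```ini
# /etc/fail2ban/jail.d/apache-badbots.local
# Sketch: turn on the apache-badbots jail defined in the stock configuration.
[apache-badbots]
enabled = true
```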
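The regeneration step described above would look roughly like this. The repository URL is an assumption (the upstream GitHub project), and the script path is taken from the `files/gen_badbots` name in the generated header:

```shell
# Sketch: regenerate the badbots filter from the fail2ban sources.
git clone https://github.com/fail2ban/fail2ban.git
cd fail2ban
./files/gen_badbots          # the regenerated filter appears under config/filter.d/
# Copy the generated file over the one in /etc/fail2ban/filter.d/ if you want to use it.
```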
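To see why the `failregex` modification matters, here is a small sketch in Python (fail2ban filters use Python-flavored regular expressions). The bot lists are tiny illustrative stand-ins, the log line is fabricated, and `(?P<host>\S+)` is a simplified stand-in for fail2ban's `<HOST>` token:

```python
import re

# Illustrative stand-ins for the filter's bot lists (not the full lists)
bots = {
    "badbots": r"Atomic_Email_Hunter/4\.0",
    "badbotscustom": r"BLEXBot|ltx71|DotBot|Barkrowler",
}

# Simplified stand-in for the stock failregex: the bot pattern must be
# the ENTIRE user agent string.
old = (r'^(?P<host>\S+) -.*"(GET|POST).*HTTP.*"'
       r'(?:%(badbots)s|%(badbotscustom)s)"$' % bots)

# Modified failregex: the bot pattern may appear ANYWHERE in the user agent.
new = (r'^(?P<host>\S+) -.*"(GET|POST).*HTTP.*"'
       r'.*(?:%(badbots)s|%(badbotscustom)s).*"$' % bots)

# Fabricated access-log line: the user agent only *contains* "DotBot"
line = ('203.0.113.7 - - [07/Nov/2019:14:23:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
        '"Mozilla/5.0 (compatible; DotBot/1.1; +http://www.opensiteexplorer.org/dotbot)"')

print(bool(re.search(old, line)))  # False: bot name is not the whole user agent
print(bool(re.search(new, line)))  # True: bot name matched anywhere in the agent
```

The leading and trailing `.*` around the alternation are exactly what lets the filter catch agents like DotBot, which pad their name with version and URL text.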