Jun 10, 2018 • Sysadmin

Fighting Mailman Subscription Spam: Leveling Up

Last week, I blogged about my efforts to fight Mailman subscription spam. Enabling SUBSCRIBE_FORM_SECRET as described there did drastically reduce the amount of subscription spam, from more than 1000 to fewer than 10 mails sent per day, but some attackers still got through. My guess is that those machines were simply slow enough that they happened to wait the required five seconds before submitting the form.

So, clearly I had to level up my game. I decided to follow through on my plan to write a simple CAPTCHA for Mailman (one that doesn’t expose your users to Google). This post describes how to configure and install that CAPTCHA.

CAPTCHA configuration

This simple CAPTCHA is based on a list of questions and matching answers that you, the site admin, define. The idea is to use questions that anyone who is interested in your site can easily answer. Since most sites are small enough not to be targeted specifically, the bots (hopefully) will not be able to answer these questions. At least for my sites, that has worked so far (I have been running this patch for a week now).

The CAPTCHA requires SUBSCRIBE_FORM_SECRET to be enabled. The configuration (in mm_cfg.py) can look something like this:

SUBSCRIBE_FORM_SECRET = "<some random string, generated e.g. by [openssl rand -base64 18]>"
CAPTCHAS = [
    # This is a list of questions, each paired with a list of answers.
    ('What is two times six?', ['12', 'twelve']),
    ("What is the name of this site's blog?", ["Ralf's Ramblings"]),
]
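
A malformed CAPTCHAS list is easy to overlook. In particular, if the answers are given as a bare string instead of a list, the membership test in the verification code shown later in this post would compare against the individual characters of that string. The following is a minimal sketch of a sanity check you could run on your list before deploying it. The helper check_captchas is hypothetical (it is not part of Mailman or of my patch), and it is written as plain Python so that it runs without Mailman installed:

```python
def check_captchas(captchas):
    """Return a list of problems found in a CAPTCHAS-style list.

    An empty result means the structure looks usable.
    Hypothetical helper, not part of Mailman.
    """
    problems = []
    if not captchas:
        problems.append('CAPTCHAS is empty')
    for pos, entry in enumerate(captchas):
        # Each entry must be a (question, [answers]) pair.
        if not isinstance(entry, (tuple, list)) or len(entry) != 2:
            problems.append('entry %d is not a (question, answers) pair' % pos)
            continue
        question, answers = entry
        if not isinstance(question, str) or not question.strip():
            problems.append('entry %d has no question text' % pos)
        # A bare string here would be iterated character by character.
        if not isinstance(answers, (list, tuple)):
            problems.append('entry %d: answers must be a list of strings' % pos)
        elif not answers or not all(
                isinstance(a, str) and a.strip() for a in answers):
            problems.append('entry %d has an empty or non-string answer' % pos)
    return problems


CAPTCHAS = [
    ('What is two times six?', ['12', 'twelve']),
    ("What is the name of this site's blog?", ["Ralf's Ramblings"]),
]

for problem in check_captchas(CAPTCHAS):
    print(problem)
```

For the example list above, the loop prints nothing because no problems are found.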

CAPTCHA patch

Right now, the CAPTCHAS part of the configuration will not yet do anything because you still have to install my patch. The patch is loosely based on this blog post and was written against Mailman 2.1.23 on Debian 9 “Stretch”. If you are using a different version, you may have to adapt it accordingly.

First of all, create a new file /usr/lib/mailman/Mailman/Captcha.py with the following content:

import random
from Mailman import Utils

def display(mlist, captchas):
    """Returns a CAPTCHA question, the HTML for the answer box, and
    the data to be put into the CSRF token"""
    idx = random.randrange(len(captchas))
    question = captchas[idx][0]
    box_html = mlist.FormatBox('captcha_answer', size=30)
    return (Utils.websafe(question), box_html, str(idx))

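def verify(idx, given_answer, captchas):
    """Returns True if given_answer is an accepted answer to question idx"""
    # idx arrives as a string from the form token; reject anything
    # that is not a valid index into the list of questions
    try:
        idx = int(idx)
    except ValueError:
        return False
    if idx not in range(len(captchas)):
        return False
    # Check the given answer, ignoring case and surrounding whitespace
    correct_answers = captchas[idx][1]
    given_answer = given_answer.strip().lower()
    return given_answer in [a.strip().lower() for a in correct_answers]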

This contains the actual CAPTCHA logic. Now it needs to be wired up with the listinfo page (where the subscription form is shown to the user) and the subscription page (where the subscription form is submitted to).
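
Before wiring things up, it can help to see how the answer check behaves on a few inputs. The snippet below is only a sketch: it repeats the logic of verify as a standalone function so that it runs without Mailman installed, and the question list is just an example.

```python
def verify(idx, given_answer, captchas):
    """Standalone copy of the check in Captcha.py, for experimenting."""
    try:
        idx = int(idx)
    except ValueError:
        return False
    if idx not in range(len(captchas)):
        return False
    accepted = [a.strip().lower() for a in captchas[idx][1]]
    return given_answer.strip().lower() in accepted


captchas = [('What is two times six?', ['12', 'twelve'])]

print(verify('0', '  Twelve ', captchas))  # case and whitespace are ignored
print(verify('0', '13', captchas))         # wrong answer
print(verify('7', '12', captchas))         # index out of range
print(verify('', '12', captchas))          # empty index, e.g. corrupted token
```

Only the first call is accepted; the other three are rejected. Note that the index is passed as a string, because that is how it comes back from the submitted form.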

Here is the patch for /usr/lib/mailman/Mailman/Cgi/listinfo.py:

--- listinfo.py.orig	2018-06-03 19:18:30.089902948 +0200
+++ listinfo.py	2018-06-10 19:12:59.381910750 +0200
@@ -26,6 +26,7 @@
 
 from Mailman import mm_cfg
 from Mailman import Utils
+from Mailman import Captcha
 from Mailman import MailList
 from Mailman import Errors
 from Mailman import i18n
@@ -216,10 +220,16 @@
             #        drop one : resulting in an invalid format, but it's only
             #        for our hash so it doesn't matter.
             remote = remote.rsplit(':', 1)[0]
+        # get CAPTCHA data
+        (captcha_question, captcha_box, captcha_idx) = Captcha.display(mlist, mm_cfg.CAPTCHAS)
+        replacements['<mm-captcha-question>'] = captcha_question
+        replacements['<mm-captcha-box>'] = captcha_box
+        # fill form
         replacements['<mm-subscribe-form-start>'] += (
-                '<input type="hidden" name="sub_form_token" value="%s:%s">\n'
-                % (now, Utils.sha_new(mm_cfg.SUBSCRIBE_FORM_SECRET +
+                '<input type="hidden" name="sub_form_token" value="%s:%s:%s">\n'
+                % (now, captcha_idx, Utils.sha_new(mm_cfg.SUBSCRIBE_FORM_SECRET +
                           now +
+                          captcha_idx +
                           mlist.internal_name() +
                           remote
                           ).hexdigest()

And here is the patch for /usr/lib/mailman/Mailman/Cgi/subscribe.py:

--- subscribe.py.orig	2018-06-03 19:18:35.761813517 +0200
+++ subscribe.py	2018-06-03 20:35:00.056454989 +0200
@@ -25,6 +25,7 @@
 
 from Mailman import mm_cfg
 from Mailman import Utils
+from Mailman import Captcha
 from Mailman import MailList
 from Mailman import Errors
 from Mailman import i18n
@@ -144,13 +147,14 @@
             #        for our hash so it doesn't matter.
             remote1 = remote.rsplit(':', 1)[0]
         try:
-            ftime, fhash = cgidata.getvalue('sub_form_token', '').split(':')
+            ftime, fcaptcha_idx, fhash = cgidata.getvalue('sub_form_token', '').split(':')
             then = int(ftime)
         except ValueError:
-            ftime = fhash = ''
+            ftime = fcaptcha_idx = fhash = ''
             then = 0
         token = Utils.sha_new(mm_cfg.SUBSCRIBE_FORM_SECRET +
                               ftime +
+                              fcaptcha_idx +
                               mlist.internal_name() +
                               remote1).hexdigest()
         if ftime and now - then > mm_cfg.FORM_LIFETIME:
@@ -165,6 +169,10 @@
             results.append(
     _('There was no hidden token in your submission or it was corrupted.'))
             results.append(_('You must GET the form before submitting it.'))
+        # Check captcha
+        captcha_answer = cgidata.getvalue('captcha_answer', '')
+        if not Captcha.verify(fcaptcha_idx, captcha_answer, mm_cfg.CAPTCHAS):
+            results.append(_('This was not the right answer to the CAPTCHA question.'))
     # Was an attempt made to subscribe the list to itself?
     if email == mlist.GetListEmail():
         syslog('mischief', 'Attempt to self subscribe %s: %s', email, remote)
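
The reason the question index is added to the hashed data, and not just to the form, is that a bot could otherwise swap in the index of a question it already knows the answer to. The following standalone sketch illustrates the round trip between the two patched files. It makes some assumptions: hashlib.sha1 stands in for Utils.sha_new, the names make_token, check_token and SECRET are hypothetical, and the timing checks (minimum wait and FORM_LIFETIME) are left out.

```python
import hashlib
import time

SECRET = 'example-secret'  # stands in for mm_cfg.SUBSCRIBE_FORM_SECRET


def make_token(now, captcha_idx, listname, remote, secret=SECRET):
    """Build 'time:index:hash', mirroring the patched listinfo.py."""
    digest = hashlib.sha1(
        (secret + now + captcha_idx + listname + remote).encode('utf-8')
    ).hexdigest()
    return '%s:%s:%s' % (now, captcha_idx, digest)


def check_token(token, listname, remote, secret=SECRET):
    """Return the CAPTCHA index if the token is intact, else None."""
    try:
        ftime, fidx, fhash = token.split(':')
    except ValueError:
        # Wrong number of fields: treat the token as corrupted.
        return None
    expected = hashlib.sha1(
        (secret + ftime + fidx + listname + remote).encode('utf-8')
    ).hexdigest()
    return fidx if fhash == expected else None


token = make_token(str(int(time.time())), '1', 'mylist', '203.0.113')
# A bot replaces the index with that of a question it can answer:
tampered = token.replace(':1:', ':0:', 1)

print(check_token(token, 'mylist', '203.0.113'))     # index is returned
print(check_token(tampered, 'mylist', '203.0.113'))  # hash no longer matches
```

Since the attacker does not know the secret, it cannot recompute the hash for the swapped index, so the tampered token is rejected.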

Finally, the HTML template for the listinfo page needs to be updated to show the CAPTCHA question and answer box. On Debian, the templates for enabled languages are located in /etc/mailman/<lang>. The patch for the English template looks as follows:

--- /usr/share/mailman/en/listinfo.html	2018-02-08 07:54:28.000000000 +0100
+++ listinfo.html	2018-06-03 20:35:10.680275026 +0200
@@ -116,6 +116,12 @@
       </tr>
       <mm-digest-question-end>
       <tr>
+        <TD BGCOLOR="#dddddd">Please answer the following question to prove that you are not a bot:
+          <mm-captcha-question>
+        </TD>
+        <TD><mm-captcha-box></TD>
+      </tr>
+      <tr>
 	<td colspan="3">
 	  <center><MM-Subscribe-Button></center>
     </td>

If you have other languages enabled, you have to translate this patch accordingly.

That’s it! Now bots have to be adapted to your specific questions to be able to successfully subscribe someone. It is still a good idea to monitor the logs (/var/log/mailman/subscribe on Debian) to see if any illegitimate requests still make it through, but unless your site is really big, I’d be rather surprised to see bots being able to answer site-specific questions.

Update: As of Mailman 2.1.30, this patch is included upstream. The CAPTCHAS format there is slightly different from the one above in order to support multiple languages; consult the Mailman documentation for further details. /Update

Posted on Ralf's Ramblings on Jun 10, 2018.
Comments? Drop me a mail!