Naxsi is pretty stupid, and this is something you can take advantage of, as it makes it hacker-friendly. The big deal with naxsi, as with any other positive-approach WAF, is training.
As explained earlier, you can trick naxsi into generating a specific whitelist while in learning mode. For example, add a rule like this to the location you want to train:
BasicRule id:0 "str:123FREETEXT" "mz:ARGS|BODY|URL" s:BLOCK;
Then, whenever you enter the string “123FREETEXT” in a form field, naxsi will generate a whitelist for id:0 (which means “allow everything”). Let’s say you have a website which allows users to publish comments, and you want users to be able to put anything in comments (because you trust your code). Then, during the learning phase, input 123FREETEXT in the user comment field.
Naxsi will thus generate a whitelist like:
BasicRule wl:0 "mz:$BODY_VAR:comment|$URL:/comment-article.php"
Once this whitelist is in use, naxsi will allow the “comment” argument of POST requests to the URL “/comment-article.php” to contain anything without blocking it.
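Since a wl:0 whitelist switches off every check for that field, it can be worth listing them before deploying a generated rules file. A minimal sketch, assuming the learning output is plain text with one BasicRule per line; the function name `list_allow_all` is hypothetical and not part of naxsi:

```shell
# Hypothetical helper (not part of naxsi): print every "allow everything"
# whitelist found on stdin, prefixed with its line number.
list_allow_all() {
    # wl:0 must be followed by a non-digit, so wl:01 or wl:0123 do not match.
    # "|| true" keeps the exit status at 0 when nothing is found.
    grep -n 'wl:0[^0-9]' || true
}

# Sample lines shaped like the whitelists discussed in this post.
printf '%s\n' \
    'BasicRule wl:0 "mz:$BODY_VAR:comment|$URL:/comment-article.php";' \
    'BasicRule wl:1008 "mz:$BODY_VAR:address|$URL:/comment-article.php";' \
    | list_allow_all
```

Only the first sample line should come back, which is the one you want to double-check by hand.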
Fine, but in some cases you might not want to allow anything, only a restricted subset of uncommon characters.
On this same comment page, users can also enter their postal address to get free gifts (why not!). In this “address” field, you don’t want to allow “any” character, as it’s only a postal address. No need to allow “<” or “>”, for example. But you do want to allow users to use this subset of characters, as they are legitimate in a postal address:
' ; , . # ( )
But having to type them every time is quite annoying, and might lead to human mistakes. Instead, you could just add a new rule to your to-be-trained location:
BasicRule id:4000 str:123address s:BLOCK;
Thus, inputting 123address in this very same form will make naxsi’s learning mode generate a whitelist like this:
BasicRule wl:4000 "mz:$BODY_VAR:address|$URL:/comment-article.php"
Not very useful, right? Actually, totally useless, because rule 4000 doesn’t exist in the naxsi core rule set. But if, once you have finished your training, you run a little magical sed like this (on the output of the learning mode daemon):
cat /tmp/rules.tmp | sed "s/wl:4000/wl:1008,1010,1011,1013,1015,1016/g" > /etc/nginx/naxsi-mysite.rules
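Then it makes much more sense, as we just told naxsi to allow, in the “address” field of /comment-article.php only, the characters covered by those core rule IDs, that is, the postal-address subset listed above.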
Better, but still not that good.
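To see what that substitution does to a single line, here is a minimal sketch that wraps the same sed idea in a function and runs it on a sample whitelist. The function name `expand_wl` is hypothetical, and the only change from the one-liner above is that the match is anchored so a placeholder such as wl:4000 cannot accidentally rewrite wl:40001:

```shell
# Hypothetical helper: replace a placeholder whitelist ID with real core rule IDs.
# Reads learning-mode output on stdin, writes the rewritten rules on stdout.
expand_wl() {
    placeholder="$1"   # placeholder rule ID used during training, e.g. 4000
    real_ids="$2"      # comma-separated core rule IDs to substitute
    # Require a non-digit after the ID and put it back with \1.
    sed "s/wl:${placeholder}\([^0-9]\)/wl:${real_ids}\1/g"
}

sample='BasicRule wl:4000 "mz:$BODY_VAR:address|$URL:/comment-article.php";'
printf '%s\n' "$sample" | expand_wl 4000 1008,1010,1011,1013,1015,1016
```

Because the function works on stdin and stdout, you can inspect the result before redirecting it into a file under /etc/nginx.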
Now, let’s look at training from a different point of view. Within this same page, if by chance (or because it makes sense) the “address” fields in your forms are named in a predictable way, like “address_foobar”, you can make training MUCH easier.
As naxsi is aware of its own dumbness, it parses argument names the same way as argument content (mainly to avoid bypasses caused by parser bugs, whether in nginx/naxsi or in the backend web server). So in this specific case, you can simply set up a rule like:
BasicRule id:4000 str:address s:BLOCK;
Then, whenever you submit the form with a POST, regardless of the variable content, this rule will be triggered if a POST variable (name or content) contains the string “address”, and naxsi will generate a whitelist like:
BasicRule wl:4000 "mz:$ARGS_VAR:address_foobar|$URL:/comment-article.php|NAME";
The “|NAME” at the end of the whitelist is quite important: it means the whitelist applies only to the variable name, not to its content. Here, we apply the same transformation as before (replace wl:4000 with IDs 1008,1010,1011,1013,1015,1016) and remove the |NAME from the match zone, so that the whitelist covers the field content instead.
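Both steps can be done in one pass. This is a minimal sketch, not a feature of the learning daemon: the function name `name_wl_to_content_wl` is hypothetical, and it assumes |NAME is the last element of the match zone, right before the closing quote, as in the whitelist above:

```shell
# Hypothetical helper: turn a name-triggered placeholder whitelist into a
# whitelist on the field content. Reads rules on stdin, writes to stdout.
name_wl_to_content_wl() {
    placeholder="$1"   # placeholder rule ID used during training, e.g. 4000
    real_ids="$2"      # comma-separated core rule IDs to substitute
    # Only lines carrying the placeholder ID are touched, so unrelated
    # whitelists keep their own |NAME suffix.
    sed "/wl:${placeholder}[^0-9]/{
s/wl:${placeholder}\([^0-9]\)/wl:${real_ids}\1/
s/|NAME\"/\"/
}"
}

sample='BasicRule wl:4000 "mz:$ARGS_VAR:address_foobar|$URL:/comment-article.php|NAME";'
printf '%s\n' "$sample" | name_wl_to_content_wl 4000 1008,1010,1011,1013,1015,1016
```

Restricting the |NAME removal to placeholder lines is a deliberate choice: a blanket `s/|NAME//` would silently widen every other name-only whitelist in the file.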
We just generated the same whitelist as before, but without having to perform any active training. This might be especially useful for people using naxsi on a frequently updated web application, as it allows you to perform “meta training” (woot, buzzword) from an initial configuration. It is also very useful if you are using a framework (WordPress, Drupal, etc.) that uses predictable argument names in forms.
I’m currently seriously exploring this idea for future releases of the learning daemon, as it might allow us to perform learning automagically.