spike & naxsi 0.55

Hello there,

In the tradition of an impressive publication rate, here is the yearly news from naxsi, and this time for the greater good.

After a mere 296 unique user requests to give naxsi a web interface, we finally got the message and started working on it.

Spike, only 3 years late ;p

We have (with 8ack.de’s consent, ofc) taken over the “spike” (spike@bitbucket) project. Spike is originally a python/flask rule builder for naxsi, and we intend to extend this base :
- Rule(s) validation & explanation
- Whitelist creation assistance
- Enhanced rule editing capabilities
- A piece of software on which we can more easily accept external contributions : it’s developed in flask/python, and does not offer a serious attack surface.

spike : rules sandbox

You can have a first look at spike here: spike-live (legacy 8ack version) or spike-github (with new features). But most importantly, we need your help: we started to open issues as feature requests on the project github (spike-requests), and we are looking for contributors :

  • You are using naxsi ? Maybe you have some great feature suggestions !
  • You know python and/or flask ? Come and join us :)

Naxsi 0.55

Naxsi 0.55 is on its way out as well, with one simple but useful addition : support for “raw body”, or the ability to match strings / regexes against the full, raw body.

This is intended as a way to apply rules to things naxsi won’t parse : java serialized objects, XXE injections, and other funny things we don’t even think about writing parsers for. It follows the same logic as other zones, just adding a zone “RAW_BODY” you can write rules for. Those rules will be applied to requests naxsi can’t parse (id:11).

RAW_BODY example rule:

MainRule "id:4242" "s:DROP" "mz:RAW_BODY" "str:CLI-connect";

$ curl -H "content-type: ratata" --data "CLI-connect" http://localhost/
405 Not Allowed

Have fun !

naxsi 0.54rc2 & Libinjection


A few days^Wweeks ago, we released NAXSI 0.54rc0, 0.54rc1 and 0.54rc2, which mostly bring support for libinjection, and fix some longstanding issues :)
Libinjection is developed by client9 (Nick Galbreath), and I like it.
Rather than working on regular expressions (which is a dead end), it actually tokenizes the string to see if it could be an SQL snippet. You can find one of the presentations of libinjection here : http://fr.slideshare.net/nickgsuperstar/libinjection-and-sqli-obfuscation-presented-at-owsap-ny
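To give a (very rough) intuition of why tokenizing beats pattern blacklists, here is a toy sketch in Python. This is NOT libinjection (which is far smarter, using real SQL token fingerprints); it just illustrates the idea with a naive tokenizer:

```python
import re

# A naive regex blacklist: easily evaded by spacing/casing tricks.
blacklist = re.compile(r"union\s+select|or\s+1=1", re.IGNORECASE)

def tokenize(s):
    # extremely naive SQL-ish tokenizer: numbers, operators, words
    return re.findall(r"'[^']*'|\d+|[=()]|[A-Za-z_]+", s)

def looks_like_sqli(s):
    # toy "fingerprint": an OR keyword plus a comparison is suspicious
    toks = [t.lower() for t in tokenize(s)]
    return "or" in toks and "=" in toks

payload = "1 oR 1 = 1"
print(bool(blacklist.search(payload)))  # blacklist misses: spacing breaks "1=1"
print(looks_like_sqli(payload))         # token view still sees OR ... = ...
```

The token stream is insensitive to the whitespace and casing games that defeat the regex.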

Libinjection, even if probably not bullet-proof (yet?), seems to be a good complement to naxsi’s core approach: I hope it will help make the learning phase easier, by helping reduce false positives and false negatives.

How is libinjection integrated into naxsi ?
Libinjection is integrated as an internal rule in naxsi, which means that – if explicitly enabled – all processed input will be passed to libinjection, as it is for other internal/core rules. If libinjection positively detects input as either SQLi or XSS, it will increase specific scores (LIBINJECTION_SQL and LIBINJECTION_XSS) by 8 at each match.
Then, it’s the user’s duty to write his own “checkrules” to allow naxsi to take action depending on libinjection’s result.
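For instance, a checkrule pair like this (a sketch – the thresholds are your call) would block on a single libinjection match:

```nginx
CheckRule "$LIBINJECTION_SQL >= 8" BLOCK;
CheckRule "$LIBINJECTION_XSS >= 8" BLOCK;
```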

As it’s still in early testing, you need to explicitly enable libinjection sql/xss detection, which can be done in several ways :
– using directives LibInjectionSql/LibInjectionXss (libinjection_sql/libinjection_xss for those who prefer nginx-style)
– using dynamic flags (ie. to be enabled/disabled at runtime depending on conditions) naxsi_flag_libinjection_sql and/or naxsi_flag_libinjection_xss

enable libinjection xss/sql by default

location / {
    SecRulesEnabled; # enable naxsi
    LibInjectionXss; # enable libinjection for XSS
    LibInjectionSql; # enable libinjection for SQLi
    BasicRule wl:17 "mz:$URL:/|$ARGS_VAR:x"; # wl libinjection_sql on VAR 'x'
    BasicRule wl:18 "mz:$URL:/|$ARGS_VAR:y"; # wl libinjection_xss on VAR 'y'
}



on-demand activation

#ifisevil, just for demo.
if ($uri ~ "_vuln.php") {
    set $naxsi_flag_libinjection_sql 1;
    set $naxsi_flag_libinjection_xss 1;
}
location / {
    SecRulesEnabled; # enable naxsi
    # libinjection disabled except for uris matching _vuln.php
}

Have fun, and feedback is always welcome :)


Measuring quality !

Hello fellow script kiddies bullies !

Over the last few weeks, after adding a lot of features to naxsi-core (DROP target, support for RX matchzones, various bugfixes and minor improvements), the time has come to focus on more testing.

In fact, the last 0.53 release was quickly tagged as “draft”, because a minor code mistake made it through our own pre-release testing (unit tests and manual review): a missing “break” in a switch statement led to incorrect behavior when disabling rules for specific zones. This means it was time again to focus on testing, because for this kind of software, a buggy release is hardly acceptable.

First, I got my hands on scan.coverity.com. Coverity is a static analysis tool, and they took a great initiative : making their product available, on a SaaS model, for OSS projects. I had already encountered Coverity in $DAYJOB and felt it’s a really good solution, so I was happy. Coverity helped me fix a few (~10) missing post-allocation null checks. I’m happy that no “serious” (read: security-wise dangerous) issues were found by Coverity; it’s rewarding for my paranoia ;)

Secondly, we worked on the existing unit tests. Naxsi unit tests rely on Test::Nginx::Socket (yet another great agentzh project !) to ensure that it behaves correctly :
* testing corner cases
* ensuring features work as expected

We also extended those tests to ensure mistakes don’t bring security risks, mostly relying on two third-party frameworks/tests :
* Ivan’s “Protocol level evasion” [1]
* SQLmap’s tamper patterns [2]

But, if release 0.53 had to be tagged back as “draft”, it means that those tests do not provide enough coverage [3] !
So, how do we evaluate code coverage ? gcov [4] and its extension lcov are the direction we chose (alternatives do exist, of course). Using these, one is able to build the software with the gcov library; then, when the unit tests run, gcov creates files containing information on which functions / branches / lines of code were actually executed.

Note: for those wishing to use gcov with their own nginx module, remember to disable the master process in nginx, else gcov will only analyse the master process ;)
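Concretely, that means something like this in the test nginx configuration (assuming you can afford a single-process setup while measuring coverage):

```nginx
master_process off;
```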

Later, with lcov, it is possible to build an HTML report, showing code coverage for each part :

lcov html report


We now achieve nearly 90% code coverage with unit tests, which feels like enough for now … (the future will tell !)
As you can see, a lot of “branches” are tagged as untested, because many checks are there to cover cases that are hard to reproduce (ie. ngx_pcalloc() failing), as well as a lot of protocol abnormalities. However, we are going to keep improving unit tests on protocol abnormalities !





naxsi : regex matchzones introduction

Regular expression matchzones are a feature we recurrently get asked about, and they’re now here !

Regular expression matchzones are the ability to use regular expressions in the “mz:” part of a whitelist, alongside the fixed strings used so far. The feature will be in the next release, but needs to be tested !

Why ?

When an application has partially unpredictable URLs or argument names, it can make naxsi setup a real pain, like :

http://mysite.com/customer/83472/edit?comment-31=it's good

As both the parameter name and the url are unpredictable, you had to :

* Create a specific location in nginx for “/customer/*” and apply whitelist to all the GET args there
* OR apply whitelist to all the GET args

Both are either complex to maintain or open a potential security risk.

Regex support in matchzones

Regular expression matchzones can be used like this :

BasicRule wl:1999 "mz:$ARGS_VAR_X:^comment-[0-9]+$|$URL_X:^/customer/[0-9]+/edit$";

Instead of building hashtables from the matchzone, naxsi will compile regular expressions and apply the whitelist if they match.

Regex-enabled matchzone keywords are :

  • $URL_X
  • $ARGS_VAR_X
  • $HEADERS_VAR_X
  • $BODY_VAR_X

The keywords and syntax of matchzones remain the same as the classical ones, and are case-insensitive as well.

The branch (0.52 rc) can be found here for testing :

svn checkout http://naxsi.googlecode.com/svn/branches/naxsi-rc-0.52 naxsi-rc-0.52


Remember that whitelists must be as restrictive as possible. A rule like this :

BasicRule wl:X "mz:$ARGS_VAR_X:comment";

Will match on any variable whose name simply contains “comment” (comment, comments, decommentizer, …), since the regex is unanchored.
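To illustrate with plain Python regexes (just mimicking what naxsi does with the matchzone, not naxsi’s actual code):

```python
import re

unanchored = re.compile("comment")            # mz:$ARGS_VAR_X:comment
anchored   = re.compile("^comment-[0-9]+$")   # mz:$ARGS_VAR_X:^comment-[0-9]+$

names = ["comment-31", "decommentizer", "commentary"]
print([n for n in names if unanchored.search(n)])  # all three names match
print([n for n in names if anchored.search(n)])    # only 'comment-31'
```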

Whitelist a specific rule in a generic variable name :

BasicRule wl:X "mz:$ARGS_VAR_X:^foo_[0-9]+_$|NAME";

Whitelist a specific rule in a very generic variable name :

BasicRule wl:X "mz:$URL_X:^/foo$|ARGS|NAME";

Escape special characters in mz :

BasicRule wl:X "mz:$ARGS_VAR_X:^foo\[[0-9]+$|NAME";

nx_intercept / nx_extract are dead, long live nx_util

Hello folks !

Happy year 2013 (3 months late only !)

Yes, it’s been a long time since I last updated this blog, but I’ve been busy on naxsi this year :)
Lots of good things happened : more users, a few conferences around (fosdem, rmll, phdays, SL etc.), more good ideas, more code, more stuff !

We had a lot of issues in the past with nx_intercept and nx_extract. What seemed like a great idea (live learning) was in fact more a source of trouble for a lot of users, as it’s too prone to configuration errors. So, for the masses (don’t feel offended, please), we decided that live learning is something we are going to stop pushing, in favor of learning from logs. You might ask why ?

1°) Live learning is error-prone:
Yes, live learning induces too many possible configuration errors : location, redirect to the live-learning daemon, database issues etc. It is annoying that 9 out of 10 questions about learning were in fact related to configuration errors. This means the software is too complicated to set up.

2°) Live learning has a high cost:
Live learning requires runtime processing. nx_intercept was developed in python with twisted, and we were not satisfied with its performance under high load. It can handle far fewer RPS than nginx+naxsi, so it’s a bottleneck. Naxsi aims at keeping performance high, so this was wrong.

3°) Live learning is the only reason we required a web “server”:
Live learning required that we provide a web server to the user. It is a source of annoyance for admins, and I do understand that : nobody wants to expose one more service, especially for this kind of purpose.

4°) Last but not least, we can achieve live-learning without live-learning:
Working directly from log files at runtime (a la tail) is possible, and seems a more robust solution in our opinion.

So, what’s coming next ?
We will pull nx_intercept/nx_extract out of the releases, in favor of nx_util, a much more powerful tool.
Here are the main characteristics of nx_util :
* Command line only
* No dependencies
* Relies on sqlite only (no live learning = no concurrency; no concurrency = no need for mysql)
* Is able to generate whitelists (as did nx_extract)
* Is able to integrate naxsi_extensive_log output into whitelist generation, making it easier to decide whether a whitelist is legitimate or not. (see https://code.google.com/p/naxsi/wiki/RuntimeModifiers)
* Is able to generate flat HTML report files (as did nx_extract)
* Supports filters for filtering incoming data

Why ?

We wanted the tools around naxsi to be production-ready, so skipping dependencies such as twisted is a good move. Twisted suffers too much from differences between distros (centOS, the 90′s called, they want their python version back!)

Going from a web interface to the command line without functionality loss makes things easier for admins : cron jobs for automated report sending etc.

What is added ?

One of the main issues with “learning” is true positives, and one of the workarounds is a filter mechanism that allows users to easily find events from legitimate users. For example, lots of websites mainly focus on people from one language. nx_util’s filters (provided with the “-f” argument) allow that; ie. -f "country = FR or country = BE" will only import events coming from French/Belgian IPs. But I will let you have a look at filters by yourself.

When it comes to day-to-day usage, you want to be able to handle a lot of files at the same time, and you don’t want to have to copy your logfiles, unzip them and so on to make reports on them. nx_util can process multiple files at the same time, including gzipped files. We also added support for regular expressions in file names.

Regarding “live” learning, nx_util can read log lines from stdin as well, so you can tail -f mylog | nx_util -o

But I let you discover all that by yourself, you can have a look at the actual manpage here :
- https://code.google.com/p/naxsi/wiki/NxUtil_man
Or directly in the SVN. Expect a release in the coming days !

As well, expect naxsi to move to a dedicated website soon; we are going to leave google code !


Abusing naxsi to fulfill your goals

Naxsi is pretty stupid, and this is something you can take advantage of, as it makes it hacker-friendly. The big deal with naxsi, as with any other positive-approach WAF, is training.

As explained earlier, you can trick naxsi into generating specific whitelists while in learning mode. For example, if you add a rule like this to the location you want to train :

BasicRule id:0 "str:123FREETEXT" "mz:ARGS|BODY|URL" "s:BLOCK";

Then, whenever you enter the string “123FREETEXT” in a form field, naxsi will generate a whitelist for id:0 (which means “allow everything”). Let’s say you have a website which allows users to publish comments, and you want users to be able to put anything in comments (because you trust your code). Then, during the learning phase, input 123FREETEXT in the user comment field.

Naxsi will thus generate a whitelist like :

BasicRule wl:0 "mz:$BODY_VAR:comment|$URL:/comment-article.php";

Once this whitelist is in place, naxsi will allow the “comment” argument of POST requests to the url “/comment-article.php” to contain anything, without blocking it.

Fine, but in some cases you might not want to allow anything, but rather a restricted subset of uncommon characters.

On this same comment page, users can also input their postal address, to get free gifts (why not!). In this “address” field, you don’t want to allow “any” character, as it’s only a postal address. No need to allow “<” or “>” for example. But you do want to allow users this subset of characters, as they are legitimate in a postal address :

' ; , . # ( )

But having to type them every time is quite annoying, and might lead to human mistakes. Instead, you can just add a new rule to your to-be-trained location :

BasicRule id:4000 "str:123address" "s:BLOCK";

Thus, inputting 123address in this very same form will make naxsi’s learning mode generate a whitelist like this :

BasicRule wl:4000 "mz:$BODY_VAR:address|$URL:/comment-article.php";

Not very useful, right ? Actually totally useless, because rule 4000 doesn’t exist in naxsi’s core rule set. But if, once you’ve finished your training, you do a little magical sed like this (on the output of the learning mode daemon) :

cat /tmp/rules.tmp | sed  "s/wl:4000/wl:1008,1010,1011,1013,1015,1016/g" > /etc/nginx/naxsi-mysite.rules

Then it makes much more sense, as we just told naxsi to allow :

1008: ;
1010: (
1011: )
1013: '
1015: ,
1016: #

Better, but still not that good.

Now, let’s look at training from a different point of view. Within this same page, if, by chance (or because it makes sense), the “address” fields in your forms are named in a predictable way, like “address_foobar”, you can make training MUCH easier.

As naxsi is aware of its own dumbness, it parses argument names the same way as argument content (mainly to avoid bypasses due to parser bugs – in nginx/naxsi or in the backend webserver). So in this specific case, you can simply set up a rule like :

BasicRule id:4000 "str:address" "s:BLOCK";

Then, whenever you POST the form, regardless of the variable content, this rule will be triggered if a POST variable (name or content) contains the string “address”, and naxsi will generate a whitelist like :

BasicRule wl:4000 "mz:$ARGS_VAR:address_foobar|$URL:/comment-article.php|NAME";

The “|NAME” at the end of the whitelist is quite important : it means that this whitelist applies only to the variable name, not its content. Here, we will apply the same transformation as before (replace wl:4000 with wl:1008,1010,1011,1013,1015,1016), and remove the |NAME from the matchzone.
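Both transformations can be done in a single pass of sed; for example, on the whitelist above:

```shell
# replace the bogus wl:4000 with the real IDs, and drop the |NAME suffix
echo 'BasicRule wl:4000 "mz:$ARGS_VAR:address_foobar|$URL:/comment-article.php|NAME";' |
  sed -e 's/wl:4000/wl:1008,1010,1011,1013,1015,1016/' -e 's/|NAME//'
```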

We just generated the same whitelist as before, but without having to perform any active training. This might be especially useful for people using naxsi on a frequently updated web application, as it allows you to perform “meta training” (woot, buzzword) from an initial configuration. It is also very useful if you are using a framework (wordpress / drupal etc.) that uses predictable argument names in forms.

I’m currently seriously exploring this idea for future releases of the learning daemon, as it might allow us to perform learning automagically.

Having fun with unit testing and sulley

While developing naxsi, code issues are always on my mind.
It’s so easy to introduce a bug, and the bigger the project grows, the harder it is to eyeball them !

So, since the very beginning of naxsi’s development, I have been creating extensive unit tests based on test::nginx+prove, an awesome perl module by the even more awesome agentzh (agentzh.org).

Test cases are basically presented like this :

=== WL TEST 1.0: [ARGS zone WhiteList] Adding a test rule in http_config (ARGS zone) and disable rule.
--- http_config
include /etc/nginx/naxsi_core.rules;
MainRule "str:foobar" "msg:foobar test pattern" "mz:ARGS" "s:$SQL:42" id:1999;
--- config
location / {
    DeniedUrl "/RequestDenied";
    CheckRule "$SQL >= 8" BLOCK;
    CheckRule "$RFI >= 8" BLOCK;
    CheckRule "$TRAVERSAL >= 4" BLOCK;
    CheckRule "$XSS >= 8" BLOCK;
    index index.html index.htm;
    BasicRule wl:1999;
}
location /RequestDenied {
    return 412;
}
--- request
GET /?a=foobar
--- error_code: 200

So, we specify a configuration file for nginx (and naxsi), a HTTP request, and an expected HTTP response code.
I use this during development, and before releases, to be sure that new features/improvement do not break existing code/features.
This has already saved me a huge number of times, but that’s not the point today ;)

Now, we come to sulley (https://github.com/OpenRCE/sulley), which is a fuzzing framework that looks pretty nice.
It basically allows you to describe a simple grammar, with fuzzable components.

The main idea here is to push naxsi/nginx’s parsers to their deepest fears, to provide reasonable assurance that there are not too many bugs in the parser. As you might know, for a WAF, parser bug = bypass (most of the time ;p).

You can generate fuzzed output with sulley like this :

from sulley import *

s_initialize("HTTP VERBS BASIC")
s_group("verbs", values=["GET", "HEAD"])
if s_block_start("body", group="verbs"):
    s_static(" ")
    s_delim(" ")
    s_delim("> ")
s_block_end("body")

Here, we are generating a fuzzed HTTP/1.0 request, with a “fixed” string “> ” at the end of the URL. The goal is to be able to predict the result of the request : as we included a forbidden character (>), we know that the request must be blocked by naxsi.
We can, from this, generate a unit test :

--- main_config
working_directory /tmp/;
worker_rlimit_core 25M;
--- http_config
include /etc/nginx/naxsi_core.rules;
--- config
location / {
    DeniedUrl "/RequestDenied";
    CheckRule "$SQL >= 8" BLOCK;
    CheckRule "$RFI >= 8" BLOCK;
    CheckRule "$TRAVERSAL >= 4" BLOCK;
    CheckRule "$XSS >= 8" BLOCK;
    index index.html index.htm;
}
location /RequestDenied {
    return 400;
}
--- raw_request eval
"GET /index.html> HTTP/1.0\r\n\r\n"
--- error_code: 400

Because of the fuzzing level applied, I cannot even be sure that the request will be a valid one, so I configured nginx to return 400 on /RequestDenied, meaning I am looking either for a “bad request” from nginx or a denied request from naxsi.

Using this global mechanism and a small script, I have generated thousands of fuzzed unit tests for naxsi.
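The “small script” part is nothing fancy; a minimal sketch (the template and names here are made up, not the actual script) could be:

```python
# Wrap each fuzzed raw request into a Test::Nginx test block expecting a 400.
TEMPLATE = '''=== FUZZ TEST {n}
--- config
location / {{
    DeniedUrl "/RequestDenied";
    location /RequestDenied {{
        return 400;
    }}
}}
--- raw_request eval
"{req}"
--- error_code: 400
'''

def make_tests(requests):
    # escape double quotes so the perl string in raw_request stays valid
    return "\n".join(
        TEMPLATE.format(n=i, req=r.replace('"', '\\"'))
        for i, r in enumerate(requests, 1)
    )

print(make_tests([r"GET /index.html> HTTP/1.0\r\n\r\n"]))
```

Feed it the sulley output, write the result to a .t file, and prove does the rest.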

Unfortunately, I did not find any bugs in naxsi using this yet, but I will probably continue to play around with aggressive unit tests :)

LearningMode tricks !


I recurrently get questions about LearningMode, and how it’s possible “to have some users in learning mode, but not the others”.

Here is my “normal” nginx/naxsi configuration :

server {
    proxy_set_header Proxy-Connection "";
    listen *:80;
    access_log /tmp/nginx_access.log;
    error_log /tmp/nginx_error.log debug;
    server_name blog.memze.ro;

    location / {
        include /etc/nginx/memzero.rules;
        proxy_set_header Host blog.memze.ro;
    }

    location /RequestDenied {
        # They told me I can be anything, so I decided to be teapot.
        return 418;
    }
}

If I want to add learning mode for a few users only, I can add another “server” block like this :

server {
    proxy_set_header Proxy-Connection "";
    listen *:80;
    access_log /tmp/learn_nginx_access.log;
    error_log /tmp/learn_nginx_error.log debug;
    server_name learning.blog.memze.ro;

    location / {
        # Require user / password to access learning mode.
        auth_basic "Restricted";
        auth_basic_user_file /var/www/user.pass;
        LearningMode; # Enable learning mode only here :p
        include /etc/nginx/memzero.rules;
        proxy_set_header Host blog.memze.ro;
    }

    location /RequestDenied {
        return 418;
    }
}

Then people requesting learning.blog.memze.ro with proper credentials will be able to use learning mode. We could instead decide to have a per-IP restriction, using nginx’s allow/deny directives.
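The per-IP variant would look roughly like this (the address range is obviously just an example):

```nginx
location / {
    allow 192.168.0.0/24; # office range, for instance
    deny all;
    LearningMode;
    include /etc/nginx/memzero.rules;
    proxy_set_header Host blog.memze.ro;
}
```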

Naxsi License

Just a few words,

This morning I was presenting NAXSI at JSSI 2012, and its experimentation on the Charlie Hebdo website during the attacks of November 2011.

I had several people asking whether Naxsi will stay open source, and whether we are planning to move it to a proprietary license some day.

First of all, it’s important to state that Naxsi is MY software (Thibault Koechlin). NBS System (my company – and the main reason I coded Naxsi) is helping the project (a LOT) and using it. They have no right to decide on a license change. This decision is to be made by me, and me only.

And even if that weren’t the case, NBS System is a 100% Open Source company. We have NetBSD and FreeBSD committers working here, and we fueled the OpenVPN project a lot when it was younger; in a word, we are PRO OSS, so no chance of this happening ;)

So NO, Naxsi will never have a strongly different license. It will stay in a GPL V2 or equivalent format (maybe V3/V4 one day if this makes sense), but the grounds will remain the same : free of charge & open source.

Of course, I & NBS are professionals, willing to earn money to live and to have the freedom to develop good software & offers. Hence, we will propose consulting / training and customization of Naxsi. If anyone wishes to make commercial use of Naxsi (like including it in an appliance), we may grant licenses to exploit the product this way.

Naxsi, new learning mode and other stuff


I’m pretty excited, as I’ve started to work on naxsi’s new learning daemon. Needless to say, the current one (and even the previous one) is just a PoC compared to the next version ;)

The new learning daemon will rely on twisted, a python event-driven networking engine, capable of offering a real http daemon for learning mode; its name will be nx_intercept.

As well, we’ll switch from sqlite to MySQL. Both changes will allow nx_intercept to scale. Naxsi is designed with performance in mind; there’s no reason for nx_intercept to be a bottleneck !

nx_intercept will be way more useful than the current learning daemon, because you can use it in learning mode, but also in production mode. While learning mode will work as it used to, in production mode it can easily provide statistics about who the bad guys are (the ones getting blocked by naxsi), how often, etc., without having to parse logs ;)

Being able to use it in production mode provides some benefits. Let’s say you are facing some serious attacks and you’d like to investigate some blocked requests : you can turn nx_intercept back to learning mode, where it will log blocked http requests for manual review. It is also very useful if – let’s say – a new version of the website has been deployed and it created some new false positives you want to investigate :)

Now comes the “how will I use nx_intercept”. First of all, exit the crappy web interface : I suck at HTML and I haven’t found a webdev willing to contribute yet. nx_intercept will come with another http daemon, nx_extract, that does exactly the opposite of what nx_intercept does : it extracts whitelists and exceptions from the database. nx_intercept writes exceptions (and potentially the associated http requests); nx_extract extracts them and transforms them into whitelists.