Olivier's Blog: Playing with FreeBSD packet filter state table limits

Objective

I have a very specific need: selecting a firewall to be installed between a large number of monitoring servers and a big network (about one million devices).
This means lots of short SNMP (UDP-based) flows: I need a firewall able to manage 4 million state table entries, but I don't need high throughput (a few gigabits per second is enough).
A quick look at the datasheets on the market:

Juniper SRX 3600: 6 million concurrent sessions maximum and up to 65Gbps (marketing bullshit: giving a value in Gbps is useless)
Cisco ASA 5585-X: 4 million concurrent sessions maximum and up to 15Gbps (same marketing bullshit unit as Juniper; the marketing department seems stronger than engineering)

I'm not looking for such high throughput, so how does performance relate to the maximum number of firewall states on a simple x86 server?

I will run my benches on a small Netgate RCC-VE 4860 (4-core Atom C2558, 8GB RAM) under FreeBSD 10.3: I reboot it between each bench and run a lot of benches, so I need equipment with a short BIOS POST time.
My performance unit will be packets per second with the smallest packet size (64-byte Ethernet frames) generated at maximum line rate (1.48Mpps on a Gigabit interface, 14.8Mpps on a 10 Gigabit interface).
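
The line-rate figures above can be checked with a little arithmetic. This is only a sketch: the function name is mine, and it assumes the usual 20 bytes of per-frame overhead on the wire (preamble, start-of-frame delimiter and inter-frame gap).

```sh
#!/bin/sh
# Theoretical maximum frame rate on an Ethernet link.
# Each frame costs 20 extra bytes on the wire:
# 7 bytes preamble + 1 byte start-of-frame delimiter + 12 bytes inter-frame gap.
line_rate_pps() {
    link_bps=$1      # link speed in bits per second
    frame_bytes=$2   # Ethernet frame size, FCS included (64 is the minimum)
    echo $(( link_bps / ((frame_bytes + 20) * 8) ))
}

line_rate_pps 1000000000 64    # Gigabit: 1488095
line_rate_pps 10000000000 64   # 10 Gigabit: 14880952
```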

Performance with default pf parameters

By default pf uses these limits and hash sizes:
[root@DUT]~# pfctl -sm
states hard limit 10000
src-nodes hard limit 10000
frags hard limit 5000
table-entries hard limit 200000
[root@DUT]~# sysctl net.pf
net.pf.source_nodes_hashsize: 8192
net.pf.states_hashsize: 32768

This means it manages at most 10K sessions, with a pf states hash size of 32768 (no idea of the unit).
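
For scripting, the state limit can be pulled out of the `pfctl -sm` output. A minimal sketch, assuming the output format shown above; `states_limit` is a hypothetical helper name, not something from pfctl or BSDRP.

```sh
#!/bin/sh
# Print the state table hard limit from `pfctl -sm` output read on stdin.
states_limit() {
    awk '$1 == "states" && $2 == "hard" && $3 == "limit" { print $4; exit }'
}

# On the firewall this would be: pfctl -sm | states_limit
# Here the function is fed the lines shown above instead:
states_limit <<'EOF'
states hard limit 10000
src-nodes hard limit 10000
frags hard limit 5000
table-entries hard limit 200000
EOF
```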

A very simple pf.conf will be used:
[root@DUT]~# cat /etc/pf.conf
set skip on lo0
pass

I will start by benching the impact of the number of states on pf performance: from 128 to 9800 states.
For one unidirectional UDP flow pf will create 2 session entries (one for each direction).
As an example, with a packet generator like netmap's pkt-gen, we can ask it to generate a range of 70 source IP addresses and 70 destination addresses: this gives a total of 70*70=4900 unidirectional UDP flows (for 9800 pf states).
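
The same arithmetic as a small helper, useful for sizing other address ranges. This is a sketch with a made-up function name; it assumes every source address talks to every destination address and that pf keeps two states per unidirectional flow, as described above.

```sh
#!/bin/sh
# Number of pf states expected for a given source/destination address range.
# Assumption: src_count * dst_count flows, two pf states per flow.
expected_states() {
    src_count=$1
    dst_count=$2
    echo $(( src_count * dst_count * 2 ))
}

expected_states 70 70   # 9800
expected_states 71 71   # 10082
```

With 70 addresses on each side the result stays just under the default hard limit of 10000 states, while 71 on each side would already exceed it.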

From theory to practice with pkt-gen:
pkt-gen -i ncxl0 -f tx -l 60 -d 198.19.10.1:2000-198.19.10.70 -D 00:07:43:2e:e5:90 -s 198.18.10.1:2000-198.18.10.70 -w 4
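
To vary the number of states, the command above can be generated from a single parameter. A minimal sketch: `pktgen_cmd` is a hypothetical helper, it only varies the last octet of both ranges, and it reuses the interface, MAC address and port from the command above as-is.

```sh
#!/bin/sh
# Print a pkt-gen command line for an n x n address range.
# -l 60 is the frame size without the 4-byte CRC, i.e. 64 bytes on the wire.
# n must stay between 1 and 254 because only the last octet varies.
pktgen_cmd() {
    n=$1
    echo "pkt-gen -i ncxl0 -f tx -l 60" \
         "-d 198.19.10.1:2000-198.19.10.${n}" \
         "-D 00:07:43:2e:e5:90" \
         "-s 198.18.10.1:2000-198.18.10.${n} -w 4"
}

pktgen_cmd 70
```

Assuming square ranges like this one, n = 8, 16 and 32 would give 128, 512 and 2048 states. The pkt-gen configuration files actually used for the benches may be built differently.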

During this load, we check the number of current states:

[root@DUT]~# pfctl -si
Status: Enabled for 0 days 00:00:19 Debug: Urgent

State Table Total Rate
current entries 9800
searches 13777196 725115.6/s
inserts 9800 515.8/s
removals 0 0.0/s
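
In a bench script this check can be automated by extracting the counter from the output. This is a sketch, assuming the `pfctl -si` layout shown above; `current_states` is a hypothetical helper name.

```sh
#!/bin/sh
# Extract the "current entries" counter of the state table
# from `pfctl -si` output read on stdin (first match only).
current_states() {
    awk '$1 == "current" && $2 == "entries" { print $3; exit }'
}

# On the firewall this would be: pfctl -si | current_states
# Here the function is fed the lines shown above instead:
current_states <<'EOF'
State Table Total Rate
current entries 9800
searches 13777196 725115.6/s
inserts 9800 515.8/s
removals 0 0.0/s
EOF
```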

Great: theory matches practice. Now I can generate multiple pkt-gen configurations (128, 512, 2048, 9800 states) for my bench script and run a first session:

olivier@manager:~/netbenches/Atom_C2558_4Cores-Intel_i350 % ~/netbenches/scripts/bench-lab.sh -f bench-lab-2nodes.config -n 10 -p ../pktgen.configs/FW-states-10k/ -d pf-sessions/results/fbsd10.3/

BSDRP automatized upgrade/configuration-sets/benchs script

This script will start 40 bench tests using:

- Multiples images to test: no

- Multiples configuration-sets to test: no

- Multiples pkt-gen configuration to test: yes

- Number of iteration for each set: 10

- Results dir: pf-sessions/results/fbsd10.3/

Do you want to continue ? (y/n): y

Testing ICMP connectivity to each devices:

192.168.1.3...OK

192.168.1.9...OK

Testing SSH connectivity with key to each devices:

192.168.1.3...OK

192.168.1.9...OK

Starting the benchs

Start configuration set: pf-statefull

Uploading cfg pf-session/config//pf-statefull