
cardigliano at ntop
Jul 2, 2012, 9:56 AM
Post #8 of 8
Re: Trying to trace poor performance on PF_RING enabled box
Jesse,

Thank you, we now have enough info to try to reproduce the issue. We will let you know as soon as we have news.

Regards
Alfredo

On Jul 2, 2012, at 6:42 PM, Jesse Bowling wrote:

> I confirmed that running an older PF_RING version with the latest snort does not exhibit this behavior after 30 minutes of testing, and performance is much, much improved and in line with my expectations. I used:
>
> svn co -r {2011-11-28} https://svn.ntop.org/svn/ntop/trunk/PF_RING/ PF_RING-2011-11-28
>
> It would appear that while the issue seems to (thus far) only appear when using snort, it is an issue within PF_RING. I do NOT know that this is the latest version that doesn't exhibit this behavior; I chose this version because it was known not to have this issue. There may be later versions that also do not have this problem.
>
> Although I'm over my immediate hurdle, I would be happy to contribute any debug info that might help.
>
> Thanks,
>
> Jesse
>
> On Mon, Jul 2, 2012 at 12:04 PM, Jesse Bowling <jessebowling [at] gmail> wrote:
> I forgot to include some of the OS info:
>
> Linux sensor-test.american.edu 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux
> [bowling [at] sensor-tes libpcap]$ cat /etc/redhat-release
> Red Hat Enterprise Linux Server release 6.3 (Santiago)
>
> Cheers,
>
> Jesse
>
> On Mon, Jul 2, 2012 at 9:06 AM, Jesse Bowling <jessebowling [at] gmail> wrote:
> Thank you for your interest Alfredo.
>
> PF_RING was built with no additional arguments; that is to say that, while doing 'make' from the base of the build fails to compile the ixgbe drivers, I stepped into each directory and did a make && sudo make install for: kernel, driver/PF_RING_aware/e1000e/e1000e-2.0.0.1/src, lib, libpcap, userland/tcpdump.
>
> After that, install 0.62 DAQ from snort, back to PF_RING/userland/snort/pf-ring-daq-module, then libdnet, finally snort. Commands and options included below.
>
> I've replicated this issue with stable and SVN versions of PF_RING, as well as stable snort and RC snort (all combos). The length of time for the issue to show up depends on the amount of traffic. I found that at 300+ Mb/s of traffic it takes 10-15 minutes, while at 200 Mb/s it took 25 minutes for the packet drop issue to surface. Watching top for the running processes shows that the amount of 'sy' time increases slowly during that time; around the time 'sy' is consistently reported in the 30% range, PF_RING and snort start reporting increasingly large drops.
>
> Most of the snort configuration is set in the conf files, but the master one looks like:
>
> config daq:pfring
> config daq_dir:/usr/local/lib/daq
> config daq_mode:passive
> config daq_var:cluserid=44
>
> I use some other variables in the snort config to ensure that alert data is written to its own directory per instance, etc. Snort is then started with:
>
> snort -c /etc/snort/snort.inst_${INST}.conf --pid-path /nsm/snort/inst_${INST} -D
>
> I am aware of one other user experiencing this with much better hardware and using the ixgbe drivers, so I suspect the issue is either with snort or PF_RING itself. While I'm happy to include this info so you can replicate it, is there anything you can point me to so I can help narrow down where the issue is?
>
> Thank you again for the followup on this, and please do let me know on- or off-list what I can do to assist troubleshooting this issue.
>
> Configs:
>
> PF_RING
> cd /usr/local/src
> svn co https://svn.ntop.org/svn/ntop/trunk/PF_RING/
> chown -R ${non-root-user} PF_RING
> su - ${non-root-user}
> cd /usr/local/src/PF_RING
> cd kernel
> make
> sudo make install
> cd /usr/local/src/PF_RING/drivers/PF_RING_aware/intel/e1000e/e1000e-2.0.0/src
> make
> sudo make install
> cd /usr/local/src/PF_RING/userland
> make
> cd /usr/local/src/PF_RING/userland/lib
> sudo make install
> cd /usr/local/src/PF_RING/userland/libpcap
> sudo make install
> cd /usr/local/src/PF_RING/userland/tcpdump-4.1.1
> sudo make install
> vi /etc/modprobe.d/pf_ring.conf
> options pf_ring transparent_mode=2 enable_tx_capture=0 min_num_slots=16384
> vi /etc/ld.so.conf.d/pfring.conf
> /usr/local/lib
> rm /etc/ld.so.cache
> ldconfig
> At this point (most of) what we need for PF_RING should be ready to go
> Let's build the DAQ for snort!
> Download latest from snort site to /usr/local/src, and tar -zxf
> LIBS="-lpcap -lpfring" LDFLAGS="-lpcap -lpfring" ./configure --with-libpcap-includes=/usr/local/include --with-libpcap-libraries=/usr/local/lib
> make
> make install
> Add PF_RING daq
> cd /usr/local/src/PF_RING/userland/snort/pfring-daq-module
> HOME=/usr/local/src ./configure
> make
> sudo make install
> Build libdnet
> wget http://libdnet.googlecode.com/files/libdnet-1.12.tgz
> tar xvfz libdnet-1.12.tgz
> ./configure
> make
> sudo make install
> ln -s /usr/local/lib/libdnet.1.0.1 /usr/lib64/libdnet.1 #Stupid
> Finally: Build snort!
> wget -O ./snort.src.tar.gz http://www.snort.org/downloads/1631
> tar -zxf snort.src.tar.gz
> LIBS="-lpcap -lpfring" LDFLAGS="-lpcap -lpfring" ./configure --with-libpcap-includes=/usr/local/include --with-libpcap-libraries=/usr/local/lib --with-dnet-includes=/usr/local/include --with-dnet-libraries=/usr/local/lib --disable-ipv6 --disable-active-response --disable-react
> make
> sudo make install
> Make sure this gives you output that includes pfring:
> snort --daq-dir=/usr/local/lib/daq --daq-list
>
> On Sat, Jun 30, 2012 at 8:46 AM, Alfredo Cardigliano <cardigliano [at] ntop> wrote:
> Jesse
> Can you please give us more info about your configuration? Are you using PF_RING-aware or DNA drivers?
> Please recap and give us the exact configuration and the commands you are using in order to help us reproduce the issue.
>
> Best Regards
> Alfredo
>
> On Jun 29, 2012, at 7:14 PM, Jesse Bowling wrote:
>
>> I believe I've narrowed the issue to a drain on system resources, evidenced by an increasing amount of CPU % spent in the kernel rather than handling user processes. Adding more slots to PF_RING does not seem to affect how long the snort processes can run before system% climbs into the 30-40% range and packets start getting dropped. This usually takes between 5 and 10 minutes to show up.
>>
>> Any help for profiling this more specifically to see what's causing this? I'd like to start by narrowing down if this is in snort, DAQ, PF_RING, etc. Anyone got a good test scenario that could lead me in the correct direction?
>>
>> Thanks,
>>
>> Jesse
>>
>> On Fri, Jun 29, 2012 at 10:04 AM, Jesse Bowling <jessebowling [at] gmail> wrote:
>> As a follow up, today I experienced this:
>>
>> [root [at] sensor-tes ~]# pkill snort
>> [root [at] sensor-tes ~]#
>> Message from syslogd [at] sensor-tes at Jun 29 09:56:34 ...
>> kernel:Uhhuh. NMI received for unknown reason b1 on CPU 0.
>>
>> Message from syslogd [at] sensor-tes at Jun 29 09:56:34 ...
>> kernel:You have some hardware problem, likely on the PCI bus.
>>
>> Message from syslogd [at] sensor-tes at Jun 29 09:56:34 ...
>> kernel:Dazed and confused, but trying to continue
>>
>> Unfortunately, the kernel was not able to continue and a reset was required. This was with the DNA driver, and two instances of snort running.
>>
>> Cheers,
>>
>> Jesse
>>
>> On Thu, Jun 28, 2012 at 11:00 PM, Jesse Bowling <jessebowling [at] gmail> wrote:
>> Hi,
>>
>> I've built PF_RING from svn (ver 5519), and installed pf_ring, the e1000e 2.0.0.1 driver, pfring-daq-module, and snort 2.9.2.3 with DAQ 0.62.
>>
>> The hardware is very modest (dual-core Intel(R) Xeon(R) CPU L5240 @ 3.00GHz, 1920288 kB of RAM, Linux sensor-test 2.6.32-279.el6.x86_64 #1 SMP Wed Jun 13 18:24:36 EDT 2012 x86_64 x86_64 x86_64 GNU/Linux, Ethernet controller: Intel Corporation 82571EB Gigabit Ethernet Controller (rev 06)), but so is the network traffic level I'm testing against (~275 Mb/s).
>>
>> I've loaded PF_RING with 'options pf_ring transparent_mode=2 enable_tx_capture=0 min_num_slots=16384' and snort is running 7353 rules.
>>
>> What I'm experiencing is that PF_RING starts reporting dropped packets after only about a minute, and the percentage of dropped packets continues to grow over time. Memory also continues to be used. This is my first 5.x build; my previous builds in the 4.x series, while on much better hardware, gave significantly better performance.
>>
>> I would like to get the group's advice on how I can profile exactly which piece of the process is causing the drops (PF_RING, driver, NIC, kernel, snort, etc).
>>
>> Can anyone shed any light on whether snort is reporting the same drops that PF_RING is reporting, or should these drops be considered additional (i.e., PF_RING loses 10%, and snort loses another 15% of the remaining 90%)?
>> I'm leaning towards the view that the stats reported by snort with 'kill -USR1' are essentially the same as the PF_RING stats, only slightly different due to timing, as perfmonitor output is at markedly different levels.
>>
>> For instance:
>>
>> Kill -USR1:
>> Jun 28 22:20:43 sensor-test snort[4631]: Received: 24907256
>> Jun 28 22:20:43 sensor-test snort[4631]: Analyzed: 24907256 (100.000%)
>> Jun 28 22:20:43 sensor-test snort[4631]: Dropped: 5436551 ( 17.917%)
>>
>> Perfmonitor:
>> Jun 28 22:21:09 ,27.636 (drop percentage),189.096 (traffic in Mb/s)
>>
>> Can anyone give me advice on this, or point me in the right direction?
>>
>> Thanks,
>>
>> Jesse
>>
>>
>> Gory details of configuration and perf data, for the interested:
>>
>> ethtool -g eth2
>> Ring parameters for eth2:
>> Pre-set maximums:
>> RX: 4096
>> RX Mini: 0
>> RX Jumbo: 0
>> TX: 4096
>> Current hardware settings:
>> RX: 4096
>> RX Mini: 0
>> RX Jumbo: 0
>> TX: 256
>>
>> # ethtool -k eth2
>> Offload parameters for eth2:
>> rx-checksumming: off
>> tx-checksumming: off
>> scatter-gather: off
>> tcp-segmentation-offload: off
>> udp-fragmentation-offload: off
>> generic-segmentation-offload: off
>> generic-receive-offload: off
>> large-receive-offload: off
>>
>> # ethtool -c eth2
>> Coalesce parameters for eth2:
>> Adaptive RX: off TX: off
>> stats-block-usecs: 0
>> sample-interval: 0
>> pkt-rate-low: 0
>> pkt-rate-high: 0
>>
>> rx-usecs: 1000
>> rx-frames: 0
>> rx-usecs-irq: 0
>> rx-frames-irq: 0
>>
>> tx-usecs: 0
>> tx-frames: 0
>> tx-usecs-irq: 0
>> tx-frames-irq: 0
>>
>> rx-usecs-low: 0
>> rx-frame-low: 0
>> tx-usecs-low: 0
>> tx-frame-low: 0
>>
>> rx-usecs-high: 0
>> rx-frame-high: 0
>> tx-usecs-high: 0
>> tx-frame-high: 0
>>
>> PF_RING reports:
>>
>> Stats for /proc/net/pf_ring/4631-eth2.6
>> Total: 30374427
>> Lost: 5436516
>> Percentage: 17.89833269941190989300
>>
>> /proc/meminfo:
>> MemTotal: 1920288 kB
>> MemFree: 453568 kB
>>
>> Ifconfig eth2|grep RX
>> RX packets:30374844 errors:0 dropped:12 overruns:0 frame:0
>> RX bytes:26681981935 (24.8 GiB) TX bytes:0 (0.0 b)
>>
>> Snort reports:
>>
>> Jun 28 22:08:18 sensor-test snort[4631]: Packet I/O Totals:
>> Jun 28 22:08:18 sensor-test snort[4631]: Received: 246089
>> Jun 28 22:08:18 sensor-test snort[4631]: Analyzed: 246089 (100.000%)
>> Jun 28 22:08:18 sensor-test snort[4631]: Dropped: 0 ( 0.000%)
>> Jun 28 22:08:18 sensor-test snort[4631]: Filtered: 0 ( 0.000%)
>> Jun 28 22:08:18 sensor-test snort[4631]: Outstanding: 0 ( 0.000%)
>> --
>> Jun 28 22:08:54 sensor-test snort[4631]: Packet I/O Totals:
>> Jun 28 22:08:54 sensor-test snort[4631]: Received: 1548112
>> Jun 28 22:08:54 sensor-test snort[4631]: Analyzed: 1548112 (100.000%)
>> Jun 28 22:08:54 sensor-test snort[4631]: Dropped: 29673 ( 1.881%)
>> Jun 28 22:08:54 sensor-test snort[4631]: Filtered: 0 ( 0.000%)
>> Jun 28 22:08:54 sensor-test snort[4631]: Outstanding: 0 ( 0.000%)
>> --
>> Jun 28 22:09:46 sensor-test snort[4631]: Packet I/O Totals:
>> Jun 28 22:09:46 sensor-test snort[4631]: Received: 3278152
>> Jun 28 22:09:46 sensor-test snort[4631]: Analyzed: 3278152 (100.000%)
>> Jun 28 22:09:46 sensor-test snort[4631]: Dropped: 52990 ( 1.591%)
>> Jun 28 22:09:46 sensor-test snort[4631]: Filtered: 0 ( 0.000%)
>> Jun 28 22:09:46 sensor-test snort[4631]: Outstanding: 0 ( 0.000%)
>> --
>> Jun 28 22:10:33 sensor-test snort[4631]: Packet I/O Totals:
>> Jun 28 22:10:33 sensor-test snort[4631]: Received: 4902895
>> Jun 28 22:10:33 sensor-test snort[4631]: Analyzed: 4902895 (100.000%)
>> Jun 28 22:10:33 sensor-test snort[4631]: Dropped: 386163 ( 7.301%)
>> Jun 28 22:10:33 sensor-test snort[4631]: Filtered: 0 ( 0.000%)
>> Jun 28 22:10:33 sensor-test snort[4631]: Outstanding: 0 ( 0.000%)
>> --
>> Jun 28 22:13:36 sensor-test snort[4631]: Packet I/O Totals:
>> Jun 28 22:13:36 sensor-test snort[4631]: Received: 11162577
>> Jun 28 22:13:36 sensor-test snort[4631]: Analyzed: 11162577 (100.000%)
>> Jun 28 22:13:36 sensor-test snort[4631]: Dropped: 1786910 ( 13.799%)
>> Jun 28 22:13:36 sensor-test snort[4631]: Filtered: 0 ( 0.000%)
>> Jun 28 22:13:36 sensor-test snort[4631]: Outstanding: 0 ( 0.000%)
>> --
>> Jun 28 22:20:43 sensor-test snort[4631]: Packet I/O Totals:
>> Jun 28 22:20:43 sensor-test snort[4631]: Received: 24907256
>> Jun 28 22:20:43 sensor-test snort[4631]: Analyzed: 24907256 (100.000%)
>> Jun 28 22:20:43 sensor-test snort[4631]: Dropped: 5436551 ( 17.917%)
>> Jun 28 22:20:43 sensor-test snort[4631]: Filtered: 0 ( 0.000%)
>> Jun 28 22:20:43 sensor-test snort[4631]: Outstanding: 0 ( 0.000%)
>>
>> --
>> Jesse Bowling
>>
>> _______________________________________________
>> Ntop-misc mailing list
>> Ntop-misc [at] listgateway
>> http://listgateway.unipi.it/mailman/listinfo/ntop-misc
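On the question raised in the first message of the thread (are snort's drops the same packets PF_RING reports as lost, or additional ones?), the quoted counters can be checked with a little arithmetic. The following is only a sketch using the numbers posted above. The function names are hypothetical, and the denominator used for snort's percentage (received plus dropped) is an assumption inferred from the fact that it reproduces the percentages printed in the quoted log; it is not taken from the snort source.

```python
# Counters copied from the 22:20:43 snapshot quoted in the thread.
PF_RING_TOTAL = 30374427
PF_RING_LOST = 5436516
SNORT_RECEIVED = 24907256
SNORT_DROPPED = 5436551

# (time, received, dropped) from the "Packet I/O Totals" blocks in the thread.
SAMPLES = [
    ("22:08:18", 246089, 0),
    ("22:08:54", 1548112, 29673),
    ("22:09:46", 3278152, 52990),
    ("22:10:33", 4902895, 386163),
    ("22:13:36", 11162577, 1786910),
    ("22:20:43", 24907256, 5436551),
]


def pf_ring_drop_pct(total, lost):
    """Lost packets as a share of everything PF_RING counted."""
    return 100.0 * lost / total


def snort_drop_pct(received, dropped):
    """Dropped packets as a share of received plus dropped.

    Assumption: this denominator is inferred from the quoted log output,
    not read from snort's code.
    """
    return 100.0 * dropped / (received + dropped)


def interval_drop_pcts(samples):
    """Drop percentage within each interval between consecutive samples.

    The logged percentages are cumulative since start-up, so they lag
    behind the current drop rate; differencing the counters removes that lag.
    """
    rates = []
    for (_, r0, d0), (t1, r1, d1) in zip(samples, samples[1:]):
        received, dropped = r1 - r0, d1 - d0
        rates.append((t1, 100.0 * dropped / (received + dropped)))
    return rates


if __name__ == "__main__":
    print("PF_RING lost %%: %.3f" % pf_ring_drop_pct(PF_RING_TOTAL, PF_RING_LOST))
    print("snort dropped %%: %.3f" % snort_drop_pct(SNORT_RECEIVED, SNORT_DROPPED))
    print("counter difference: %d packets" % (SNORT_DROPPED - PF_RING_LOST))
    for when, pct in interval_drop_pcts(SAMPLES):
        print("interval ending %s: %.2f%% dropped" % (when, pct))
```

With these inputs the two drop counters differ by only 35 packets out of more than 5.4 million, which is consistent with the reading that snort is relaying PF_RING's loss counter rather than reporting separate losses. The per-interval figures also suggest the drop rate in the last interval (roughly 21%) is higher than the cumulative 17.9% shown in the log.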
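The thread describes watching the 'sy' figure in top creep up until drops begin. One way to log that trend without sitting in front of top is to sample the aggregate "cpu" line of /proc/stat and compute the share of time in the system field between samples. This is a minimal sketch under stated assumptions: the helper names and the 5-second interval are hypothetical, and it only measures overall kernel time, so it does not say which component (driver, PF_RING, DAQ) is responsible.

```python
import time


def read_cpu_fields(stat_text):
    """Return the numeric fields of the aggregate 'cpu' line of /proc/stat."""
    for line in stat_text.splitlines():
        parts = line.split()
        if parts and parts[0] == "cpu":
            return [int(p) for p in parts[1:]]
    raise ValueError("no aggregate 'cpu' line found")


def system_pct(before, after):
    """Percentage of elapsed CPU time spent in the 'system' field.

    Field order is user, nice, system, idle, iowait, irq, softirq, steal.
    Only these first eight fields are summed, because guest time (when
    present) is already included in user time.
    """
    deltas = [b - a for a, b in zip(before[:8], after[:8])]
    total = sum(deltas)
    return 100.0 * deltas[2] / total if total else 0.0


def watch(samples=3, interval_s=5.0, path="/proc/stat"):
    """Print the system-time percentage for a few consecutive intervals."""
    with open(path) as f:
        previous = read_cpu_fields(f.read())
    for _ in range(samples):
        time.sleep(interval_s)
        with open(path) as f:
            current = read_cpu_fields(f.read())
        print("%s sy=%.1f%%" % (time.strftime("%H:%M:%S"), system_pct(previous, current)))
        previous = current
```

Calling watch() on the sensor while snort is running, and logging the output next to the snort and PF_RING drop counters, would show whether the drops really track the rise in system time described above.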