Showing posts with label Sguil. Show all posts

Friday, May 16, 2008

Alternative PCAP subsystems for Sguil

If you read my previous post on pcap indexing, you'll know that I've been playing around with some alternatives to the packet capture and retrieval subsystem in Sguil. I'm happy to announce that I've just committed two replacement subsystems to Sguil's CVS HEAD, one for daemonlogger and one for SANCP.

The daemonlogger subsystem should be fairly stable, as I've been running it in production for some time. It's basically a direct replacement for the snort packet logging instance. It's probably a bit more efficient, and has a smaller memory footprint, but it's still substantially similar.

The SANCP system, on the other hand, is very experimental. It uses the pcap indexing functions of SANCP 1.6.2C6 (and above) to dramatically speed up the retrieval of pcap data from huge captures. If your capture files are routinely over 2GB or 3GB, you might benefit from this. However, it does come at a cost, which is that the index files can consume 25% - 35% more disk space than the pcaps alone. Break out the RAID!

Of course, these are simply alternatives to the existing Snort-based packet logging system. That's not going anywhere; we're simply offering choices for advanced users.

Also, even though I've been a member of the Sguil project for some time now, these are my first commits into the source tree. I'm officially a Sguil developer!

Thursday, April 03, 2008

PCAP Indexing

There's been some talk inside the Sguil project lately of improving the performance of PCAP data retrieval. That is, when you're trying to fetch the PCAP data for a single session out of a file that contains all the data your sensor saw during that time period, things can get pretty slow.

Our current method involves simply identifying which file probably (note the strategic use of that word) contains the data you're looking for, then using tcpdump with a BPF filter to find it. This usually works well, but it's often very slow, especially if the PCAP file you're searching through is large, say a few GB.
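
As a rough illustration of the tcpdump-plus-BPF approach described above, here is a minimal Python sketch that builds such a command line. It is not Sguil's actual retrieval code; the function name, the example addresses and the exact filter expression are assumptions made for illustration. Only tcpdump's standard -r (read file) and -w (write file) options are used.

```python
import shlex


def build_tcpdump_command(pcap_path, out_path, src_ip, src_port, dst_ip, dst_port):
    """Build a tcpdump command that copies one session out of a large capture.

    Hypothetical helper: the filter below is one plausible way to select a
    session, not necessarily the expression Sguil itself generates.
    """
    # BPF expression matching traffic between the two endpoints on both ports.
    bpf = f"host {src_ip} and host {dst_ip} and port {src_port} and port {dst_port}"
    # -r reads an existing capture file, -w writes matching packets to a new one.
    return ["tcpdump", "-r", pcap_path, "-w", out_path, bpf]


cmd = build_tcpdump_command("snort.log.12347263", "session.pcap",
                            "10.0.0.5", 34567, "192.0.2.10", 80)
print(shlex.join(cmd))
```

Because tcpdump has to read the capture from start to finish to apply the filter, the run time grows with the size of the file, which is the slowness the post is complaining about.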

We've discussed a few approaches we could take to improve the performance of these retrievals. One promising way involves creating an index of the sessions inside each PCAP file. It turns out that the tool we're using to collect network session data, SANCP, is actually a pretty decent packet logger, even though we're not using it in that manner. The newest 1.6.2 release candidate includes support for PCAP indexing, so I thought I'd take it for a spin and see just what kind of performance improvement we could expect, if any.

My good friend Geek00L recently blogged his experience with SANCP's indexing feature, but he didn't get into the performance; he just wrote about how it worked. You probably should read that, as well as the official docs, if you're interested in this subject.

Another thing I was interested in was figuring out how to use SANCP to index existing PCAP files, as Sguil is already capturing these with Snort. By default, SANCP will only index PCAP files that it creates, usually by sniffing the network interface. I prefer to keep my data collection processes separate if I can, and the existing PCAP collection is working well enough for now. So my first goal was to see if I could convince SANCP to create an index without creating an entirely new PCAP file.

It turns out that this is possible, though kinda kludgy. I was able to use the following sancp-pcapindex.conf file to create an index:


default index log
default pcap filename /dev/null
format index delimiter=| sancp_id,output_filename,\
start_pos,stop_pos,src_ip_dotted,\
dst_ip_dotted,ip_proto,src_port,dst_port

This is pretty close to the version listed in SANCP's docs, except that I added the "default pcap filename /dev/null" line in there. So SANCP will still create a PCAP file, but it'll be written to /dev/null so I'll never see it.

I also had to use two additional command-line options to turn off the "realtime" and "stats" output SANCP likes to generate by default. So in the end, here's the command line I used:

sancp -r snort.log.12347263 -c sancp-pcapindex.conf -d sancp-output -R -S
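
If you have a whole directory of existing captures to index, the same command can be wrapped in a short script. This is only a sketch: the helper names, the "snort.log.*" file pattern and the dry-run default are assumptions, and the sancp options are exactly the ones shown in the command line above. Whether repeated runs into the same output directory overwrite each other's index is not covered here, so check that before pointing it at real data.

```python
import subprocess
from pathlib import Path


def sancp_index_command(pcap_file, conf="sancp-pcapindex.conf", out_dir="sancp-output"):
    """Return the sancp command line from the post for one capture file."""
    # Same options as above: -r capture to read, -c config file,
    # -d output directory, -R and -S to silence realtime and stats output.
    return ["sancp", "-r", str(pcap_file), "-c", conf, "-d", out_dir, "-R", "-S"]


def index_all(pcap_dir, pattern="snort.log.*", dry_run=True):
    """Index every capture in pcap_dir matching pattern (hypothetical helper)."""
    commands = [sancp_index_command(p) for p in sorted(Path(pcap_dir).glob(pattern))]
    for cmd in commands:
        if dry_run:
            # Only show what would be executed.
            print(" ".join(cmd))
        else:
            subprocess.run(cmd, check=True)
    return commands
```

Calling index_all("/path/to/pcaps") prints the commands without running them; pass dry_run=False once the output looks right.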


Ok, so on to the actual indexing tests! I was curious about several things:

  1. How long does it take to create an index?
  2. How large is the index, compared to the size of the PCAP file itself?
  3. Is there a performance increase by using the index, and if so, how much?
  4. Does extracting PCAP by index return different data than extracting it by tcpdump and a BPF?

I decided that I would choose two PCAP files of different sizes for my tests. One file was 2.9GB, and the other was 9.5GB. For each file, I tested index creation speed and retrieval speed using tcpdump compared to retrieval speed using the index and SANCP's getpcapfromsancpindex.pl tool. Each of these tests was conducted three times, and the results averaged to form the final result. In addition, I examined the size of the index file once (the last time), under the assumption that a properly-created index would be the same each time it was generated.

PCAP size   Index size   Index creation   Tcpdump extraction   Indexed extraction
2.9GB       446MB        7m58s            39s                  5s
9.5GB       1300MB       23m20s           2m21s                11s

The larger file is roughly (very roughly) three times the size of the smaller file. As this data suggests, the index size and the index creation time scale about linearly with the PCAP size, as the larger of each value is about 3x the smaller. The same is true of the time necessary to extract the data with Tcpdump, though not to quite the same extent (it's closer to 3.6x).

However, the interesting part is that the time required to extract the data using the indices is not linear. It only took a little over 2x as long, though to be honest, it could easily be a matter of the amount of data that was contained in the individual network session or something. Still, using the index was about 87% faster in the small file, and about 92% faster with the larger file.
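
For anyone who wants to check the arithmetic, this short Python sketch recomputes the scaling ratios and the time savings from the figures in the table. The dictionary keys and the helper name are just labels chosen for this example.

```python
def seconds(minutes, secs):
    """Convert a minutes/seconds timing from the table into seconds."""
    return minutes * 60 + secs


# Measurements copied from the results table.
small = {"pcap_gb": 2.9, "index_mb": 446, "create_s": seconds(7, 58),
         "tcpdump_s": 39, "indexed_s": 5}
large = {"pcap_gb": 9.5, "index_mb": 1300, "create_s": seconds(23, 20),
         "tcpdump_s": seconds(2, 21), "indexed_s": 11}

# How much bigger or slower is each value for the large file?
for key in small:
    print(f"{key}: large/small = {large[key] / small[key]:.2f}x")

# Fraction of the tcpdump time saved by extracting through the index.
for name, row in (("small", small), ("large", large)):
    saved = 1 - row["indexed_s"] / row["tcpdump_s"]
    print(f"{name}: indexed extraction takes {saved:.0%} less time than tcpdump")
```

The ratios come out at roughly 2.9x for index size and creation time, 3.6x for tcpdump extraction and 2.2x for indexed extraction, with savings of about 87% and 92%.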

I think it's pretty clear that these indices speed up PCAP retrieval substantially. I think the drawbacks are that the index files are fairly large, and that they take a long time to generate.

As for the index file size, the indices look like they are about 13% - 15% of the size of the original file. For the drastic performance improvement they provide, this could be worth it. What's one more drive in the RAID array? Also, it's possible that they could be compressed, with maybe only a relatively small impact on retrieval speed. I'll have to try that out.

Index generation time is potentially more serious. Obviously, it'd be nicer to generate the index at the same time the PCAP is originally written, but as I'm unwilling to do that (for the moment, at least), I think the obvious speed-up would be to somehow allow SANCP to generate the indices without trying to write a new PCAP file. Even when I've directed it to /dev/null, there has to be some performance overhead here, and any time spent writing PCAP we're throwing away is just time wasted. This would be my first choice for future work: make a good, quick index for an existing PCAP file.

All in all, I'm impressed with the retrieval speed of SANCP's indexed PCAP. Now, if we can get the index creation issue sorted out, this could be a really great addition to Sguil!

Update 2008-04-03 20:15: I was in such a rush to complete this post, that I accidentally forgot to answer my question #4! It turns out that the data returned by both extraction methods is exactly the same. Even the MD5 checksums of the PCAP files match. Great!

Also, note that I edited the above to add details on the command line I used to generate the indices. Another stupid rush mistake on my part. Sorry!

Monday, March 31, 2008

Switching to Sguil: A whole new meaning

Many of you may have wondered why I haven't yet blogged about the recent release of Sguil 0.7.0. Did I forget? No. Am I disappointed with it? Not at all! Am I just lazy? Yes, but that's not why.

The truth is, I've held off blogging about that because there's some even bigger news with the Sguil project!

You probably didn't know this, as we've tried hard to keep it under wraps until it could be formally announced, but the Sguil project has just received an extremely large vote of confidence, in the form of it being acquired lock, stock and barrel by Cisco!

Yes, you read that right! From the press release:

Under terms of the transaction, Cisco has acquired the Sguil™ project and related trademarks, as well as the copyrights held by the five principal members of the Sguil™ team, including project founder Robert "Bamm" Visscher. Cisco will assume control of the open source Sguil™ project including the Sguil.net domain, web site and web site content and the Sguil™ Sourceforge project page. In addition, the Sguil™ team will remain dedicated to the project as Cisco employees, continuing their management of the project on a day-to-day basis.

Really, I didn't blog about Sguil 0.7.0 yet because I didn't want to say anything that could have interfered with this deal.

The great thing about this is that both Cisco and Sguil have made significant investments in Tcl, as it's already found in the OS on many Cisco products. Of course, Sguil is written almost entirely in Tcl, so this should provide for some great synergy going forward. You should start seeing Sguil being pushed out into the carrier-grade Cisco gear by 3Q08, with the rest of the Cisco-branded products following in phases through 4Q09. Linksys-branded gear will be supported too, though there's not an official timetable for that yet.

On a personal note, I would like to congratulate Bamm (AKA "qru"), Sguil's lead developer. He's put a lot of time into this project over the years, and is finally going to reap some rewards:

Although the financial details of the agreement have not been announced, Sguil™ developer Robert Visscher will become the new VP of Cisco Rapid Analysis Products for Security. “This deal means a lot to the Sguil™ project and to me personally,” Visscher explains. “Previously, we had to be content with simply being the best technical solution to enable intrusion analysts to collect and analyze large amounts of data in an extraordinarily efficient manner. But now, we’ll have the additional advantage of the world’s largest manufacturer of networking gear shoving it down their customers’ throats! We will no longer have to concern ourselves with mere technical excellence. Instead, I can worry more about which tropical island to visit next, and which flavor daiquiri to order. You know, the important things.”

I know that many of you will have questions about this major evolution in the Sguil project and our continuing roles as Cisco employees, so please feel free to leave them here as comments, or ask in freenode IRC's #snort-gui channel.

Tuesday, March 25, 2008

Temporarily speed up SANCP insertions in Sguil

It's Monday morning, you're half asleep, you haven't finished your first diet soda yet, and -- oh no! Sguild has been down all weekend! Worse yet, SANCP inserts are backed up, to the tune of 17,000+ files in the queue!

As you know, the Sguil sensors are pretty much independent of the actual Sguil server. They'll happily continue collecting data, even when sguild has been down for a while. When the server comes back up, the sensors will automagically reconnect and send all the queued data. This is by design, of course. You don't want to lose all that data due to a failure of the central server.

Even after an extended outage, most of the data collected by the Sguil sensors poses no real problem. There are relatively few Snort alerts (maybe a few thousand), and probably even fewer PADS events, and these get added to the database in no time. Network session records collected by SANCP, however, can pose a bigger problem.

If you recall, SANCP works by keeping an in-memory list of active network "sessions" (including pseudo-sessions created from UDP and ICMP traffic). By default, it will dump these to a file every minute or so (or more often, on a busy network). The Sguil sensor contains a SANCP agent process that monitors the filesystem for these files, and sends them to Sguild as they are created, deleting them from the sensor.

Now here's the problem: there are just so many darned network sessions on a busy network that even a short outage can result in a few hundred files waiting in the queue, especially if you have multiple sensors. Longer outages, though, can be disastrous. Let's say that you have six sensors, and your Sguil server has been down for the weekend (48 hours). How many files is that?

60 * 48 * 6 = 17,280

Now, at an average rate of about 5 seconds to insert each file, how many hours would that take to catch up?
17,280 * 5 / (60 * 60) = 24

That's right! It'd take a full 24 hours to catch up! In the meantime, you're missing a few days of valuable network data (probably the few days you're most likely to want to query on Monday morning) and your MySQL database is spending all its time inserting, which means not only that it's slower to respond to your analyst console, but also slower to process incoming events. In fact, it can easily get caught in a sharp downward spiral, where the incoming data gets even further backed up.

So what can you do about this? Actually, it's quite simple. If you find that you're getting behind while processing your backlog of SANCP records, you can dramatically speed things up by temporarily disabling the indices on your SANCP tables.

First, figure out which days you have to catch up on. If you know your server crashed on Friday the 8th, and it's now Monday the 11th, you probably want to go through all SANCP tables from Friday - Monday.

Second, determine what the table names will be. Remember that Sguil creates one SANCP table per day, per sensor. These are all merged into a single virtual table, but for indexing purposes, ignore that one and concentrate on the individual tables. They will be named something like:
sancp_$SENSORNAME_$DATE

So for example, if you have two sensors named "external" and "internal", you'd have the following tables:
sancp_external_20080208
sancp_internal_20080208

sancp_external_20080209
sancp_internal_20080209

sancp_external_20080210
sancp_internal_20080210

sancp_external_20080211
sancp_internal_20080211


Next, you simply issue the SQL command to disable indexing for each table:
ALTER TABLE sancp_external_20080208 DISABLE KEYS;

MySQL will perform a quick table check before returning to the prompt. This may take a minute, and I personally find it annoying to wait after each table, so I usually just create a text file with all the commands in it, one per line, and run it in batch mode:
mysql -u sguil -p sguildb < DISABLE-KEYS.txt
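
Typing out one statement per sensor per day gets tedious, so here is a minimal Python sketch that generates them. It assumes the sancp_$SENSORNAME_$DATE naming shown above with dates in YYYYMMDD form; the function names are made up for this example.

```python
from datetime import date, timedelta


def sancp_tables(sensors, first_day, last_day):
    """Yield sancp_<sensor>_<YYYYMMDD> for every sensor and day in the range."""
    day = first_day
    while day <= last_day:
        for sensor in sensors:
            yield f"sancp_{sensor}_{day:%Y%m%d}"
        day += timedelta(days=1)


def alter_statements(sensors, first_day, last_day, action="DISABLE"):
    """Build the ALTER TABLE ... DISABLE/ENABLE KEYS statements."""
    if action not in ("DISABLE", "ENABLE"):
        raise ValueError("action must be DISABLE or ENABLE")
    return [f"ALTER TABLE {table} {action} KEYS;"
            for table in sancp_tables(sensors, first_day, last_day)]


# The example outage from this post: two sensors, Friday the 8th to Monday the 11th.
statements = alter_statements(["external", "internal"],
                              date(2008, 2, 8), date(2008, 2, 11))
print("\n".join(statements))
```

Redirect the output to DISABLE-KEYS.txt and feed it to mysql as shown above. Passing action="ENABLE" produces the matching file for turning the indices back on later.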

Based on my experience, I've seen the insertion speed go from 5 seconds per file to about 5 files per second, which is quite significant! At that rate, it would take less than an hour to insert everything!
17,280 / (5 * 60 * 60) = 0.96
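
The backlog estimates in this post can be reproduced with a few lines of Python. The constant and function names are illustrative, and the sketch ignores the new files that keep arriving while the backlog drains.

```python
FILES_PER_HOUR_PER_SENSOR = 60  # SANCP dumps roughly one file per minute
OUTAGE_HOURS = 48               # the server was down all weekend
SENSORS = 6

backlog = FILES_PER_HOUR_PER_SENSOR * OUTAGE_HOURS * SENSORS


def catch_up_hours(files, seconds_per_file):
    """Hours needed to insert the given number of files at a steady rate."""
    return files * seconds_per_file / 3600


print(backlog)                                   # files waiting in the queue
print(round(catch_up_hours(backlog, 5), 2))      # indices enabled: 5 s per file
print(round(catch_up_hours(backlog, 1 / 5), 2))  # indices disabled: 5 files per second
```

With these inputs the backlog is 17,280 files, which works out to 24 hours with indices enabled and just under one hour with them disabled.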

Of course, you have to be extra careful to re-enable indices on all those tables. You can run a similar set of SQL commands to turn indices back on for a table:
ALTER TABLE sancp_external_20080208 ENABLE KEYS;

Again, I usually run this as a batch job.

The act of disabling and then later re-enabling indices does take a little while, but usually not more than a few minutes for each. Even given this overhead, it is still significantly faster to process a bunch of SANCP files without indices, then reindex them after you're all caught up.

Sure wish I didn't need to know this... 8-)

Update 2008-03-25 11:27: After you re-enable keys, you may need to also do a quick db check to make everything sane again:
mysqlcheck -o -a -u sguil -p sguildb

This will recheck all your tables and make sure they're still consistent. I've had a few situations where Sguil has been returning error messages like "ERROR 1030 (HY000): Got error 124 from storage engine" until I did this.

Thursday, November 01, 2007

NSMWiki in print again

This month's ISSA Journal has another article by Russ McRee on NSM topics. You may remember that I've blogged about Russ' articles before. This time, he's writing about Argus, an excellent suite of tools for implementing distributed collection of network flow information. It also happens that he mentions NSMWiki, which I maintain for the Sguil project.

If you're not an ISSA member, you can read Russ' article here.

Thursday, October 25, 2007

Sguil training: is there demand?

Ok, so I know that there are a fair number of Sguil users out there now, and more coming every day. Some of us have kicked around the idea of providing some Sguil training, though we never seemed to have the critical mass of potential students. Maybe the time has come to consider this again?

What I'd like to know is this: if there were a one day class that covered Sguil administration (installation, troubleshooting and maintenance) offered in the Hampton Roads, VA area (Norfolk, VA Beach, Williamsburg), do you think you would be interested in attending?

If you would be interested in attending, how much would you be willing to pay for such a class?

Would your answer change at all if the class also taught you how to actually analyze events and research incidents with Sguil, possibly as a second day?

I'm not promising to run a class, but I've been interested in doing so for quite some time, and if I get enough interest, it's certainly a possibility.

Please email me or leave a comment here with your thoughts. Thanks!

Tuesday, October 02, 2007

Sguil covered in Information Security Magazine

Richard Bejtlich points out that the October issue of Information Security Magazine has an article by Russ McRee, entitled Putting Snort to Work. The article is about Knoppix-NSM, a Linux LiveCD designed for easy monitoring. Knoppix-NSM includes a preconfigured Sguil server and sensor, and Russ has a lot of nice things to say about it.

It's really good to see Sguil in some mainstream security press. VictorJ's modsec2sguil custom agent and our very own NSMWiki even get mentioned, so I know he's done some homework.

Tuesday, September 11, 2007

Find Storm with Sguil (and a little ingenuity)

My friend nr has started a new security blog. His first real post is about using Sguil to detect Storm worm infections. Welcome to the blog world, nr!

Thursday, September 06, 2007

Cloaking your investigative activities in Sguil

I've written before on disguising your outbound DNS queries. In short, even looking up an IP to find the hostname might alert an attacker that you're on to them if they control their own DNS server. I briefly described how to create a simple proxy DNS server that could send all your queries through a third-party, like OpenDNS, which should make it harder for the bad guys to figure out just who is checking them out.

I'm happy to say the new Sguil 0.7.0 client now incorporates third-party DNS server support natively. In sguil.conf, you simply set the EXT_DNS, EXT_DNS_SERVER and HOME_NET variables, like so:


# Configure optional external DNS here. An external DNS can be used as a
# way to prevent data leakage. Some users would prefer to use anonymous
# DNS as a way to keep potential malicious sources from knowing who is
# interested in their activities.
#
# Enable Ext DNS
set EXT_DNS 1
# Define the external nameserver to use. OpenDNS list 208.67.222.222 and 208.67.220.220
set EXT_DNS_SERVER 208.67.222.222
# Define a list of space separated networks (xxx.xxx.xxx.xxx/yy) that you want
# to use the OS's resolution for.
set HOME_NET "192.168.1.0/24 10.0.0.0/8"

In this example, I've configured Sguil to use one of the OpenDNS name servers (208.67.222.222) to look up all hosts except those on my local LAN (addresses in either the 192.168.1.0/24 or the 10.0.0.0/8 range).
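
To make the HOME_NET rule concrete, here is a small Python sketch of the decision it describes. This is not how the Sguil client (which is written in Tcl) implements it; the function name and the "os" return value are assumptions, and only the three settings come from the sample sguil.conf above.

```python
import ipaddress

# Values from the sample sguil.conf.
HOME_NET = "192.168.1.0/24 10.0.0.0/8"
EXT_DNS_SERVER = "208.67.222.222"

home_networks = [ipaddress.ip_network(net) for net in HOME_NET.split()]


def resolver_for(ip):
    """Return "os" for addresses inside HOME_NET, else the external server."""
    addr = ipaddress.ip_address(ip)
    if any(addr in net for net in home_networks):
        return "os"  # use the operating system's normal resolution
    return EXT_DNS_SERVER


for ip in ("192.168.1.50", "10.20.30.40", "192.168.2.50", "198.51.100.7"):
    print(ip, "->", resolver_for(ip))
```

Note that 192.168.2.50 goes to the external server, because only the 192.168.1.0/24 slice of the 192.168 space is listed in HOME_NET.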

So that takes care of DNS, but what about WHOIS? Ok, so maybe it's a bit less likely that the attacker has also compromised a WHOIS server, but less likely doesn't mean that it hasn't happened. It probably has, and it probably will in the future. Therefore, it's prudent to also try to disguise the source of your WHOIS lookups.

Sguil has always had the ability to call a user-supplied command to perform WHOIS operations, so here is a simple script you can use to proxy all of Sguil's WHOIS lookups through a third party (Geek Tools, in this case). This should work on any version of Sguil.

#!/bin/sh
#
# Simple script to proxy all whois requests through whois.geektools.com
# to help keep the bad guys from figuring out that we're onto them when
# Sguil looks up a record.
/usr/bin/whois -h whois.geektools.com "$@"

To use this, just set the WHOIS_PATH in your sguil.conf file, like so:

set WHOIS_PATH /home/sguil/bin/sguil-whois.sh

So now you have it. By implementing DNS and WHOIS proxies in Sguil, you can add an additional layer of protection against bad guys who may be monitoring their systems for signs that you have discovered their attacks.

Tuesday, July 10, 2007

Searching inside payload data

Almost all of my searches involve IPs and/or port numbers, and Sguil has a lot of built-in support for these types of database queries, making them very easy to deal with. Sometimes, though, you want to search on something a little more difficult.

This morning, for example, I had a specific URL that was used in some PHP injection attack attempts, and I wanted to find only those alerts that had that URL as part of their data payload.

Constructing a query for this is actually pretty easy, if you use the HEX() and the CONCAT() SQL functions. If you're using the GUI interface, you only have to construct the WHERE clause, so you can do something like the following:

WHERE start_time >= "2007-07-09" \
AND data.data_payload like \
CONCAT("%", HEX("my.url.or.some.other.string"), "%")

The main problem with this type of query is that the data_payload field is not indexed, so it results in a table scan. You really need to make sure you have at least one other criterion that is indexed. In this case, I used the date to restrict the number of rows to search, but you could use IPs or port numbers as well.
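
If you build these queries from a script rather than the GUI, the same pattern can be computed on the client side. This Python sketch mirrors what the CONCAT() and HEX() calls above produce; the function name is hypothetical, and it assumes the search string is plain ASCII.

```python
def payload_like_pattern(text):
    """Return the LIKE pattern equivalent to CONCAT("%", HEX(text), "%")."""
    # MySQL's HEX() on a string yields two uppercase hex digits per byte.
    hex_text = text.encode("ascii").hex().upper()
    return "%" + hex_text + "%"


print(payload_like_pattern("my.url"))
```

The resulting string can be passed as a bound parameter to a "data.data_payload LIKE ..." condition, which avoids pasting the raw search string into the SQL text.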

Tuesday, June 19, 2007

MySQL Database Tuning Tips

I came across a great article on MySQL performance tuning. It's got a few very practical tips for examining the database settings and tweaking them to achieve the best performance.

"What's this got to do with security", you ask? As you know, Sguil stores all of its alert and network session data in a MySQL backend. If you monitor a bunch of gigabit links for any amount of time, you're going to amass a lot of data.

I try to keep a few months of session data online at any given time, and my database queries have always been kinda slow. I learned to live with it, but after reading this article, I decided to check a few things and see if I could improve my performance, even a little.

I started by examining the query cache to see how many of my database queries resulted in actual db searches, and how many were just quick returns of cached results.


mysql> show status like 'qcache%';
+-------------------------+-----------+
| Variable_name | Value |
+-------------------------+-----------+
| Qcache_free_blocks | 1 |
| Qcache_free_memory | 134206272 |
| Qcache_hits | 395 |
| Qcache_inserts | 248 |
| Qcache_lowmem_prunes | 0 |
| Qcache_not_cached | 75 |
| Qcache_queries_in_cache | 2 |
| Qcache_total_blocks | 6 |
+-------------------------+-----------+
8 rows in set (0.00 sec)

The cache miss rate is given by Qcache_inserts / (Qcache_hits + Qcache_inserts), so in my case about 39% of cacheable queries resulted in database searches. The converse is that about 61% of those queries were served up straight from the cache. Almost all of the queries are either from Sguil or from my nightly reports, so this was actually a much better rate than expected. Notice, though, that I had 127MB of unused cache memory. My server's been running for quite some time, so it seemed to me like I was wasting most of that. I had originally set the query_cache_size (in /etc/my.cnf) to be 128M. I decided to reduce this to only 28M, reclaiming 100M for other purposes.

Next I wanted to examine the number of open tables. Sguil uses one table per sensor per day to hold session data, and another five tables per sensor per day to hold alert data. That's a lot of tables! But how many are actually in use?

mysql> show status like 'open%tables';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Open_tables | 46 |
| Opened_tables | 0 |
+---------------+-------+
2 rows in set (0.00 sec)

So the good news is that not many tables are open at any given time. I actually tried to do several searches through alert and session data over the last few weeks, just to see if I could drive this number up. I could, but not significantly. I had set table_cache to be 5000, but that seemed quite high given this new information, so I set it down to 200. I'm not sure exactly how much (if any) memory I reclaimed that way, but saner settings are always better.

Finally, the big one: index effectiveness. I wanted to know how well Sguil was using the indices on the various tables.

mysql> show status like '%key_read%';
+-------------------+-----------+
| Variable_name | Value |
+-------------------+-----------+
| Key_read_requests | 205595231 |
| Key_reads | 4703977 |
+-------------------+-----------+
2 rows in set (0.00 sec)

By dividing Key_reads / Key_read_requests, I determined that approximately 2% of all requests for index data resulted in a disk read. That seemed OK to me, but the article suggested that the target for that be only 0.1%. Given that, I figured that the best thing I could do would be to reallocate that 100M I reclaimed earlier as extra key space. I did this by changing the key_buffer setting, which is now at a whopping 484M.
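
Here is the key cache calculation as a short Python sketch, using the two counters from the output above. The function name and the target constant are labels chosen for this example.

```python
def key_miss_ratio(key_reads, key_read_requests):
    """Fraction of index read requests that had to go to disk."""
    return key_reads / key_read_requests


# Counters from the SHOW STATUS output above.
ratio = key_miss_ratio(key_reads=4703977, key_read_requests=205595231)
TARGET = 0.001  # the 0.1% target mentioned in the article

print(f"miss ratio: {ratio:.2%}")
print(f"times over target: {ratio / TARGET:.1f}")
```

The ratio comes out at about 2.3%, which is more than twenty times the suggested target and explains why the key buffer was the setting that got the extra memory.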

Did all this optimization make any difference? I didn't do any database performance benchmarking either before or after, but my perceived response time while using Sguil is a lot faster. It still takes time to do these complex searches, of course, but at least now I can go back through four or six weeks of data without having to leave for lunch. It seems to have made a big difference for me, so why not try it out yourself? If you do, post your experiences below and let us know how it went!

Tuesday, June 12, 2007

I'M IN YR RRs, QUERYING YR HOSTS

Probably the single best thing I like about Sguil is the development philosophy. Most of the features are suggested by working intrusion analysts. Just the other day, we were chatting on IRC (freenode's #snort-gui) about DNS lookups. Specifically, I asked about having the Sguil client modified to give the analyst more control of DNS lookups. I am always concerned that when I ask the GUI to resolve an IP for me, I might be tipping off an attacker who is monitoring the DNS server for their zone. I mentioned that it'd be cool to have the ability to look up internal IPs in my own DNS servers but forward all external IPs to some third party (e.g., OpenDNS) in order to mask the true source of the queries.

Based on some of the discussion, I decided it'd be very handy to have a lightweight DNS proxy server that could use a few simple rules to determine which servers to query. I had some time this afternoon, so I hacked one up in Perl. It turns out that the same feature has now made it into the CVS version of Sguil, but I thought that my proxy would still be of use for other purposes. So without further ado, I hereby post it for your downloading pleasure.

You'll need to make a few edits to the script to tailor it to your own environment. First, set up the @LOCAL_NETS list to contain a list of regexps for your local network. All incoming queries will be compared against these expressions, and any that match will be considered "local" addresses. Queries that don't match anything in the list will be considered "public" queries. Note that reverse lookups require the "in-addr.arpa" address format. Here's a sample:


@LOCAL_NETS = ( '\.168\.192\.in-addr\.arpa',
                '\.10\.in-addr\.arpa',
                '\.mydomain\.com' );

You'll also need to set the $LOCAL_DNS and $PUBLIC_DNS strings to space delimited lists of DNS servers to use for each type of query, like so:

$LOCAL_DNS = "10.10.10.2 10.10.10.3"; # My DNS Servers
$PUBLIC_DNS = "208.67.222.222 208.67.220.220"; # OpenDNS Servers

Once you've done this, just run the script (probably in the background) and it will start listening for DNS requests on 127.0.0.1 port 53 UDP & TCP. Then configure your system's /etc/resolv.conf to use the local proxy and you're all set.
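
The routing rule itself is simple enough to sketch in a few lines. The version below is in Python rather than Perl and is not taken from the proxy script; the function name is hypothetical, and the patterns and server lists are the sample values from above.

```python
import re

# Sample values from the configuration above.
LOCAL_NETS = [r"\.168\.192\.in-addr\.arpa",
              r"\.10\.in-addr\.arpa",
              r"\.mydomain\.com"]
LOCAL_DNS = "10.10.10.2 10.10.10.3"
PUBLIC_DNS = "208.67.222.222 208.67.220.220"

local_patterns = [re.compile(pattern) for pattern in LOCAL_NETS]


def servers_for(query_name):
    """Pick the DNS server list for a query name (forward or in-addr.arpa)."""
    if any(pattern.search(query_name) for pattern in local_patterns):
        return LOCAL_DNS.split()   # "local" query
    return PUBLIC_DNS.split()      # "public" query


for name in ("5.1.168.192.in-addr.arpa", "www.mydomain.com", "www.example.org"):
    print(name, "->", servers_for(name))
```

As in the sample list, the patterns are not anchored to the end of the name. Adding a trailing $ to each one would stop a name like mydomain.com.example.org from being treated as local.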

If you want to open it up a little and provide DNS proxy service for other systems on your network, you can do this by changing the LocalAddr parameter in the call to Nameserver->new(). Set the IP to your system's actual address and anyone who can connect to your port 53 will be able to resolve through you. I haven't really done any security testing on that, so it probably has some nasty remotely exploitable bugs. You have been warned!

Anyway, I hope you find it useful. It only does straight forward and reverse lookups (A and PTR records), so it will fail if you ask for any other types of requests. If these things are really necessary, though, let me know and I'll consider adding them. If you decide to use this at your site, please drop me a comment to let me know (if you can).

Wednesday, April 18, 2007

Generating Sguil Reports

I've been running Sguil for a few years now, and while it's great for interactive analyst use, one of its main drawbacks is the lack of a sophisticated reporting tool. The database of alerts and session information is Sguil's biggest asset, and there is a lot of information lurking there, just waiting for the right query to come along and bring it to light. Sguil has a few rudimentary reports built in, but lacks the ability to create charts and graphs, to perform complex pre- or post-processing or to schedule reports to be generated and distributed automatically.

To be honest, many Sguil analysts feel the need for more sophisticated reporting. Paul Halliday's excellent Squert package fills part of this void, providing a nice LAMP platform for interactive reports based on Sguil alert information. I use it, and it's great for providing some on-the-fly exploration of my recent alerts.

I wanted something a bit more flexible, though: a reporting package that could access anything in the database, not just alerts, and allow users to generate and share their own reports with other Sguil analysts. I have recently been doing a lot of work with the BIRT package, an open source reporting platform built on Tomcat and Eclipse.

BIRT has a lot of nice features, including the ability to provide sophisticated charting and graphing. The report design files can be distributed to other analysts who can then load them into their own BIRT servers and start generating new types of reports. It even separates the reporting engine from the output format, so the same report can generate HTML, PDF, DOC or many other types of output. Best of all, you can totally automate the reporting process and just have them show up in your inbox each morning, ready for your perusal.

If this all sounds good to you, check out a sample report, then read my Sguil Reports with BIRT HOWTO for more information.

If you decide to try this, please post a comment. I'd love to hear your thoughts, experiences and suggestions.

Big thanks go to John Ward for getting me started with BIRT and helping me through some of the tricky parts.

Thursday, February 08, 2007

IP Tagging

I just read Godfadda's blog entry about a prototype system for tagging IP addresses in Splunk. His system analyzes IP addresses as they're about to be inserted into his log/event tracking system and cross references them with several databases in order to generate tags that give the event additional context.

Right now, his system deals with geography ("which branch office is this in?") and system role ("Is this an admin's workstation, a server or a public PC?"). I really like this idea, as it provides valuable context when evaluating events.

In fact, I've done a lot of thinking recently about how to add more context to NSM information, but it wasn't until I read this article that I realized that what I was looking for would probably best be implemented as a tagging system. What if Sguil were to incorporate tagging? Well, we'd first have to figure out what to tag. I'd like to be able to tag several types of objects:


  1. IP addresses
  2. Ports
  3. Events
  4. Session records
  5. Packet data for specific flows (probably would treat the pcap file and any generated transcript as a single taggable object)

As for the tags themselves, the system should automatically generate tags based on some criteria, just as in Godfadda's system. Maybe it would automatically tag everything in my exposed web server network as internet,webserver, for example, or maybe it could correlate my own IPs to an asset tracking system to identify their function and/or location.

But here's the part I think would be even more useful: I'd like to have the analyst be able to tag things on-the-fly, and later search on those tags to find related information. For example, if someone has broken into my web server, I could tag the original IDS alert(s) with an incident or case number (e.g., "#287973"). Perhaps this would also automatically tag the attacker's IP with the same number. As I continue to research the incident, I will probably perform SANCP searches, examine the full packet data and generate transcripts. I could tag each of the interesting events with the same ID. At the end of the investigation, I could just do a search for all objects tagged with #287973, order them by date & time, and presto! The technical portion of my report is almost written for me! This is quite similar to other forensic analysis tools (EnCase, for example) that allow you to "bookmark" interesting pieces of information and generate the report from the bookmark list.
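To make the idea concrete, here is a minimal sketch of what such a tag store might look like. Nothing like this exists in Sguil; the `TaggedObject` and `TagIndex` names, the object kinds and the in-memory storage are all assumptions made for illustration. It only shows the two operations described above: attaching tags on the fly, and pulling back everything that carries a tag in date and time order.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class TaggedObject:
    """One taggable item (hypothetical model, not a Sguil structure)."""
    kind: str            # e.g. "ip", "port", "event", "session", "pcap"
    ident: str           # e.g. "203.0.113.7" or an event ID
    timestamp: datetime  # when the item was observed


class TagIndex:
    """In-memory many-to-many mapping between tags and objects."""

    def __init__(self):
        self._by_tag = defaultdict(set)
        self._by_object = defaultdict(set)

    def tag(self, obj, *tags):
        """Attach one or more tags to an object."""
        for t in tags:
            self._by_tag[t].add(obj)
            self._by_object[obj].add(t)

    def tags_for(self, obj):
        """Return the set of tags attached to an object."""
        return set(self._by_object.get(obj, ()))

    def search(self, tag):
        """Return every object carrying `tag`, oldest first."""
        return sorted(self._by_tag.get(tag, ()), key=lambda o: o.timestamp)


index = TagIndex()
alert = TaggedObject("event", "alert-1", datetime(2007, 2, 8, 9, 0))
attacker = TaggedObject("ip", "203.0.113.7", datetime(2007, 2, 8, 9, 1))
index.tag(alert, "#287973")
index.tag(attacker, "#287973", "internet")
timeline = index.search("#287973")  # alert first, then the attacker's IP
```

A real implementation would presumably keep this in the Sguil database rather than in memory, but the shape of the lookup would be much the same.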

To go a bit further, what if the same attacking IP came back six months after the above incident, this time with an FTP buffer overflow exploit? You might not remember the address as the origin of a previous incident, especially if you have a large operation and the original incident was logged by a different analyst. However, if the console says that the address was tagged as being part of a specific incident, you'll know right away to treat it with more suspicion than you might otherwise have done.

To be honest, these are just some of my first ideas on the power of tags; the real power could come as we consider more elaborate scenarios. What if you could tag any item more than once? Well, by associating multiple incident tags with an item, you just might uncover relationships that you didn't realize existed. It doesn't take much to imagine a scenario where you can build a chain of related tags that could imply association between two very different things, perhaps by creating a series of Kevin Bacon links between addresses and/or events.
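One way to picture those Kevin Bacon links is as a shortest-path search: two items are neighbours if they share at least one tag, and a chain is a path between them. The sketch below is purely illustrative. The `tag_chain` function and the sample data are made up, and it assumes the tags are available as a simple mapping from item to set of tags.

```python
from collections import deque


def tag_chain(item_tags, start, goal):
    """Return the shortest chain of items linking start to goal, where each
    consecutive pair shares at least one tag, or None if there is no chain."""
    # Invert the mapping so we can go from a tag to every item carrying it.
    items_by_tag = {}
    for item, tags in item_tags.items():
        for t in tags:
            items_by_tag.setdefault(t, set()).add(item)

    previous = {start: None}  # also serves as the "visited" set
    queue = deque([start])
    while queue:
        current = queue.popleft()
        if current == goal:
            chain = []
            while current is not None:
                chain.append(current)
                current = previous[current]
            return chain[::-1]
        for t in item_tags.get(current, ()):
            for neighbour in items_by_tag[t]:
                if neighbour not in previous:
                    previous[neighbour] = current
                    queue.append(neighbour)
    return None


# Hypothetical data: two incidents that share a single event.
item_tags = {
    "10.0.0.5": {"#287973"},
    "event-41": {"#287973", "#300112"},
    "198.51.100.9": {"#300112"},
    "10.9.9.9": {"#1"},
}
chain = tag_chain(item_tags, "10.0.0.5", "198.51.100.9")
```

Here the two addresses never appear in the same incident, but the search links them through the event that carries both incident tags.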

So, will any of this show up in Sguil? Probably not any time soon. If I can convince Bamm that I'm not insane, maybe it'll find its way onto a feature wish list. Or maybe another project or product will beat us to the punch (it is the era of Web 2.0 after all, and tagging is like breathing to some folks nowadays). But I do fantasize about it, and I live in hope.

Thursday, December 14, 2006

Sguil vs. BASE

This afternoon, someone asked me how I would categorize the differences between Sguil and BASE. I started with the standard response: "BASE is an alert browser, but Sguil encourages a more structured approach."

By the end of my reply, though, I found myself thinking about how to express this in a different way, something that emphasized the functionality of the two systems.

Here's what I came up with, excerpted from my own private email reply:

You can think of the process of intrusion analysis as formulating and
then trying to answer a series of questions. For example, one series
might be:


  1. Was this an actual attack?
  2. If so, was the attack successful?
  3. What other systems may also have been attacked?
  4. What activities did the intruder try to carry out?
  5. What other resources were they able to gain access to?
  6. How should we contain, eradicate and recover from the intrusion?

In this sequence, BASE does a great job of answering question #1. It may also have certain information about #3, but it probably wouldn't supply enough information to give good answers to questions #2, #4 or #5. By correlating the additional information sources, Sguil is often able to come up with very good answers to each of the first five questions. Of course, the more information you have at your disposal, the easier it will be to answer the most important question, #6.


Of course, I'd be very interested to hear from any BASE users who would like to either confirm or dispute my analysis. If that's you, leave a comment!

Tuesday, August 29, 2006

Network Security Monitoring Wiki Launched

In cooperation with the Sguil project, I've just launched a new wiki, dedicated to Sguil and all things NSM: NSMWiki.

If you're a Sguil user, an NSM/IDS analyst, or just interested in network security in general, you should find things of interest there. It's just a skeleton now, but the great thing about wikis is that you can help us flesh it out!

Friday, July 21, 2006

Extracting gzipped or Unix script files from pcap data

During an incident response, it's often handy to be able to examine the actual attack traffic, and if you're using Sguil, you probably have it handy. One common situation is that an intruder has transferred files to or from your network, and you'd really like to see what's in them.

There's a great tool for extracting arbitrary files from pcap dumps, tcpxtract. Similar to the way hard drive forensic tools look through bytes on the disk to find the "magic headers" at the beginning of various types of files, tcpxtract combs through pcaps looking for file types it knows, regardless of the transport protocol used to ship them over the network. When it finds them, it writes them out to individual files for manual analysis.

Tcpxtract is a great tool for NSM practitioners, and should be in everyone's standard kit. There are a few common types of files that it doesn't support, but you can easily fix this by simply editing the tcpxtract.conf file to add support for new types if you know their magic numbers.

My friend geek00L has already blogged about adding Windows PE executable support. Now I'm here to tell you how to add support for gzipped files and Unix script files like "#!/some/file" ("#!/bin/sh" or "#!/usr/bin/perl" for example).

Just add the following two lines to the end of tcpxtract.conf:


gzip(1000000, \x1f\x8b\x08);
script(1000000, \x23\x21\x2f);

A little anti-climactic after all that buildup, wasn't it? I've had some advice that the script detection is likely to throw lots of false positives in an SSL session, so maybe you should keep it commented out until you know there are script files in the session that you need to find.
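To see why such short signatures are prone to false positives, here's a small Python sketch of the general idea behind magic-number matching. It is not how tcpxtract is implemented; the `SIGNATURES` table and `find_signatures` function are made-up names, and the byte patterns are simply the two from the config lines above.

```python
# The same three-byte headers as in the tcpxtract.conf lines above.
SIGNATURES = {
    "gzip": b"\x1f\x8b\x08",
    "script": b"\x23\x21\x2f",  # the ASCII characters "#!/"
}


def find_signatures(data, signatures=SIGNATURES):
    """Return a sorted list of (offset, name) for every signature hit in data."""
    hits = []
    for name, magic in signatures.items():
        start = data.find(magic)
        while start != -1:
            hits.append((start, name))
            start = data.find(magic, start + 1)
    return sorted(hits)


stream = b"GET / HTTP/1.0\r\n\r\n" + b"#!/bin/sh\necho hi\n" + b"\x1f\x8b\x08\x00..."
hits = find_signatures(stream)  # one "script" hit followed by one "gzip" hit
```

A three-byte pattern matches any stream that happens to contain those bytes anywhere. Encrypted traffic looks like random data, so a large SSL session will contain "#!/" by chance every so often, which is consistent with the advice above.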

Thursday, June 22, 2006

Extracting email attachments from pcap files

From time to time, I need to examine email attachments that may have been delivered to my users, usually to see exactly what type of malware was included. Now, I could call them up and ask them if they got the message, but there are several reasons I don't like to do that. For example, I hate playing telephone tag, and I don't really want to encourage them to open the message for any reason.

Since I use Sguil, I have another option. I can extract the attachment from the network data directly. Here's how.


  1. Capture the session of interest as a pcap file. If you're also using Sguil, you can probably just do a SANCP search to find the network session that contained the message as it was delivered to your SMTP server. Open it up in Ethereal, which causes the Sguil client to copy the pcap file to your analysis workstation. Close Ethereal now, because you won't be using it for this.
  2. Once you have the pcap file, use the tcpflow tool to extract an ASCII version of the conversation. This will create two files, one for the server side of the session, and one for the client side. Here's an example, where xxx.xxx.xxx.xxx is the sending host, and zzz.zzz.zzz.zzz is your mail server:

    % tcpflow -r xxx.xxx.xxx.xxx_yyyy_zzz.zzz.zzz.zzz_25-6.raw
    % ls
    xxx.xxx.xxx.xxx_yyyy_zzz.zzz.zzz.zzz_25-6.raw
    xxx.xxx.xxx.xxx.yyyy-zzz.zzz.zzz.zzz.00025
    zzz.zzz.zzz.zzz.00025-xxx.xxx.xxx.xxx.yyyy

    The file you need to concern yourself with now is the one that ends in .00025. That's the one that contains the email message sent by the client.
  3. Edit the *.00025 file in your favorite text editor. As it is now, it's a complete record of all the SMTP protocol events, and what you really want is just the data. The easiest thing is just to delete everything up to (but not including) the first line that starts with "Received:".
  4. Use the following perl command to read the mail file in via stdin and dump out the decoded MIME message. The script uses the MIME::Parser module (available from CPAN) to handle MIME decoding.

    % perl -e 'use MIME::Parser; \
    $parser = new MIME::Parser; \
    $parser->output_under("/var/tmp"); \
    $entity = $parser->parse(\*STDIN); \
    $entity->dump_skeleton;' < *25


    Content-type: multipart/mixed
    Effective-type: multipart/mixed
    Body-file: NONE
    Subject: Message could not be delivered
    Num-parts: 2
    --
    Content-type: text/plain
    Effective-type: text/plain
    Body-file: /var/tmp/msg-1150990039-2841-0/msg-2841-1.txt
    --
    Content-type: application/octet-stream
    Effective-type: application/octet-stream
    Body-file: /var/tmp/msg-1150990039-2841-0/letter.zip
    Recommended-filename: letter.zip
    --

    As you can see, this command creates a directory /var/tmp/msg-XXXXXXX-XXXX-0/ which contains files for each piece of the MIME multipart message, including the attachment (in this case, /var/tmp/msg-1150990039-2841-0/letter.zip).


Now that you've got the attachment, you can use your favorite reverse engineering tools on it to figure out exactly what you've got on your hands.
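If you'd rather not edit the file by hand, steps 3 and 4 can be sketched in a few lines of Python using only the standard library's email package. This is an alternative illustration, not the Perl method shown above. The `extract_attachments` function name is made up, it keeps the decoded attachments in memory instead of writing them under /var/tmp, and it ignores SMTP details such as dot-stuffing and any trailing protocol commands.

```python
import email
import re
from email import policy


def extract_attachments(smtp_stream):
    """Return {filename: decoded_bytes} for each attachment found in the
    client side of a captured SMTP session (the tcpflow *.00025 file)."""
    # Step 3: skip the SMTP dialogue before the first "Received:" header line.
    match = re.search(rb"^Received:", smtp_stream, re.MULTILINE)
    if match is None:
        raise ValueError("no Received: header found in stream")
    message = email.message_from_bytes(
        smtp_stream[match.start():], policy=policy.default
    )

    # Step 4: walk the MIME tree and decode every part that has a filename.
    attachments = {}
    for part in message.walk():
        filename = part.get_filename()
        if filename:
            attachments[filename] = part.get_payload(decode=True)
    return attachments
```

You would call it with the raw bytes of the client-side file, for example `extract_attachments(open(path, "rb").read())`, where `path` is whatever tcpflow named that file on your system. Keeping suspected malware in memory rather than on disk is a deliberate choice here, but writing it out for analysis is a one-line change.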

Thursday, June 15, 2006

Tracking your most active network protocols with Sguil

All you Sguil users out there might find this interesting. I wanted to figure out a way to track the most active protocols flowing over my perimeter, where "active" in this case means "most bytes transferred either in or out of my network". This is actually a fairly simple database query, but I had a special requirement.

You see, I wanted to know not only the total number of bytes transferred, but I wanted to see it broken down by whether the traffic was entering my network or leaving my network. SANCP (Sguil's session collector) tracks bytes by source and destination, but that's not good enough. The source of one connection could be an Internet host, while the source of another connection could be a system on my own LAN. Just because SANCP records how many bytes the source host sent doesn't mean it knows whether they were entering or leaving my LAN.

Here's a SQL query that really can tell the difference:


-- Top 5 destination ports by total bytes crossing the perimeter
select dst_port,
       sum(to_mynet) as bytes_to_mynet,
       sum(from_mynet) as bytes_from_mynet,
       sum(to_mynet + from_mynet) as total_bytes
from
(
    -- Outbound connections: local source, remote destination
    select dst_port,
           src_bytes as from_mynet,
           dst_bytes as to_mynet
    from sancp where
        (start_time between DATE_SUB(CURDATE(), INTERVAL 1 HOUR)
                        and CURDATE())
        and (src_ip between INET_ATON("192.168.0.0") and
                            INET_ATON("192.168.255.255")) and
        not (dst_ip between INET_ATON("192.168.0.0") and
                            INET_ATON("192.168.255.255"))
    UNION ALL
    -- Inbound connections: remote source, local destination
    select dst_port,
           dst_bytes as from_mynet,
           src_bytes as to_mynet
    from sancp where
        (start_time between DATE_SUB(CURDATE(), INTERVAL 1 HOUR)
                        and CURDATE())
        and not (src_ip between INET_ATON("192.168.0.0") and
                                INET_ATON("192.168.255.255")) and
        (dst_ip between INET_ATON("192.168.0.0") and
                        INET_ATON("192.168.255.255"))
) as test_table
group by dst_port
order by total_bytes desc
limit 5;


The trick here is to do two queries, one limited to connections whose source is on my LAN and one limited to connections whose source is on the Internet. I can then use the column aliases in each inner "SELECT" statement to order the columns to reflect the direction of the data flow. In both inner queries, the second column will always be the number of bytes flowing out of my network, and the third column will always be the number of bytes coming into my network. Of course, the first column is the destination port number, which corresponds roughly to the network protocol used.

These two queries are joined by a "UNION ALL" operator to make them into one large result set. The SQL code above treats this as a temporary table, from which the outer SELECT statement grabs its data and does the final work of summing both incoming and outgoing data flows and creating a neat table.
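If the SQL is hard to follow, the same direction logic can be sketched in Python over a handful of session records. This is only an illustration of the technique, not something Sguil provides. The `bytes_by_port` function, the `HOME_NET` constant and the tuple layout are assumptions, with the tuple fields named after the sancp columns used in the query.

```python
from collections import defaultdict
from ipaddress import ip_address, ip_network

# 192.168.0.0 through 192.168.255.255, the same range used in the query.
HOME_NET = ip_network("192.168.0.0/16")


def bytes_by_port(sessions, home_net=HOME_NET, top=5):
    """sessions: iterable of (src_ip, dst_ip, dst_port, src_bytes, dst_bytes).
    Returns rows of (dst_port, bytes_to_mynet, bytes_from_mynet, total_bytes),
    largest total first."""
    totals = defaultdict(lambda: [0, 0])  # port -> [to_mynet, from_mynet]
    for src_ip, dst_ip, dst_port, src_bytes, dst_bytes in sessions:
        src_local = ip_address(src_ip) in home_net
        dst_local = ip_address(dst_ip) in home_net
        if src_local and not dst_local:
            # Outbound connection: the source's bytes left my network.
            totals[dst_port][0] += dst_bytes
            totals[dst_port][1] += src_bytes
        elif dst_local and not src_local:
            # Inbound connection: the source's bytes entered my network.
            totals[dst_port][0] += src_bytes
            totals[dst_port][1] += dst_bytes
        # Sessions that stay inside or outside the LAN never cross the
        # perimeter, so they are skipped, just as in the SQL.
    rows = [(port, to, frm, to + frm) for port, (to, frm) in totals.items()]
    rows.sort(key=lambda row: row[3], reverse=True)
    return rows[:top]
```

The two branches of the `if` correspond to the two halves of the UNION ALL, and the final sort and slice play the role of the outer "order by" and "limit".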

The output of this query will look something like the following table. Note that I ran it against some very odd sample data, so it doesn't show a full 5 rows. Also, I messed with the numbers because I was uneasy about posting my real data here. Still, you can see what the report looks like.

+----------+----------------+------------------+-------------+
| dst_port | bytes_to_mynet | bytes_from_mynet | total_bytes |
+----------+----------------+------------------+-------------+
|       80 |           1816 |      51101804400 | 51101806216 |
|       22 |           1816 |      51101412328 | 51101414144 |
|      443 |           1816 |      51096413496 | 51096415312 |
|       53 |        1804400 |         18093887 |    18274327 |
+----------+----------------+------------------+-------------+


To use this in your own network, simply change the IP addresses above to represent the beginning and ending of your own address space. As written, the query will track only a single hour of data: since CURDATE() is midnight at the start of the current day, the window runs from 23:00 yesterday to 00:00 today. You can play with the time specification to get other reporting periods. For example, replacing CURDATE() with NOW() gives you the most recent hour instead.
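The time window is easy to get wrong, so here's a tiny Python sketch of how the DATE_SUB(CURDATE(), INTERVAL 1 HOUR) ... CURDATE() range works out. The `report_window` function is a made-up helper for illustration, and it takes the current time as a parameter so the result is easy to check.

```python
from datetime import datetime, time, timedelta


def report_window(now, hours=1):
    """Return the (start, end) datetimes covered by the query's time range."""
    # CURDATE() is today's date, which compares as midnight at the start of today.
    end = datetime.combine(now.date(), time.min)
    # DATE_SUB(CURDATE(), INTERVAL n HOUR) steps back from that midnight.
    return end - timedelta(hours=hours), end


start, end = report_window(datetime(2006, 6, 15, 14, 30))
# start is 2006-06-14 23:00:00 and end is 2006-06-15 00:00:00
```

Passing a larger `hours` value shows how widening the INTERVAL in the query would stretch the window further back into the previous day.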

Friday, May 19, 2006

Scan detection via network session records

I needed a quick, easy way to detect scanners on my LAN. Since I'm running Sguil, I have plenty of data about network sessions in a convenient MySQL database. I thought, why not use that?

My premise (cribbed from Snort's sfPortscan preprocessor) is that network scanning activity stands out due to its unusually high number of failed connection attempts. In other words, most of the time the services being probed are not available, and this makes the scanner easier to spot. Here is a simple perl script I wrote that implements this type of scan detection by querying Sguil's SANCP session database.

To use it, you'll need to edit the script to modify the $HOME_NET variable. It's a snippet of SQL code, but it's very simple and documented in the comments. I'm looking for scanning activity on my own LAN, so the default is to set it to search only for activity with two local endpoints. If you are reasonably fluent in SQL, though, you can customize this to find other types of scans. There are a few other variables you can tweak (read the comments) but nothing terribly critical.

The script identifies two types of scanning activity. Portsweepers are systems that try a few different ports across a variety of addresses. For example, malware that looks around trying to find open web servers would fall into this category. On the other hand, portscanners are systems that try a lot of ports, usually on relatively few systems. Attackers that are trying to enumerate services on a subnet would be a good example.
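The distinction between the two categories can be sketched in a few lines of Python. To be clear, this is not the perl script itself and doesn't query the SANCP database. The `classify_scanners` function, the `min_failures` threshold and the simple "more hosts than ports" rule are all assumptions chosen to illustrate the idea, and deciding which sessions count as failed is left to the caller.

```python
from collections import defaultdict


def classify_scanners(failed_attempts, min_failures=50):
    """failed_attempts: iterable of (src_ip, dst_ip, dst_port) tuples for
    connection attempts that failed.
    Returns {src_ip: "portsweeper" or "portscanner"} for noisy sources."""
    hosts = defaultdict(set)   # src_ip -> distinct destination addresses
    ports = defaultdict(set)   # src_ip -> distinct destination ports
    counts = defaultdict(int)  # src_ip -> number of failed attempts
    for src, dst, port in failed_attempts:
        hosts[src].add(dst)
        ports[src].add(port)
        counts[src] += 1

    verdicts = {}
    for src, failures in counts.items():
        if failures < min_failures:
            continue  # too few failures to stand out from normal traffic
        if len(hosts[src]) > len(ports[src]):
            verdicts[src] = "portsweeper"  # few ports, many addresses
        else:
            verdicts[src] = "portscanner"  # many ports, few addresses
    return verdicts
```

Feeding it sixty failed attempts at port 80 across sixty different addresses would label the source a portsweeper, while sixty different ports against a single address would label it a portscanner.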

I haven't really tested this anywhere but a RHEL/CentOS 4 system. If you get it working on some other platform, or if you have any other comments/suggestions/improvements, please post them here.