Wednesday, June 27, 2007

The Right Way to Establish a Culture of Security

After reading this article, my hat is off to Yahoo's Arturo Bejar. Not only does he have the world's coolest job title ("Chief Paranoid Yahoo"), but he's taken some extremely creative measures to help build a pervasive culture of security at the Internet behemoth. I especially like the part about the t-shirts, since they not only give people a reward to strive for, but also act as free advertising for the program. And the multiple tiers sound like they would really spur some competition to get those coveted red shirts.

As someone who has been called a raving paranoiac myself, I really appreciated this approach. There are some great lessons in that article.

Tuesday, June 19, 2007

MySQL Database Tuning Tips

I came across a great article on MySQL performance tuning. It's got a few very practical tips for examining the database settings and tweaking them to achieve the best performance.

"What's this got to do with security", you ask? As you know, Sguil stores all of it's alert and network session data in a MySQL backend. If you monitor a bunch of gigabit links for any amount of time, you're going to amass a lot of data.

I try to keep a few months of session data online at any given time, and my database queries have always been kinda slow. I learned to live with it, but after reading this article, I decided to check a few things and see if I could improve my performance, even a little.

I started by examining the query cache to see how many of my database queries resulted in actual db searches, and how many were just quick returns of cached results.


mysql> show status like 'qcache%';
+-------------------------+-----------+
| Variable_name           | Value     |
+-------------------------+-----------+
| Qcache_free_blocks      | 1         |
| Qcache_free_memory      | 134206272 |
| Qcache_hits             | 395       |
| Qcache_inserts          | 248       |
| Qcache_lowmem_prunes    | 0         |
| Qcache_not_cached       | 75        |
| Qcache_queries_in_cache | 2         |
| Qcache_total_blocks     | 6         |
+-------------------------+-----------+
8 rows in set (0.00 sec)

Cache misses show up as Qcache_inserts, so the miss rate is Qcache_inserts / (Qcache_hits + Qcache_inserts). In my case, about 39% of cacheable queries resulted in actual database searches; the converse is that roughly 61% of my queries were served up straight from the cache. Almost all of the queries come either from Sguil or from my nightly reports, so this was actually a much better rate than I expected. Notice, though, that nearly all of my 128MB of cache memory was sitting unused. My server's been running for quite some time, so it seemed to me like I was wasting most of it. I had originally set query_cache_size (in /etc/my.cnf) to 128M, so I decided to reduce it to only 28M, reclaiming 100M for other purposes.
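
Incidentally, query_cache_size is a dynamic server variable, so if you'd rather not restart MySQL you can make the change on the fly (this assumes an account with the SUPER privilege); just remember to update /etc/my.cnf as well so the new size survives the next restart:

mysql> set global query_cache_size = 29360128;
Query OK, 0 rows affected (0.00 sec)

That magic number is just 28M expressed in bytes.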

Next I wanted to examine the number of open tables. Sguil uses one table per sensor per day to hold session data, and another five tables per sensor per day to hold alert data. That's a lot of tables! But how many are actually in use?

mysql> show status like 'open%tables';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| Open_tables   | 46    |
| Opened_tables | 0     |
+---------------+-------+
2 rows in set (0.00 sec)

So the good news is that not many tables are open at any given time. I actually ran several searches through the last few weeks of alert and session data, just to see if I could drive this number up. I could, but not significantly. I had set table_cache to 5000, which seemed quite high given this new information, so I lowered it to 200. I'm not sure exactly how much memory (if any) I reclaimed that way, but saner settings are always better.
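
By the way, if you don't remember what you originally set, you can ask the running server directly. Here's what mine reported before the change (the 5000 I mentioned above):

mysql> show variables like 'table_cache';
+---------------+-------+
| Variable_name | Value |
+---------------+-------+
| table_cache   | 5000  |
+---------------+-------+
1 row in set (0.00 sec)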

Finally, the big one: index effectiveness. I wanted to know how well Sguil was using the indices on the various tables.

mysql> show status like '%key_read%';
+-------------------+-----------+
| Variable_name     | Value     |
+-------------------+-----------+
| Key_read_requests | 205595231 |
| Key_reads         | 4703977   |
+-------------------+-----------+
2 rows in set (0.00 sec)

Dividing Key_reads by Key_read_requests, I determined that approximately 2% of all requests for index data resulted in a disk read. That seemed OK to me, but the article suggests a target of only 0.1%. Given that, I figured the best thing I could do was reallocate the 100M I reclaimed earlier as extra key space. I did this by raising the key_buffer setting, which is now at a whopping 484M.
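
For reference, here's roughly what the relevant portion of my /etc/my.cnf looks like with all three changes in place (just a sketch; the rest of the [mysqld] section is omitted):

[mysqld]
# Query cache: shrunk from 128M, since most of it was sitting unused
query_cache_size = 28M

# Table cache: only ~46 open tables doesn't justify the old value of 5000
table_cache = 200

# Key buffer: grew by the 100M reclaimed from the query cache
key_buffer = 484M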

Did all this optimization make any difference? I didn't do any database performance benchmarking before or after, but my perceived response time while using Sguil is a lot faster. Complex searches still take time, of course, but at least now I can go back through four or six weeks of data without having to leave for lunch. It seems to have made a big difference for me, so why not try it out yourself? If you do, post your experiences below and let us know how it went!

Tuesday, June 12, 2007

I'M IN YR RRs, QUERYING YR HOSTS

Probably the thing I like best about Sguil is its development philosophy: most of the features are suggested by working intrusion analysts. Just the other day, we were chatting on IRC (freenode's #snort-gui) about DNS lookups. Specifically, I asked about having the Sguil client modified to give the analyst more control over DNS lookups. I'm always concerned that when I ask the GUI to resolve an IP for me, I might be tipping off an attacker who is monitoring the DNS server for their zone. I mentioned that it'd be cool to be able to look up internal IPs in my own DNS servers but forward all external IPs to some third party (e.g., OpenDNS) in order to mask the true source of the queries.

Based on some of the discussion, I decided it'd be very handy to have a lightweight DNS proxy server that could use a few simple rules to determine which servers to query. I had some time this afternoon, so I hacked one up in Perl. It turns out that the same feature has now made it into the CVS version of Sguil, but I thought that my proxy would still be of use for other purposes. So without further ado, I hereby post it for your downloading pleasure.

You'll need to make a few edits to the script to tailor it to your environment. First, populate the @LOCAL_NETS list with regexps that match your local network. All incoming queries are compared against these expressions; any that match are considered "local" queries, and any that don't match anything in the list are considered "public" queries. Note that reverse lookups require the "in-addr.arpa" address format. Here's a sample:


@LOCAL_NETS = ( "\.168\.192\.in-addr\.arpa",
                "\.10\.in-addr\.arpa",
                "\.mydomain\.com" );

You'll also need to set the $LOCAL_DNS and $PUBLIC_DNS strings to space-delimited lists of DNS servers to use for each type of query, like so:

$LOCAL_DNS  = "10.10.10.2 10.10.10.3";         # My DNS Servers
$PUBLIC_DNS = "208.67.222.222 208.67.220.220"; # OpenDNS Servers

Once you've done this, just run the script (probably in the background) and it will start listening for DNS requests on 127.0.0.1, port 53, both UDP and TCP. Then configure your system's /etc/resolv.conf to use the local proxy and you're all set.
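
The resolv.conf change is just a single nameserver line pointing at the proxy:

nameserver 127.0.0.1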

If you want to open it up a little and provide DNS proxy service for other systems on your network, just change the LocalAddr parameter in the call to Nameserver->new(). Set the IP to your system's actual address, and anyone who can connect to your port 53 will be able to resolve through you. I haven't really done any security testing on that, though, so it probably has some nasty remotely exploitable bugs. You have been warned!
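
In case you'd rather read the idea than download the script, here's a minimal sketch of the proxy's core logic. This is just an illustration, not the script itself, and it assumes the Net::DNS distribution (Net::DNS::Nameserver and Net::DNS::Resolver) is installed; the patterns and server addresses are the same examples used above.

#!/usr/bin/perl
# Sketch of a rule-based DNS proxy. Assumes the Net::DNS distribution.
use strict;
use warnings;
use Net::DNS::Nameserver;
use Net::DNS::Resolver;

# Queries matching any of these patterns are "local"; everything else
# is "public".
my @LOCAL_NETS = ( qr/\.168\.192\.in-addr\.arpa$/i,
                   qr/\.10\.in-addr\.arpa$/i,
                   qr/\.mydomain\.com$/i );

my $local_res  = Net::DNS::Resolver->new(
    nameservers => [qw(10.10.10.2 10.10.10.3)] );         # My DNS servers
my $public_res = Net::DNS::Resolver->new(
    nameservers => [qw(208.67.222.222 208.67.220.220)] ); # OpenDNS servers

sub reply_handler {
    my ($qname, $qclass, $qtype, $peerhost) = @_;

    # Pick a resolver based on whether the query name looks "local"
    my $res = (grep { $qname =~ $_ } @LOCAL_NETS) ? $local_res : $public_res;

    # Forward the query and relay whatever comes back
    my $answer = $res->send($qname, $qtype, $qclass)
        or return ('SERVFAIL', [], [], []);

    return ($answer->header->rcode,
            [$answer->answer],
            [$answer->authority],
            [$answer->additional]);
}

# Binding port 53 requires root; change LocalAddr to a real interface
# address if you want to serve other hosts on your network.
my $ns = Net::DNS::Nameserver->new(
    LocalAddr    => '127.0.0.1',
    LocalPort    => 53,
    ReplyHandler => \&reply_handler,
    Verbose      => 0,
) or die "couldn't create nameserver object\n";

$ns->main_loop;

The nice part is that Net::DNS::Nameserver handles the UDP and TCP listeners for you, so all of the actual proxy logic lives in the reply handler.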

Anyway, I hope you find it useful. It only does simple forward and reverse lookups (A and PTR records), so it will fail if you ask for any other record types. If those are really necessary, though, let me know and I'll consider adding them. If you decide to use this at your site, please drop me a comment to let me know (if you can).