Thursday, June 19, 2008

Integrating domain reputation search into Firefox 3

This happens to me every day. I find a domain name somewhere, usually through my NSM work, and I wonder, "Is this domain known to be malicious?" Now, I don't personally know every domain on the Internet, but I've had some success using McAffee's SiteAdvisor. You feed it a domain name, and it'll tell you not only if it thinks it's suspicious, but also whether or not it offers any sort of downloads, what other sites it's most closely associated with, and what it's users have to say about it (if anything).

Pretty good stuff, but I'm so lazy. Opening a new tab and typing in the SiteAdvisor URL is just sooo hard! So I decided to add it to my list of search plugins, so I can use the integrated search bar instead. Here's how to do it.

  1. Find your searchplugins directory. For a typical Unix system, this is ~/.mozilla/firefox/XXXXXXXX.default/searchplugins (where the XXXXXXXX is a random string)
  2. Create a file in this directory called siteadvisor.xml with the contents below.
  3. Restart Firefox.


There you go! Three simple steps, and now "Siteadvisor" should be listed when you drop down the search menu.

<SearchPlugin xmlns="http://www.mozilla.org/2006/browser/search/" 
xmlns:os="http://a9.com/-/spec/opensearch/1.1/">
<os:ShortName>Siteadvisor</os:ShortName>
<os:Description>Search McAffee Siteadvisor</os:Description>
<os:InputEncoding>UTF-8</os:InputEncoding>
<os:Url type="text/html"
method="GET" template="http://siteadvisor.com/lookup/?q={searchTerms}">
</os:Url>
</SearchPlugin>


Now, the question of the day: What other sites do you use to easily check a domain's reputation? Leave a comment and let us know!

Monday, June 16, 2008

OSSEC Project Acquired

Congratulations to Daniel Cid, who's OSSEC project has just been acquired. Now that Daniel will be working on the project full time, I think we can look forward to some great things!

Wednesday, June 11, 2008

Unintentional hilarity

I subscribe to the Info Security News RSS feed, which is a pretty nice way to keep up with various goings on in the industry.

This morning, the top headline was:

Unencrypted AT&T laptop stolen, details of managers pay lost


I have to admit, I don't really feel too bad about the poor AT&T managers. However, the really funny part was the very next headline:

AT&T Launches Encryption Services to Help Businesses Secure E-Mail and Data


I can't make this stuff up, folks!

Monday, June 09, 2008

Tor server lists revisited

Way back in 2006, I posted about a way to list active Tor servers by querying the Tor directory. Since then, the Tor project has updated it's directory protocol, so that old method no longer works. Since I had someone ask me about it today, I thought this would be a great time to go ahead and update that post.

The principle is still basically the same:

  1. Identify an authoritative Tor server
  2. Connect to it via HTTP and ask for the router list
  3. Parse the list to get the info you want.

Here's an updated script you can use to dump the information about active routers. The output contains 5 columns, separated by pipe characters ('|'). The columns are :
server name|IP address|onion routing port| \
directory services port|last update timestamp

Now, the first two fields are fairly self-explanatory. The onion routing port (sometimes referred to as the OR port) carries the actual data in a Tor session. The directory services port carries directory traffic (the sort of thing this script does). Not all Tor routers offer directory services, so you will often see a 0 in this column. Finally, the last column simply shows the time the router last updated it's status in the directory.

Here's the script:
#!/usr/bin/perl
#
# Fetch the list of known Tor servers (from an existing Tor server) and
# display some of the basic info for each router.

use LWP::Simple;

# Hostname of an existing Tor router. We use one of the directory authorities
# since that's pretty much what they're for.
$INITIAL_TOR_SERVER = "128.31.0.34"; # peacetime/moria1/moria2
$DIR_PORT = 9031;

# Fetch the list of servers
$content = get("http://$INITIAL_TOR_SERVER:$DIR_PORT/tor/status/all");
@lines = split /\n/,$content;

foreach $router (@lines) {
if($router =~ m/^r\s+(\S+)\s+(\S+)\s+(\S+)\s+(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\s+(\S+)\s+(\d+)\s+(\d+)$/) {
($name, $address, $or_port, $directory_port, $update_time) =
($1, $5, $6, $7, $4);
print "$name | $address | $or_port | $directory_port | $update_time\n";
}
}


Of course, there is much more information in the directory than this script shows. As a NSM analyist, I'm more concerned with IPs and port numbers, but if you poke around, you can also find what OS and Tor software versions are running, what capabilities the routers offer, their default exit policies, and other cool stuff. This is all left as an exercise for the reader. If you're interested, read the spec.

Friday, May 16, 2008

Alternative PCAP subsystems for Sguil

If you read my previous post on pcap indexing, you'll know that I've been playing around with some alternatives to the packet capture and retrieval subsystem in Sguil. I'm happy to announce that I've just committed two replacement subsystems to Sguil's CVS HEAD, one for daemonlogger and one for SANCP.

The daemonlogger subsystem should be fairly stable, as I've been running it in production for some time. It's basically a direct replacement for the snort packet logging instance. It's probably a bit more efficient, and has a smaller memory footprint, but it's still substantially similar.

The SANCP system, on the other hand, is very experimental. It uses the pcap indexing functions of SANCP 1.6.2C6 (and above) to dramatically speed up the retrieval of pcap data from huge captures. If your capture files are routinely over 2GB or 3GB, you might benefit from this. However, it does come at a cost, which is that the index files can consume 25% - 35% more disk space than the pcaps alone. Break out the RAID!

Of course, these are simply alternatives to the existing Snort-based packet logging system. That's not going anyway, we're simply offering choices for advanced users.

Also, even though I've been a member of the Sguil project for some time now, these are my first commits into the source tree. I'm officially a Sguil developer!

Thursday, April 03, 2008

PCAP Indexing

There's been some talk inside the Sguil project lately of improving the performance of PCAP data retrieval. That is, when you're trying to fetch the PCAP data for a single session out of a file that contains all the data your sensor saw during that time period, things can get pretty slow.

Our current method involves simply identifying which file probably (note the strategic use of that word) contains the data you're looking for, then using tcpdump with a BPF filter to find it. This usually works well, but it's often very slow, especially if the PCAP file you're searching through is large, say a few GB.

We've discussed a few approaches we could take to improve the performance of these retrievals. One promising way involves creating an index of the sessions inside each PCAP file. It turns out that the tool we're using to collect network session data, SANCP is actually a pretty decent packet logger, even though we're not using it in that manner. The newest 1.6.2 release candidate includes support for PCAP indexing, so I thought I'd take it for a spin and see just what kind of performance improvement we could expect, if any.

My good friend Geek00L recently blogged his experience with SANCP's indexing feature, but he didn't get into the performance, he just wrote about how it worked. You probably should read that, as well as the official docs, if you're interested in this subject.

Another thing I was interested in was figuring out how to use SANCP to index existing PCAP files, as Sguil is already capturing these with Snort. By default, SANCP will only index PCAP files that it creates, usually by sniffing the network interface. I prefer to keep my data collection processes separate if I can, and the existing PCAP collection is working well enough for now. So my first goal was to see if I could convince SANCP to create an index without creating an entirely new PCAP file.

It turns out that this is possible, though kinda kludgey. I was able to use the following ''sancp-pcapindex.conf'' file to create an index:


default index log
default pcap filename /dev/null
format index delimiter=| sancp_id,output_filename,\
start_pos,stop_pos,src_ip_dotted,\
dst_ip_dotted,ip_proto,src_port,dst_port

This is pretty close to the version listed in SANCP's docs, except that I added the ''default pcap filename /dev/null'' line in there. So SANCP will still create a PCAP file, but it'll be written to /dev/null so I'll never see it.

I also had to use two additional command-line options to turn off the "realtime" and "stats" output SANCP likes to generate by default. so in the final analysis, here's the command line I ended up using:

sancp -r snort.log.12347263 -c sancp-pcapindex.conf -d sancp-output -R -S


Ok, so on to the actual indexing tests! I was curious about several things:

  1. How long does it take to create an index?
  2. How large is the index, compared to the size of the PCAP file itself?
  3. Is there a performance increase by using the index, and if so, how much?
  4. Does extracting PCAP by index return different data than extracting it by tcpdump and a BPF?

I decided that I would choose two PCAP files of different sizes for my tests. One file was 2.9GB, and the other was 9.5GB. For each file, I tested index creation speed and retrieval speed using tcpdump compared to retrieval speed using the index and SANCP's ''getpcapfromsancpindex.pl'' tool. Each of these tests was conducted three times, and the results averaged to form the final result. In addition, I examined the size of the index file once (the last time), under the assumption that a properly-created index would be the same each time it was generated.























PCAP size Index size Index Creation Tcpdump extraction Indexed extraction
2.9GB 446M 7m58s 39s 5s
9.5GB 1300MB 23m20s 2m21s 11s

The larger file is roughly (very roughly) three times the size of the smaller file. As this data suggests, the index size and the index creation time are linear, as the larger of each value is about 3x the size of the smaller. The same is true of the time necessary to extract the data with Tcpdump, though not to quite the same extent (it's just a bit over 3x).

However, the interesting part is that the time required to extract the data using the indices is not linear. It only took a little over 2x as long, though to be honest, it could easily be a matter of the amount of data that was contained in the individual network session or something. Still, using the index was about 87% faster in the small file, and about 92% faster with the larger file.

I think it's pretty clear that these indices speed up PCAP retrieval substantially. I think the drawbacks are that the index files are fairly large, and that they take a long time to generate.

As for the index file size, the indices look like they are about 13% - 15% of the size of the original file. For the drastic performance improvement they provide, this could be worth it. What's one more drive in the RAID array? Also, it's possible that they could be compressed, with maybe only a relatively small impact on retrieval speed. I'll have to try that out.

Index generation time is potentially more serious. Obviously, it'd be nicer to generate the index at the same time the PCAP is originally written, but as I'm unwilling to do that (for the moment, at least), I think the obvious speed-up would be to somehow allow SANCP to generate the indices without trying to write a new PCAP file. Even when I've directed it to /dev/null, there has to be some performance overhead here, and any time spent writing PCAP we're throwing away is just time wasted. This would be my first choice for future work: make a good, quick index for an existing PCAP file.

All in all, I'm impressed with the retrieval speed of SANCP's indexed PCAP. Now, if we can get the index creation issue sorted out, this could be a really great addition to Sguil!

Update 2008-04-03 20:15: I was in such a rush to complete this post, that I accidentally forgot to answer my question #4! It turns out that the data returned by both extraction methods is exactly the same. Even the MD5 checksums of the PCAP files match. Great!

Also, note that I edited the above to add details on the command line I used to generate the indices. Another stupid rush mistake on my part. Sorry!

Wednesday, April 02, 2008

Automating malware analysis with Truman

Let me start by saying, I'm no malware analyst. I've done a little reversing with IDA Pro, but really only in class. However, during an incident investigation, I frequently come across an unknown Windows binary that is likely to be some sort of malware, and would really, really like to know what it does, and stat! I can do a few basic tests on the binary myself, like examining the strings for clues as to it's purpose, or maybe unpacking it if it's using some standard packer. For the most part, though, modern malware is too hard to get a good handle on quickly unless you're aces with a debugger.

A couple of years ago, though, Joe Stewart from LURQH (now SecureWorks) released an analysis framework just for folks like me. The Reusable Unknown Malware Analysis Net (TRUMAN) is a sandnet environment that allows the malware to run on a real Windows system attached to a protected network with a bunch of fake services, then collects data about the program's network traffic, files and registry entries created, and even captures RAM dumps of the infected system. The great thing about TRUMAN is that it not only makes it easy to collect this information, it automates most of the process of creating a secure baseline, analyzing changes against that, and restoring the baseline to the Windows system when it's all over.

The terrible thing about Truman though, is that it is quite a complex system, with a lot of moving parts. Which would be OK, except that it comes with almost no documentation. Seriously. None. There's an INSTALL file that gives a brief overview, but leaves out most of the important steps. Frankly, except for one Shmoocon presentation video, there's nothing on the Internet that really tells you what Truman is, how it works or how to go about installing and using it.

Until now!

I just added a TRUMAN page to the NSMWiki. This contains a lot of information, much more than would comfortably fit into a blog post. Most importantly, it contains a detailed step-by-step process for getting TRUMAN up and running with RHEL5 and Windows XP.

Using Truman, I can collect a substantial amount of information about how a suspicious binary acts when it runs, and do it in a matter of 20 or 30 minutes, rather than hours. Admittedly, it's not foolproof, but it should come in extremely handy next time I run across an unknown Trojan.