I had the chance to see a very interesting presentation last week about DNS tunnels. I understand the concepts well enough, but I had never run into one "in the wild." I realized, however, that I probably wouldn't have noticed one in the first place, so I decided to try to do a little digging to see if I could come up with a detection mechanism.
Fortunately, I'm using Sguil to collect all my Network Security Monitoring (NSM) data. As you may know, Sguil collects (at least) three types of data:
- Snort IDS alerts
- Network session data (from SANCP)
- Full content packet captures (from a 2nd instance of Snort)
The packet captures are not typically used for alerting purposes, which left me with two options: an IDS signature-based approach, or traffic analysis of the network sessions.
Some DNS tunnel detection signatures already exist (the Sourcefire community ruleset already has signatures to detect the NSTX tunnel software), but I think this approach is doomed to failure from the start. A signature-based approach would be good for detecting specific instances of DNS tunnelling software, but for real protection, a more general detection capability is required. I chose to try the traffic analysis approach instead.
I started with the assumption that there were basically three uses for DNS tunnels, to wit:
- Data exfiltration (sending data out of the LAN)
- Data infiltration (downloads to the local LAN)
- Two-way interactive traffic
Of course, the fourth possibility is that a tunnel could be used for some combination of the above, but I left that out of the scope for now. I also ignored the possibility of two-way traffic for now, since it's harder and I'm just starting to look into this. My main goal was to identify data exfiltration by looking at "lopsided" transfers. By extension, the same technique could also be used to discover data infiltration, as I'll show.
Once I identified the types of tunnels I wanted to detect, I tried to deduce what their traffic profiles would look like. Specifically, I assumed the following:
- At least one side of the session would be associated with port 53.
- The tunnel could use either TCP or UDP (though UDP is the most likely), but since SANCP is able to construct pseudo-sessions from sessionless UDP traffic, I could effectively ignore the difference between TCP and UDP sessions.
- The upload/download ratio would be lopsided. That is, the tunnel client would either be sending more data than it receives (exfiltration) or vice versa (infiltration).
If you're not familiar with Sguil, just know that all the network session data is stored in a MySQL database. Although you normally access this data through the Sguil GUI, it's also easy to connect with the normal MySQL client application and send arbitrary SQL queries. I started with this one:
select start_time, end_time, INET_NTOA(src_ip), src_port,
       INET_NTOA(dst_ip), dst_port, ip_proto,
       (src_bytes / dst_bytes) as ratio
from sancp
where dst_port = 53
  and start_time between DATE_SUB(CURDATE(), INTERVAL 7 DAY) and CURDATE()
having ratio >= 2
order by ratio DESC
LIMIT 25;
The intent of this query is to identify at most 25 DNS sessions during the last week where the initiator (the client) sends at least twice as much data as it receives in response. Sounds good, but it is actually a bit naive. It turns out that many small transactions fit this profile, especially if the DNS server doesn't respond (the response size is 0 bytes).
I fixed this problem by also looking for a minimum number of bytes transferred (in either direction). After all, the purpose of a DNS tunnel is to transfer data, so if there are only a few bytes, the odds are very, very good that it's not a tunnel. In this version, I've set up the query also to check that the total number of source and destination bytes is more than 50,000:
select start_time, end_time, INET_NTOA(src_ip), src_port,
       INET_NTOA(dst_ip), dst_port, ip_proto,
       (src_bytes + dst_bytes) as total,
       (src_bytes / dst_bytes) as ratio
from sancp
where dst_port = 53
  and start_time between DATE_SUB(CURDATE(), INTERVAL 7 DAY) and CURDATE()
having ratio >= 2 and total > 50000
order by total DESC, ratio DESC
LIMIT 25;
This brings the number of false positives down quite a bit. If this still isn't good enough, you can tune both the ratio and the total bytes to give you whatever level of sensitivity you feel comfortable with. Increasing either number should decrease false positives, but at the expense of creating more false negatives (i.e., you might miss some tunnels).
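To make the tuning trade-off concrete, here's a minimal Python sketch of the same filter applied to in-memory rows instead of the database. The `Session` tuple, the `find_lopsided` function, and the sample numbers are all made up for illustration; sessions with no reply bytes are simply skipped here to avoid dividing by zero, which is a choice of this sketch.

```python
from typing import NamedTuple


class Session(NamedTuple):
    """Hypothetical stand-in for one row of the sancp table."""
    src_ip: str
    dst_port: int
    src_bytes: int
    dst_bytes: int


def find_lopsided(sessions, min_ratio=2.0, min_total=50000):
    """Return (session, total, ratio) tuples that pass both thresholds."""
    hits = []
    for s in sessions:
        if s.dst_port != 53 or s.dst_bytes == 0:
            # Not DNS, or no reply at all: skipped in this sketch.
            continue
        total = s.src_bytes + s.dst_bytes
        ratio = s.src_bytes / s.dst_bytes
        if ratio >= min_ratio and total > min_total:
            hits.append((s, total, ratio))
    # Largest transfers first, then most lopsided, as in the query.
    hits.sort(key=lambda h: (h[1], h[2]), reverse=True)
    return hits


# Made-up sample data, for illustration only.
sample = [
    Session("10.0.0.5", 53, 120, 40),         # lopsided but tiny
    Session("10.0.0.7", 53, 90000, 30000),    # ratio 3, total 120000
    Session("10.0.0.9", 53, 400000, 50000),   # ratio 8, total 450000
    Session("10.0.0.11", 53, 30000, 30000),   # balanced
]

loose = find_lopsided(sample)
strict = find_lopsided(sample, min_ratio=5.0, min_total=200000)
print(len(loose), len(strict))  # 2 1
```

With the default thresholds two of the four sample sessions are flagged; raising both thresholds leaves only the largest, most lopsided one, which is exactly the false positive versus false negative trade-off described above.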
By the way, I mentioned before that you could use the same approach to detect either exfiltration or infiltration. The queries above are written for exfiltration, so if you would like to look for infiltration, simply change the definition of the "ratio" parameter from this:
(src_bytes / dst_bytes) as ratio
to this:
(dst_bytes / src_bytes) as ratio
That'll look for data going in the other direction. The rest of the query is identical for either case.
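As a small illustration of that swap, here's a Python sketch that computes the ratio for either direction. The `lopsided_ratio` function and its `direction` argument are hypothetical names, and returning `None` for a zero denominator is just a convenience of the sketch.

```python
def lopsided_ratio(src_bytes, dst_bytes, direction="exfiltration"):
    """Ratio of bytes in the suspicious direction to bytes in the other.

    Returns None when the denominator is zero (a choice made for this sketch).
    """
    if direction == "exfiltration":
        numerator, denominator = src_bytes, dst_bytes
    elif direction == "infiltration":
        numerator, denominator = dst_bytes, src_bytes
    else:
        raise ValueError("direction must be 'exfiltration' or 'infiltration'")
    if denominator == 0:
        return None
    return numerator / denominator


# Made-up byte counts: the client sent 20,000 bytes and received 80,000.
print(lopsided_ratio(20000, 80000, "exfiltration"))  # 0.25
print(lopsided_ratio(20000, 80000, "infiltration"))  # 4.0
```

The same session that looks harmless as an upload (0.25) clears the ratio >= 2 threshold once it is measured as a download (4.0).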
As I mentioned, I've never encountered a DNS tunnel outside of a lab environment (yet), so I can't say how well my approach holds up in the real world. If anyone with more experience would like to comment, I'm all ears. Similarly, if anyone decides to try out my method, I'd love to hear how it works out for you.
Update 2006-05-04 14:33: My fellow #snort-gui hanger-on and all-around smart guy tranzorp has posted additional analysis of my ideas on his blog. Specifically, he points to two easy ways to evade this simple type of analysis, and I recommend that you read his post for yourself, because he's got a lot of experience with this topic.
If you just want a quick 'bang-it-out-in-an-emergency' check for tunnels, the following updated query isn't bad. It's the same as above, but corrects for the fact that the DNS reply contains a complete copy of the DNS query, as tranzorp pointed out:
select start_time, end_time, INET_NTOA(src_ip), src_port,
       INET_NTOA(dst_ip), dst_port, ip_proto,
       (src_bytes + (dst_bytes - src_bytes)) as total,
       (src_bytes / (dst_bytes - src_bytes)) as ratio
from sancp
where dst_port = 53
  and start_time between DATE_SUB(CURDATE(), INTERVAL 7 DAY) and CURDATE()
having ratio >= 2 and total > 50000
order by total DESC, ratio DESC
LIMIT 25;
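To see what the correction does to the numbers, here's a short Python sketch with made-up byte counts. The `corrected_ratio` function is a hypothetical helper, and returning `None` when the reply is no larger than the query is a choice of this sketch, not something the query above does.

```python
def corrected_ratio(src_bytes, dst_bytes):
    """Ratio after subtracting the echoed query from the reply size.

    Returns None when the reply is no larger than the query, since the
    corrected denominator would be zero or negative (a choice made here).
    """
    new_reply_bytes = dst_bytes - src_bytes
    if new_reply_bytes <= 0:
        return None
    return src_bytes / new_reply_bytes


# Made-up session: 60,000 bytes of queries, 75,000 bytes of replies.
naive = 60000 / 75000                      # 0.8, looks unremarkable
corrected = corrected_ratio(60000, 75000)  # 60000 / 15000 = 4.0
print(naive, corrected)  # 0.8 4.0
```

In this example the uncorrected ratio is below 1 and would never be flagged, but once the echoed query is subtracted from the reply, the session turns out to be sending four times as much as it gets back.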
Based on feedback I've received from tranzorp and others, I'm looking at a more statistical approach to the problem. In short, I'm experimenting with computing a pseudo-bandwidth for each session (bytes sent / session duration), then looking for abnormally high or low bandwidth sessions. I've got a prototype, but it's in Perl so it's horribly inefficient for the mountain of DNS session data I collect. I'll post something about it when I have had a chance to tweak it some more.
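For the curious, here's a rough Python sketch of the pseudo-bandwidth idea. It is not the Perl prototype: the `bandwidth_outliers` function, the z-score test, the threshold of 2.0, and the sample sessions are all assumptions made for illustration, and a z-score is only one of many ways to define "abnormal."

```python
import statistics


def bandwidth_outliers(sessions, z_threshold=2.0):
    """Flag sessions whose bytes-per-second rate is far from the mean.

    sessions: iterable of (label, bytes_sent, duration_seconds) tuples.
    Zero-length sessions are skipped because their rate is undefined.
    """
    rates = [(label, sent / duration)
             for label, sent, duration in sessions if duration > 0]
    values = [rate for _, rate in rates]
    if len(values) < 2:
        return []
    mean = statistics.mean(values)
    spread = statistics.pstdev(values)
    if spread == 0:
        return []
    # abs() catches both abnormally high and abnormally low rates.
    return [(label, rate) for label, rate in rates
            if abs(rate - mean) / spread >= z_threshold]


# Made-up sessions: seven ordinary ones and one heavy sender.
sample = [
    ("a", 500, 10.0), ("b", 550, 10.0), ("c", 600, 10.0), ("d", 450, 10.0),
    ("e", 520, 10.0), ("f", 580, 10.0), ("g", 480, 10.0), ("h", 50000, 10.0),
]
print(bandwidth_outliers(sample))  # [('h', 5000.0)]
```

One caveat with this particular test: a single huge outlier inflates the standard deviation, so on real data something more robust (a median-based measure, for instance) might work better.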