New Weapon in War on Botnets
by George Lawton
Researchers at the University of Illinois at Urbana-Champaign (UIUC) have developed a new technique for tracking stealthy botnets that use peer-to-peer (P2P) technology. BotGrep is an inference algorithm that uses graph analysis to detect botnets that hide from other security tools.
Some botnet implementations, such as Conficker and Storm, are fairly easy to identify because they generate a lot of spam or denial-of-service (DoS) traffic. But others like Zeus, a popular malware family often used in stealing banking data, preserve their stealth by sending far less information. "As long as the botnet uses P2P communications," said Nikita Borisov, a UIUC associate professor and coauthor of the BotGrep tool, "we are able to identify its existence even though it does not have the loud activities."
A botnet is "an army of compromised hosts under a common command and control," according to a landmark presentation to the North American Network Operators Group (http://aharp.ittns.northwestern.edu/talks/botnets.pdf). The first botnets used an HTTP server or the Internet Relay Chat (IRC) system for command and control (C&C). However, as security experts developed better techniques for disrupting these communication channels, hackers began using P2P networks to distribute cryptographically protected commands.
"Pinning down P2P botnets communication is hard for the industry right now as there is no central point of control to track," said Ivan Macalintal, manager of advanced threat research at security firm Trend Micro. "BotGrep is going to be a game changer when it comes to analyzing P2P botnets."
BotGrep can identify all the hosts in a P2P network by analyzing Internet traffic logs to find the patterns of communication between infected hosts. Other approaches have targeted control servers or analyzed the P2P traffic content. BotGrep merely looks at whether any given set of nodes communicate with each other.
Grep Meets Bot
BotGrep works on the IP transit log files of large ISPs. The tool's name is based on the old Unix grep utility that performs a global search for regular patterns in text files. "BotGrep also relies on global patterns of regular activity," said Borisov, "so the name fit."
These regular patterns are translated into a communications graph that maps all the hosts' packet exchanges. Stealthy botnets using P2P networks form more complex interconnection patterns than other types of communications. Using BotGrep to analyze the sum total of these patterns on real test data, the UIUC researchers were able to localize between 93 and 99 percent of P2P-connected hosts with a false-positive rate of less than 0.6 percent.
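To make the two reported figures concrete, here is a minimal Python sketch of how a detection rate and a false-positive rate could be computed from a labeled test set. The article does not say exactly how the researchers defined either number, so the definitions below (bots found divided by all bots, and benign hosts flagged divided by all benign hosts) are assumptions, as are the function name and the toy host labels.

```python
def detection_metrics(flagged, true_bots, all_hosts):
    """Return (detection_rate, false_positive_rate) for one test run.

    Assumed definitions (not taken from the article):
      detection_rate      = flagged bots / all bots
      false_positive_rate = flagged benign hosts / all benign hosts
    """
    flagged, true_bots, all_hosts = set(flagged), set(true_bots), set(all_hosts)
    benign = all_hosts - true_bots
    if not true_bots or not benign:
        raise ValueError("need at least one bot and one benign host")
    detection_rate = len(flagged & true_bots) / len(true_bots)
    false_positive_rate = len(flagged - true_bots) / len(benign)
    return detection_rate, false_positive_rate


# Toy data: 1,000 hosts, the first 100 of which are bots.
all_hosts = [f"h{i}" for i in range(1000)]
true_bots = all_hosts[:100]
# The hypothetical detector finds 95 bots and wrongly flags 4 benign hosts.
flagged = all_hosts[:95] + all_hosts[100:104]

detection_rate, false_positive_rate = detection_metrics(flagged, true_bots, all_hosts)
print(f"detection rate: {detection_rate:.1%}")
print(f"false-positive rate: {false_positive_rate:.2%}")
```

In this toy run, 95 of 100 bots are found and 4 of 900 benign hosts are wrongly flagged.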
Other researchers have applied graph analysis to botnet and P2P detection, but their techniques depended on the communication contents, port number, packet size, or interarrival delays. The botnet could evade detection by simply encrypting traffic and randomly adjusting packet sizes and port numbers. In contrast, BotGrep simply looks at where packets originate and terminate without analyzing the actual traffic. This makes it more robust to changes in botnet design.
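The idea of looking only at who talks to whom can be illustrated with a short Python sketch. This is not the BotGrep implementation: the record layout (source, destination, port, size) and the function name are hypothetical, and the code shows only the graph-building step, not the inference that runs on top of it. The point is that the port and size fields are never read, so changing them does not alter the resulting graph.

```python
from collections import defaultdict


def build_comm_graph(flows):
    """Build an undirected host graph from flow records.

    Each record is assumed to start with (src_ip, dst_ip); any remaining
    fields such as port or packet size are deliberately ignored.
    """
    graph = defaultdict(set)
    for record in flows:
        src, dst = record[0], record[1]
        if src == dst:
            continue  # skip self-loops
        graph[src].add(dst)
        graph[dst].add(src)
    return dict(graph)


# Hypothetical flow records: (src_ip, dst_ip, port, size_in_bytes)
flows = [
    ("10.0.0.1", "10.0.0.2", 443, 1500),
    ("10.0.0.2", "10.0.0.1", 51234, 60),
    ("10.0.0.2", "10.0.0.3", 6881, 900),
]
graph = build_comm_graph(flows)

# Same endpoints with different ports and sizes give the same graph.
disguised = [(s, d, 9999, 1) for (s, d, _, _) in flows]
same_graph = build_comm_graph(disguised) == graph
print(graph)
print(same_graph)
```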
Working with several tier-1 ISPs, the researchers obtained the log data from a few hundred thousand hosts. They used this raw data to generate synthetic data for 30 million hosts for more comprehensive testing. The basic algorithm was able to quickly analyze the entire 30-million-host data set on a high-end PC. One test caught a 1,000-node P2P network.
The UIUC team is also working on a privacy-preserving version of the algorithm that would let them collect data from multiple ISPs in a way that protects individual consumers' data. However, these algorithms take about a thousand times more computational power to achieve the same result.
BotGrep can only determine a host's IP address. This can confound the botnet analysis because many hosts can share one IP address and a mobile host can connect from multiple IP addresses. This also makes it harder to remediate the problem because an ISP can’t be sure which machine is infected. Organizations with a dedicated network management team can better identify infected machines.
The Race Goes On
BotGrep can integrate with response tools, such as blacklists, to mitigate the botnet's impact once it's discovered. It complements, rather than replaces, current detection and remediation tools.
Borisov said he believes large ISPs will eventually be able to incorporate BotGrep into a suite of security services for their customers, but he and his colleagues don't have specific commercialization plans at this time. Accurately determining whether a P2P network is an innocuous application or a malicious botnet remains an open problem. Because BotGrep algorithms can't distinguish an authorized P2P network from a stealthy botnet, other botnet tracking tools are needed to make this distinction.
"A false-positive error would be all-or-nothing: the whole P2P network, with thousands or even millions of hosts, would be flagged as malicious," Borisov explained. "Obviously, this is something you would want to prevent."
Other tools to identify afflicted hosts could include intrusion-detection systems that look for anomalous behavior or patterns of misuse. Honeynets use unsecured virtual machines as bait to attract botnet infections. Honeynet computers don't run legitimate P2P code, so any P2P activity from them is likely to be part of a botnet.
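One way to picture the honeynet cross-check is the following Python sketch. It assumes that some earlier step has already grouped hosts into candidate P2P networks, and it simply marks any group containing a known honeynet address as suspicious. The function name, the data layout, and the addresses are all hypothetical; the article does not describe how such a check would be wired to BotGrep.

```python
def classify_clusters(clusters, honeynet_hosts):
    """Mark candidate P2P clusters that include at least one honeynet host.

    Honeynet machines run no legitimate P2P software, so their presence in
    a P2P cluster is treated here as a sign that the cluster is a botnet.
    """
    honeynet_hosts = set(honeynet_hosts)
    results = []
    for cluster in clusters:
        members = set(cluster)
        hits = members & honeynet_hosts
        results.append({
            "size": len(members),
            "honeynet_hits": sorted(hits),
            "suspicious": bool(hits),
        })
    return results


# Hypothetical clusters produced by an earlier P2P-detection step.
clusters = [
    {"10.0.0.1", "10.0.0.2", "10.0.0.3"},       # e.g. a file-sharing swarm
    {"10.0.1.7", "10.0.1.8", "192.168.50.10"},  # includes a honeynet VM
]
honeynet_hosts = {"192.168.50.10", "192.168.50.11"}

results = classify_clusters(clusters, honeynet_hosts)
for result in results:
    print(result)
```

A cluster without honeynet hits is not proven benign; it only means this particular check found no evidence against it.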
In the long run, however, malware developers are likely to find counter techniques for evading detection and control. "For every detection technique, there can be approaches that the botnet authors can use to counteract the detection technique," Borisov noted. "There is a constant arms race. Trying to see one or two steps ahead of the botnet authors is a challenge."
George Lawton is a freelance journalist currently based in Guerneville, CA. You can reach him via his website, http://glawton.com.