Authors: Lukáš Ďurfina, Jakub Křoustek, Peter Matula, Petr Zemek
At the end of 2013, a new worm that targets small Internet-enabled devices was discovered. The worm, called Linux.Darlloz, is capable of infecting a wide range of “Internet-of-things” devices, like routers, security cameras, and entertainment systems that are increasingly equipped with an Internet connection.
It is argued that we may be at the beginning of an era in which our home appliances may actually get more attention from malicious attackers than our personal computers. The reason might be that the software that these devices are running is becoming out of date, which makes them much easier to exploit than up-to-date versions of operating systems that run on our personal computers. After all, who would think that the software their TV runs needs to be periodically updated in order to make it less vulnerable to potential attackers?
As we will shortly see, the Linux.Darlloz worm is also interesting from another point of view. When it is executed on an infected device, it first checks whether another malicious worm, Linux.Aidra, is running on that device, and if so, it removes the competing worm from the device. This kind of war between malicious-software writers is not something that we see very often, and it is assumed that in the future, we may see more of these fights over the control of “Internet-of-things” devices.
In this article, we analyze both worms by using our retargetable decompiler, a tool that automatically reverse-engineers a binary program into high-level code. In contrast to a disassembler, our decompiler has three major advantages:
- Platform independence. The decompiler supports reconstruction of binary programs in several file formats and architectures. Since we have samples of the worms for several architectures (MIPS, ARM, PowerPC, and Intel x86), we are able to analyze all of them to see whether their behavior differs.
- High-level structure of the generated code. The decompiler can produce code in the C language as well as in a Python-like language. The code uses structured constructs, such as functions, if/else-if/else statements, and loops, which makes it more readable than the assembly code produced by a disassembler.
- Automatic analyses and reconstructions. The decompiler utilizes many automatic analyses. In this way, we may reconstruct types, obtain string literals and place them into the code, find functions from standard libraries and more. Again, this makes the code more readable.
You can try decompilation yourself by using our web demo, but bear in mind that the capabilities of the web demo version are limited.
Overview of the Samples and Initial Analysis
We have analyzed over twenty samples of the Linux.Darlloz and Linux.Aidra worms. All the samples were in the Linux ELF file format, but they were built for different architectures. More specifically, we had samples of Linux.Darlloz for the MIPS, ARM, PowerPC, and Intel x86 architectures, and samples of Linux.Aidra for MIPS, ARM, and PowerPC. We have not seen a version of this worm for Intel x86. Moreover, we had samples of Linux.Aidra for the SuperH architecture, but we did not analyze them because the decompiler does not support this architecture at the moment.
Next, we list the MD5 sums of all the samples we have analyzed.
Some of the Linux.Aidra samples were compiled with debugging information, which the decompiler utilized to give functions and variables more meaningful names and types. Moreover, almost none of them were stripped, so function symbols were also available. On the other hand, all the samples of Linux.Darlloz were stripped, so no symbols or debugging information were available. We were only able to detect that the Linux.Darlloz binaries were built with GCC 4.1.2.
Finally, none of the binaries were packed by a packer or protector.
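The architecture of an ELF sample can be read directly from its header, which is a convenient first step when sorting samples like ours. The following is a minimal sketch of that check, not the decompiler's own detection code. The function name elf_arch_name() is ours, and the numeric constants are the e_machine values from the ELF specification.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* e_machine values from the ELF specification. */
enum { EM_386 = 3, EM_MIPS = 8, EM_PPC = 20, EM_ARM = 40, EM_SH = 42 };

/* Returns a short architecture name for an ELF header, or NULL if the
 * buffer is too short or does not start with the ELF magic number.
 * (Hypothetical helper, written for illustration only.) */
const char *elf_arch_name(const unsigned char *hdr, size_t len)
{
    uint16_t machine;

    if (len < 20 || memcmp(hdr, "\x7f" "ELF", 4) != 0)
        return NULL;

    /* e_ident[5] (EI_DATA): 1 = little-endian, 2 = big-endian.
     * e_machine is a 16-bit field at offset 18, stored in that order. */
    if (hdr[5] == 1)
        machine = (uint16_t)(hdr[18] | (hdr[19] << 8));
    else if (hdr[5] == 2)
        machine = (uint16_t)((hdr[18] << 8) | hdr[19]);
    else
        return NULL;

    switch (machine) {
    case EM_386:  return "Intel x86";
    case EM_MIPS: return "MIPS";
    case EM_PPC:  return "PowerPC";
    case EM_ARM:  return "ARM";
    case EM_SH:   return "SuperH";
    default:      return "other";
    }
}
```

Passing the first 20 bytes of a sample to elf_arch_name() is enough. The offset of e_machine is the same for 32-bit and 64-bit ELF files.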
Analysis of Linux.Aidra
This worm is unique due to the fact that its source code is freely available. In the words of its author, it is a mass-tool commanded by IRC that allows scanning and exploiting routers to make a botnet. In addition to this, one can perform attacks with TCP flood.
Because the source code is open, a highly detailed analysis is possible. However, during our analysis, we found that the binary samples we had available differ in some respects from the provided source code, even though the samples report the same version as the source files: lightaidra 0x2012. This means that the worm was modified before it was released into the wild. In what follows, we explicitly point out these differences.
When the worm is started, it performs the following actions.
- Calls user function deamonize(), which calls Linux function fork(). As the function name suggests, this makes the process run as a daemon.
- Writes its process identifier (PID) into file /var/run/.lightpid. This can be seen from the following piece of code that our decompiler generated:
This closely matches the original source code:
As we will see later, Linux.Darlloz utilizes this file to kill its competitor.
- If an instance of the worm is already running on the infected system, it is killed and replaced with the new instance. This can be utilized when a new version of the worm reaches the device.
- It tries to connect to IRC servers whose addresses are encoded in the binaries. In the original sources, the list of servers is encoded by a substitution cipher:
After decoding, we obtain the following list of addresses (the XXX parts were masked by us):
However, our analysis showed that the samples contained only hard-coded addresses. One such address was 94.23.XXX.XXX:6667.
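To illustrate how a substitution cipher of this kind is undone, here is a minimal sketch of a table-based decoder. The two tables below are hypothetical placeholders chosen by us, not the tables used in the worm's sources, and the function name subst_decode() is our own.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical tables: PLAIN[i] is encoded as CODED[i].
 * The worm's real tables are not reproduced here. */
static const char PLAIN[] = "0123456789.:";
static const char CODED[] = "qwertyuiopas";

/* Decodes src into dst (capacity n, including the terminator).
 * Characters outside the table are copied unchanged.
 * Returns 0 on success, -1 if dst is too small. */
int subst_decode(const char *src, char *dst, size_t n)
{
    size_t i;

    if (n == 0)
        return -1;
    for (i = 0; src[i] != '\0'; i++) {
        const char *p;

        if (i + 1 >= n)
            return -1;
        p = strchr(CODED, src[i]);
        dst[i] = p ? PLAIN[p - CODED] : src[i];
    }
    dst[i] = '\0';
    return 0;
}
```

With these placeholder tables, the string wpeaqaeawsuuui decodes to 192.0.2.1:6667, an address from the range reserved for documentation.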
- The name of the IRC channel is hard-coded in the binaries (originally #chan, but some samples use different names, such as #drogs). By calling user function connect_to_irc(), the worm connects to an IRC server under a nickname generated by getrstr(). This nickname always starts with a prefix. In the original sources, the prefix depends on the architecture: [a], [m], [s], [p], and [x] for ARM, MIPS, SuperH, PowerPC, and other architectures, respectively. In the samples, however, the prefix is the same regardless of the architecture (for example, [PrEd0ne] or [falcon]). The prefix is followed by a sequence of 10 random characters. Some samples also contain a password (for example, SHTDDoS).
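The nickname scheme described above (a prefix followed by 10 random characters) can be sketched as follows. This is our own illustration rather than the code of getrstr(). In particular, drawing the random characters from lowercase letters is an assumption, and make_nick() is a hypothetical name.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Builds "<prefix><10 random lowercase letters>" into out (capacity n).
 * The lowercase alphabet is an assumption made for this sketch.
 * Returns 0 on success, -1 if out is too small. */
int make_nick(const char *prefix, char *out, size_t n)
{
    size_t plen = strlen(prefix);
    size_t i;

    if (n < plen + 10 + 1)
        return -1;
    memcpy(out, prefix, plen);
    for (i = 0; i < 10; i++)
        out[plen + i] = (char)('a' + rand() % 26);
    out[plen + 10] = '\0';
    return 0;
}
```

For the prefix [m], the result is a 13-character nickname. Nicknames of this shape, sharing a prefix and differing only in a random tail, are easy to spot in IRC channel listings.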
- The most important function is irc_requests(). In there, a connection to the server is kept open by replying PONG to PING requests from the server. The commands to be performed by the worm are obtained by reading the channel topic (TOPIC) or receiving private messages (PRIVMSG). The commands are received in function pub_requests(), where the received string is parsed, and then a matching function is called. For example, upon receiving PRIVMSG:.exec, user function cmd_exec() is called with a string specifying the command to be executed. This function calls Linux function popen(). In this way, the attacker may execute any command on the infected device.
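The keep-alive part of this loop follows the standard IRC convention: a PING message is answered with a PONG that echoes the same token. A minimal sketch of that single step is shown below. The function name irc_pong_reply() is ours and does not appear in the worm.

```c
#include <stdio.h>
#include <string.h>

/* If line is an IRC "PING <token>" message, writes the matching
 * "PONG <token>\r\n" reply into out (capacity n) and returns 1.
 * Returns 0 for any other message or if out is too small. */
int irc_pong_reply(const char *line, char *out, size_t n)
{
    size_t len;

    if (strncmp(line, "PING ", 5) != 0)
        return 0;
    len = strcspn(line + 5, "\r\n");      /* token without the line ending */
    if (len + 8 > n)                      /* "PONG " + token + "\r\n" + NUL */
        return 0;
    snprintf(out, n, "PONG %.*s\r\n", (int)len, line + 5);
    return 1;
}
```

Any message that is not a PING, such as a TOPIC or PRIVMSG line, is left for the command parser.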
- Apart from waiting for commands, the worm tries to infect other devices. It does this by calling user function cmd_advscan(), which scans the given range of IP addresses. The scanning is done in a separate thread so the main process can wait for commands. More precisely, 128 threads are used for scanning.
- There are two possible ways of intruding other devices:
1. By utilizing a vulnerability in D-Link routers. When a live IP address is detected, the worm tries to connect to port 80 (HTTP) and sends the following POST request, which exploits a vulnerability that is present in some D-Link routers:
If the exploit succeeds, the router returns its configuration file in an XML format. The worm parses the file to obtain a password for the root user. This password is then used when connecting to the vulnerable device through the telnet service on port 23.
2. By using a name and login received from the IRC channel. In this case, it tries to connect to the received address directly, again by using the telnet service on port 23.
- The connection through telnet is done in user function cmd_advscan_join(). If the login data are correct, it downloads script getbinaries.sh by executing
where %s is substituted with the address of a remote server that hosts the script. This script then downloads binaries for all the supported architectures (MIPS, ARM, PowerPC, and SuperH) and executes them. One of them will eventually start. After that, the script erases itself to cover its tracks. Some samples have modified the above way of downloading the script, and execute the following commands instead:
- The remaining differences we have found are different master_host, master_password, or a different address in getextip(), where, for example, 94.23.XXX.XXX is used.
Analysis of Linux.Darlloz
The analysis of Linux.Darlloz was harder because all the samples were stripped, so no symbols or debugging information were available. However, by analyzing system calls, we were able to reconstruct the names of several functions. Moreover, we have given descriptive names to user-defined functions to make the results of the analysis easier to follow.
A simplified version of the worm’s call graph can be seen below. Nodes in orange are user functions and nodes in blue are Linux functions. In what follows, we describe all these functions in detail.
- In user function mask_as_httpd(), the worm tries to mask itself as httpd (an HTTP server) by calling Linux function prctl(). This can be seen from the following piece of code:
    prctl(PR_SET_NAME, (int32_t)"httpd", 0, 0, 0);
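The effect of this call can be observed with a short, Linux-only sketch that sets the name and reads it back through PR_GET_NAME. The helper rename_and_read_back() is our own name. Note that PR_SET_NAME changes only the short name of the calling thread, which is limited to 16 bytes including the terminator and is shown in /proc/PID/comm and /proc/PID/stat. The /proc/PID/exe link and /proc/PID/cmdline are not affected, which gives an analyst a simple way to spot the mismatch.

```c
#include <string.h>
#include <sys/prctl.h>

/* Sets the calling thread's name and reads it back.
 * out must have room for 16 bytes, the kernel limit including the
 * terminator. Returns 0 on success, -1 on failure. Linux only. */
int rename_and_read_back(const char *name, char out[16])
{
    if (prctl(PR_SET_NAME, (unsigned long)name, 0, 0, 0) != 0)
        return -1;
    memset(out, 0, 16);
    if (prctl(PR_GET_NAME, (unsigned long)out, 0, 0, 0) != 0)
        return -1;
    return 0;
}
```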
- In user function remove_aidra(), the worm tries to detect Linux.Aidra, analyzed in the previous section, kill it, and prevent the device from being infected by Linux.Aidra again. This is done in several steps. First, it loads modules for netfilter/iptables, which is a firewall typically used on Linux:
The subcommand uname -r returns the release of the running kernel. Then, it configures the firewall to drop packets for TCP port 23, which is the telnet service:
This prevents Linux.Aidra and remote users from connecting to the compromised device. After that, it tries to kill telnetd to make sure no telnet access is possible:
and its competing worm, Linux.Aidra:
Finally, it tries to erase many files by calling Linux function unlink():
We have found that several samples erase some additional files:
- User function exec_sh_cmd() simply executes the given command through /bin/sh (shell). In more detail, it calls Linux function fork(), and the command is executed by the created child process, so the main process can continue.
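The fork-then-shell pattern described here is a common POSIX idiom. The following minimal sketch shows its general shape. It is not the decompiled body of exec_sh_cmd(), and both spawn_sh() and wait_exit_code() are names we made up for this illustration.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Starts cmd through /bin/sh in a child process and returns the child's
 * PID immediately (or -1 if fork() fails), so the caller keeps running. */
pid_t spawn_sh(const char *cmd)
{
    pid_t pid = fork();

    if (pid == 0) {
        execl("/bin/sh", "sh", "-c", cmd, (char *)NULL);
        _exit(127);   /* reached only if execl() failed */
    }
    return pid;
}

/* Helper for the example: waits for the child and returns its exit code,
 * or -1 if it did not exit normally. */
int wait_exit_code(pid_t pid)
{
    int status;

    if (waitpid(pid, &status, 0) < 0 || !WIFEXITED(status))
        return -1;
    return WEXITSTATUS(status);
}
```

In this sketch, the caller may collect the child later with wait_exit_code(). Whether the worm ever waits for its children is not something the sketch tries to reproduce.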
- User function kill_aidra() works as follows. As a parameter, it takes a path to the file storing the PID of Linux.Aidra (see the analysis of Linux.Aidra). This is /var/run/.lightpid. It opens the file by calling user function sys_open(), reads the PID from the file by user function sys_read(), converts its textual representation into a number by C standard function strtol(), and calls Linux function kill() with the PID and SIGKILL as arguments. This kills the process.
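The conversion step in the middle of this sequence, turning the text read from the PID file into a number, can be sketched as below. The sketch covers only the parsing with strtol() and deliberately leaves out the file access and the kill() call. The name parse_pid() is ours.

```c
#include <errno.h>
#include <stdlib.h>

/* Parses the decimal PID at the start of text (as read from a PID file).
 * Returns the PID, or -1 if text does not start with a positive number. */
long parse_pid(const char *text)
{
    char *end;
    long pid;

    errno = 0;
    pid = strtol(text, &end, 10);
    if (end == text || errno == ERANGE || pid <= 0)
        return -1;
    return pid;
}
```

For an administrator, the same check is useful in the opposite direction: a file /var/run/.lightpid that contains a valid PID is an indicator that Linux.Aidra is, or was, running on the device.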
- User function kill_process() takes a single parameter, which is the name of the process. When called, the function traverses all directories in /proc and tries to convert their names into numbers by calling strtol(). In /proc, the system keeps one directory for every existing process, named after its PID. Thus, a successfully converted number represents a PID. Then, for every such directory PID, it tries to read /proc/PID/stat, which holds information about the process. After that, it parses the data to see if the process matches the name that was given to kill_process(). If so, it calls Linux function kill(), which kills the process.
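The name-matching step relies on the layout of /proc/PID/stat, in which the process name is the second field and is enclosed in parentheses. A minimal sketch of extracting that field is shown below. It is our own illustration of the parsing, with the hypothetical name stat_process_name(), and does not include the directory traversal or the kill() call.

```c
#include <stddef.h>
#include <string.h>

/* Extracts the process name from the contents of /proc/PID/stat, where it
 * appears in parentheses as the second field, e.g. "42 (telnetd) S 1 ...".
 * Returns 0 on success, -1 on malformed input or if out is too small. */
int stat_process_name(const char *stat, char *out, size_t n)
{
    const char *lp = strchr(stat, '(');
    const char *rp = strrchr(stat, ')');   /* the name itself may contain ')' */
    size_t len;

    if (lp == NULL || rp == NULL || rp < lp)
        return -1;
    len = (size_t)(rp - lp - 1);
    if (len + 1 > n)
        return -1;
    memcpy(out, lp + 1, len);
    out[len] = '\0';
    return 0;
}
```

Searching for the last closing parenthesis rather than the first is a small design choice that keeps the parser correct for process names that contain parentheses.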
- The worm then starts to listen on port 58455 (hard-coded into the binaries) by using Linux functions bind() and listen(). After that, it waits for orders on that port in an infinite loop. We have also found out that first, it tries to communicate with IP addresses from the range 117.201.XXX.1 – 117.201.XXX.254. All of this is done in user function wait_for_orders().
- The worm spreads in the following way. It generates random IP addresses, excluding addresses from these ranges:
When a valid address is generated, the worm tries to break into the remote host at that address by the following means: via telnet (port 23) and by exploiting a PHP vulnerability. First, we describe how the telnet way works. Then, we elaborate on the PHP way.
- In user function try_telnet(), if the worm successfully connects to TCP port 23, it tries the following login/password combinations to access the host:
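Filtering a generated address against a list of excluded ranges amounts to a simple interval test. The sketch below shows one way to express it. The names ipv4(), struct ip_range, and ip_excluded() are ours, and any ranges passed to the function in an example are placeholders, not the worm's actual exclusion list.

```c
#include <stddef.h>
#include <stdint.h>

/* An inclusive range of IPv4 addresses in host byte order. */
struct ip_range { uint32_t first; uint32_t last; };

/* Packs four octets into a host-order 32-bit IPv4 address. */
uint32_t ipv4(unsigned a, unsigned b, unsigned c, unsigned d)
{
    return ((uint32_t)a << 24) | ((uint32_t)b << 16) |
           ((uint32_t)c << 8) | (uint32_t)d;
}

/* Returns 1 if addr lies inside any of the count ranges, else 0. */
int ip_excluded(uint32_t addr, const struct ip_range *ranges, size_t count)
{
    size_t i;

    for (i = 0; i < count; i++)
        if (addr >= ranges[i].first && addr <= ranges[i].last)
            return 1;
    return 0;
}
```

For example, with the placeholder ranges 10.0.0.0 to 10.255.255.255 and 127.0.0.0 to 127.255.255.255, the address 10.1.2.3 is excluded, whereas 192.0.2.1 is not.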
When it succeeds, it executes the following commands:
The string \\x7F\\x45\\x4C\\x46 represents the magic number of the Linux ELF file format. This means that the created files are binary files in the ELF file format. In this way, it creates the following seven files on the remote host and executes the first five of them:
The data copied through telnet are the binaries that the worm itself received when it infected the current host. This is how the worm replicates. Depending on the architecture of the remote host, one of the programs will eventually start. At that point, the remote host is infected and may try to infect other hosts.
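To check that a run of \xHH escapes such as the one above really encodes the ELF magic number, the escapes can be converted back into raw bytes. The following is a minimal sketch of such a conversion. The function name decode_hex_escapes() is our own, and the sketch accepts only the \xHH form.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Decodes a run of "\xHH" escapes (as in "\x7F\x45") into raw bytes.
 * Returns the number of bytes written, or -1 on malformed input or if
 * out (capacity n) is too small. */
int decode_hex_escapes(const char *s, unsigned char *out, size_t n)
{
    size_t count = 0;

    while (*s != '\0') {
        char hex[3];

        if (s[0] != '\\' || s[1] != 'x' ||
            !isxdigit((unsigned char)s[2]) || !isxdigit((unsigned char)s[3]))
            return -1;
        if (count >= n)
            return -1;
        hex[0] = s[2];
        hex[1] = s[3];
        hex[2] = '\0';
        out[count++] = (unsigned char)strtol(hex, NULL, 16);
        s += 4;
    }
    return (int)count;
}
```

Decoding the four escapes 7F, 45, 4C, and 46 yields the byte 0x7F followed by the ASCII characters E, L, and F.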
- When the above-mentioned telnet attempt fails, it tries to break into the host by exploiting the php-cgi Information Disclosure Vulnerability (CVE-2012-1823, CVE-2012-2311, CVE-2012-2335, CVE-2012-2336) through an HTTP POST request. This works as follows. There is a five-element array of possible paths that are tried when performing these HTTP requests:
In user function try_php_exploit(), it tries to connect to the IP address and send a malicious POST request:
User function send_post_request() generates the following HTTP request:
and sends it to the server by utilizing user function sys_send(). After that, it waits for an answer and checks if the request was successfully performed:
In more detail, the URL after decoding is
The PHP code that is executed on the server if the request passes through is the following (we have indented it to make it more readable):
Briefly, the PHP script tries different ways of executing the three commands at the end of the script. The first command downloads a binary file from http://www.gpharma.co/x86 and stores it into /tmp/x86. Then, it makes that binary file executable (chmod +x /tmp/x86), and executes it (/tmp/x86). We see that contrary to the telnet attempt, in the PHP attempt, it downloads and executes only a binary file for the Intel x86 architecture.
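The URL decoding mentioned above is ordinary percent-decoding, in which every %HH sequence stands for the byte with hexadecimal value HH. A minimal sketch of such a decoder follows. The name url_decode() is ours, and the sketch handles only %HH sequences, leaving all other characters untouched.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdlib.h>

/* Percent-decodes src into dst (capacity n, including the terminator).
 * Returns 0 on success, -1 on a malformed escape or if dst is too small. */
int url_decode(const char *src, char *dst, size_t n)
{
    size_t j = 0;

    if (n == 0)
        return -1;
    while (*src != '\0') {
        if (j + 1 >= n)
            return -1;
        if (*src == '%') {
            char hex[3];

            if (!isxdigit((unsigned char)src[1]) ||
                !isxdigit((unsigned char)src[2]))
                return -1;
            hex[0] = src[1];
            hex[1] = src[2];
            hex[2] = '\0';
            dst[j++] = (char)strtol(hex, NULL, 16);
            src += 3;
        } else {
            dst[j++] = *src++;
        }
    }
    dst[j] = '\0';
    return 0;
}
```

For instance, the neutral string a%2Fb%3Dc decodes to a/b=c. Applying the same routine to a logged request line makes encoded parameters readable during log analysis.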
- Finally, in the ARM and MIPS samples, before the process leaves main(), it executes the following code:
This is done by writing the code into file /var/run/z, making the file executable, and executing the file via /bin/sh -c /var/run/z. The effect of all this is that the worm restarts itself. This piece of code was missing in the x86 samples.