TekDefense Network Challenge 001 - Walkthrough

Sometime around mid-September (of last year!) I was tipped off about a new network forensics challenge created by @TekDefense and published on his blog. I was up for the challenge but did not have much time back then. Finally, I managed to spend a few evenings just before the due date analyzing the provided PCAP and documenting my findings.

Warning: Spoilers ahead! If you did not take the challenge yet, consider going back and trying to solve it by yourself!

To my big surprise my write-up was awarded first place. @TekDefense posted it in this blog post. Make sure to also check out @CYINT_dude’s write-up, which took second place. With all that in mind I decided to write a short follow-up presenting how I performed my analysis and how I reached my final conclusions.

Before we start please remember that:

This walkthrough and my original solution are by no means complete and probably do not tell the whole story.
I am not going to present any novel analysis techniques here. I basically followed the bottom-up principle - started with the initial alert and built the story as I was going through related events.
As always - there is more than one way to skin a cat. I would be happy to learn about different or better ways to process challenge data, correlate events, analyze files, identify indicators and write detection rules.

Toolbox

First things first. Let’s briefly go through the tools I used to analyze the PCAP, develop detection rules, and create a timeline and the final report:

Wireshark/tshark
FakeNet
Snort
YARA
Microsoft Excel
Mou
Usual command line utilities: sort, uniq, wc, cut, strings, file

In addition to the above, I used a locked-down virtual machine running Kali Linux to execute suspicious ELF binaries.

It is also important to mention online resources and OSINT tools that were crucial for me to get additional context or better understanding of files, malware and indicators I encountered during investigation:

Reconnaissance

Having all the tools of my trade handy, I decided to load the PCAP into Wireshark and start from there. As the provided Snort signature was simple and only looked for two strings, it was easy to find matching packets without using Snort:

alert tcp any any -> any any (msg:"HFS [File Download]";flow:to_client,established; content:"HFS 2.";distance:0; content:"HFS_SID="; classtype:suspicious; sid:999999; rev:1;)

Wireshark - HFS packets

Wireshark found 13 matching packets, each belonging to a different TCP session (based on different destination ports). The Snort signature seemed to be looking for the server version and part of the HTTP cookie header set by the server in its HTTP response:

Wireshark - TCP stream 191

A quick Google search revealed that the strings in the HTTP headers are characteristic of HTTP File Server (HFS) - a server designed for file sharing. According to the provided challenge scenario, it was a Snort hit that alerted the customer to (potentially) suspicious activity.
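The two content matches of the signature can be approximated in a few lines of Python. This is only a sketch: it ignores the rule's flow options, and the sample headers are hypothetical, shaped like the ones described above rather than copied from the PCAP.

```python
def matches_hfs_rule(payload: bytes) -> bool:
    """Approximate the two case-sensitive content matches of the provided Snort rule."""
    return b"HFS 2." in payload and b"HFS_SID=" in payload


# Hypothetical response headers (assumption, not taken from the challenge PCAP).
sample = (
    b"HTTP/1.1 200 OK\r\n"
    b"Server: HFS 2.3\r\n"
    b"Set-Cookie: HFS_SID=0.123; path=/;\r\n\r\n"
)
not_hfs = b"HTTP/1.1 200 OK\r\nServer: nginx\r\n\r\n"

print(matches_hfs_rule(sample), matches_hfs_rule(not_hfs))
```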

I started wondering why a file transfer from a server running specific software (HFS) could be a (potential) indicator of compromise. Well, it did not take long until I came across articles from Antiy and MalwareMustDie describing how vulnerable HFS servers were being exploited in order to serve malware.

At this point I assumed that the server 104.236.210.97 belonged to the client and was a target of malicious activity.

Initial Analysis

As the provided PCAP file was roughly 56 megabytes, I felt I needed to get a better understanding of what kind of traffic was actually captured in it.

With the help of the tshark filters presented below I obtained some basic stats on the network protocols, sessions and ports present in the PCAP. My initial goal was to at least skim through the traffic for the top protocols and sessions and look for anything suspicious. Just a brief look showed a large number of SSH sessions and UDP packets destined for port 80, which seemed a little bit off and warranted further analysis.

Mapping of number of packets and associated protocols

tshark -n -r NetChallenge_Linux.pcap -T fields -e _ws.col.Protocol | sort | uniq -c | sort -nr | head -10
TCP
QUIC
SSHv2
ARP
SSH
DNS
ICMP
HTTP
NTP
UDP

Established TCP connections and associated TCP ports (server side)

tshark -n -r NetChallenge_Linux.pcap -Y "tcp.len == 0 and tcp.seq == 1 and tcp.ack == 1 and tcp.flags.ack == 1 and not tcp.flags.reset == 1" -T fields -e tcp.dstport | sort | uniq -c | sort -nr
22
5198
80
8080
443
30890

Top 10 destination ports for UDP packets

tshark -n -r NetChallenge_Linux.pcap -Y "udp" -T fields -e udp.dstport | sort | uniq -c | sort -nr | head -10
80
53
53413
123
5060
137
1900
5353
520
161
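The `sort | uniq -c | sort -nr | head` idiom used in the three commands above has a direct Python equivalent, which can be handy when the tshark output is already loaded in a script. A minimal sketch; the `protocols` list is made-up sample input, not the counts from the challenge PCAP.

```python
from collections import Counter


def top_values(values, limit=10):
    """Return (value, count) pairs, most frequent first."""
    return Counter(values).most_common(limit)


# Hypothetical per-packet protocol column, one entry per packet.
protocols = ["TCP", "TCP", "QUIC", "TCP", "DNS", "QUIC"]
for name, count in top_values(protocols):
    print(f"{count:7d} {name}")
```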

With that much traffic I needed a good way to document and represent network connection data in order to be able to correlate all suspicious events. I decided to use tshark to export important information to CSV files and then import them into Excel. This seemed to be the quickest and simplest way to organize the data I needed.

I started with HTTP and used the following two commands to extract the HTTP request and response data I needed from the PCAP:

tshark -n -r NetChallenge_Linux.pcap -Y "http.request and not ssdp" -T fields -e _ws.col.Time -e ip.src -e ip.dst -e tcp.dstport -e http.host -e http.user_agent -e http.request.method -e http.request.uri -E header=y -E separator=, > http_requests.csv

tshark -n -r NetChallenge_Linux.pcap -Y "http.response and not ssdp" -T fields -e _ws.col.Time -e ip.src -e ip.dst -e tcp.srcport -e http.response.code  -e http.server -e http.content_type -e http.content_length -E header=y -E separator=, > http_responses.csv

It was not that hard to correlate and combine both outputs. As you might expect, the number of HTTP requests roughly matched the number of HTTP responses, so it was just a matter of a single copy and paste operation to get them together in a single Excel worksheet. As a result each entry in my timeline contained fields extracted from both HTTP requests and responses, making it much more readable (at least for me!).

Excel - HTTP timeline
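The copy and paste step could also be scripted. The sketch below pairs each response with the oldest unanswered request for the same client, server and server port. It assumes the CSV header row carries the tshark field names and, because the client port was not exported, it can mispair rows when one client talks to the same server over several connections at once. The sample rows are made up and use documentation addresses.

```python
import csv
import io
from collections import defaultdict, deque


def pair_http(requests_csv, responses_csv):
    """Pair each response with the oldest unanswered request on the same client/server/port."""
    pending = defaultdict(deque)
    for req in csv.DictReader(io.StringIO(requests_csv)):
        pending[(req["ip.src"], req["ip.dst"], req["tcp.dstport"])].append(req)

    timeline = []
    for resp in csv.DictReader(io.StringIO(responses_csv)):
        # Responses flow server -> client, so source and destination are swapped.
        queue = pending[(resp["ip.dst"], resp["ip.src"], resp["tcp.srcport"])]
        if queue:
            req = queue.popleft()
            timeline.append({
                "time": req["_ws.col.Time"],
                "client": req["ip.src"],
                "server": req["ip.dst"],
                "uri": req["http.request.uri"],
                "code": resp["http.response.code"],
            })
    return timeline


# Made-up rows (assumption), not data from the challenge PCAP.
requests_csv = (
    "_ws.col.Time,ip.src,ip.dst,tcp.dstport,http.request.uri\n"
    "1.0,192.0.2.10,198.51.100.20,80,/a\n"
    "2.0,192.0.2.10,198.51.100.20,80,/b\n"
)
responses_csv = (
    "_ws.col.Time,ip.src,ip.dst,tcp.srcport,http.response.code\n"
    "1.1,198.51.100.20,192.0.2.10,80,200\n"
    "2.1,198.51.100.20,192.0.2.10,80,404\n"
)
for row in pair_http(requests_csv, responses_csv):
    print(row)
```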

Having my HTTP timeline ready I started reviewing and marking entries with colors. At that point I still did not have a good understanding of the intrusion, but as some entries seemed more suspicious than others it was a good way to mark them for follow-up.

Throughout my analysis I used three different colors to visually expose entries:

Red: Malicious activity.
Yellow: Neutral activity that could go either way depending on further findings.
Green: Benign activity.

After looking at collected HTTP entries I concluded that:

All HTTP requests from 104.236.210.97 to 120.210.129.29 that matched the Snort rule indicated malicious activity.
All HTTP requests from 104.236.210.97 to mirrors.digitalocean.com and nyc2.mirrors.digitalocean.com seemed to be benign as both servers are known mirrors for Debian and Ubuntu packages.
Inbound HTTP requests from 104.236.59.209 to victim server 104.236.210.97 for nc.exe and back.pl seemed to be at least suspicious.
All requests for testproxy.php seemed to be a part of common open proxy scanning and I disregarded them as benign and not related to the investigated case.
I treated requests from Microsoft-owned IP addresses 40.78.146.128 and 104.209.188.207 as benign but potentially interesting. They seemed to come from the Skype Preview service, indicating that someone must have sent a link to hxxp://104.236.210.97/index.html.1 over Skype - maybe the attackers even exchanged a link to the newly compromised server?
The rest of the captured HTTP requests seemed to be “background noise” and were not relevant for further investigation.

Excel - colored HTTP timeline

As I was going through subsequent flows I started adding information about different IP addresses in a separate tab - just to have a handy source of reference.

Excel - Hosts tab

Analysis of extracted files

Extracting files from the PCAP was not a particularly hard task. As all transfers I spotted were using HTTP I just used Wireshark’s Export Objects option:

Wireshark - Export HTTP Objects

I quickly got rid of irrelevant HTML files as most of them just represented 404 (e.g. testproxy.php) or 302 (e.g. from mirrors.digitalocean.com) HTTP responses. Just by looking at file names and their sources I had suspicions about which ones would turn out to be malicious. I did not bother investigating any of the .deb files as they all came from a legitimate source. I also assumed that in-depth analysis of every file was not a goal of the challenge - though I still wanted to extract all relevant network and endpoint indicators. Due to lack of time I decided to rely on basic static analysis, OSINT research and, only when needed, dynamic analysis.

First I gathered the MD5s of the files of interest and used Automater to quickly query VirusTotal:

Except for the file or.bin (09b62916547477cc44121e39e1d6cc26), all queried files had detections from multiple AV products. I added the CSV output from Automater as yet another tab in my timeline spreadsheet. I also added size, type and architecture (based on the output of file) columns:

Excel - Files tab
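Collecting the MD5 and size columns for a directory of exported objects is easy to script. A minimal sketch; the function names are illustrative and the directory passed to `summarize` is whatever folder the HTTP objects were exported to.

```python
import hashlib
from pathlib import Path


def describe_bytes(data: bytes):
    """Return (md5 hex digest, size in bytes) for a blob."""
    return hashlib.md5(data).hexdigest(), len(data)


def summarize(directory):
    """Map each regular file in a directory to its MD5 and size."""
    return {
        path.name: describe_bytes(path.read_bytes())
        for path in sorted(Path(directory).iterdir())
        if path.is_file()
    }


print(describe_bytes(b"abc"))
```

Files that share the same size, as the ELF samples discussed below do, stand out immediately once the results are sorted by the second column.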

Below are my notes for the BillGates binaries and the or.bin script as I found them most relevant and interesting. I’m going to skip descriptions of other extracted files like nc.exe (Netcat) or back.pl (reverse shell Perl script) as cursory analysis immediately reveals what they are.

BillGates Malware

My goal here was just to confirm that all files detected as BillGates malware were in fact malicious. I also wanted to know what the network traffic generated by each ELF executable looked like. I thought that identifying such traffic in the PCAP could give me new interesting leads.

After reading several awesome write-ups on BillGates from Akamai, MalwareMustDie and Novetta I knew what to look for in collected files.

Thankfully none of the files were stripped, so simply running strings on them revealed some interesting details. I also noticed that all ELF files were exactly 1223123 bytes long - yet another indicator that they belonged to the BillGates malware family.

Output of the ‘file’ command

16081:        ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, for GNU/Linux 2.2.5, not stripped
SYN:          ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, for GNU/Linux 2.2.5, not stripped
SYN_1902:     ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, for GNU/Linux 2.2.5, not stripped
Trustr:       ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, for GNU/Linux 2.2.5, not stripped
java.log:     ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, for GNU/Linux 2.2.5, not stripped
xmapp:        ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, for GNU/Linux 2.2.5, not stripped

All ELF files contained references to source code files that were almost identical to ones identified by Novetta and MalwareMustDie in their reports.

$ strings SYN | grep '\.cpp'
AmpResource.cpp
Attack.cpp
CmdMsg.cpp
ConfigDoing.cpp
DNSCache.cpp
ExChange.cpp
Global.cpp
Main.cpp
Manager.cpp
MiniHttpHelper.cpp
ProtocolUtil.cpp
ProvinceDns.cpp
StatBase.cpp
SysTool.cpp
ThreadAtk.cpp
ThreadClientStatus.cpp
ThreadConnection.cpp
ThreadDoFun.cpp
ThreadFakeDetect.cpp
ThreadHttpGet.cpp
ThreadKillChaos.cpp
ThreadLoopCmd.cpp
ThreadMonGates.cpp
ThreadRecycle.cpp
ThreadShell.cpp
ThreadShellRecycle.cpp
ThreadTask.cpp
ThreadTns.cpp
ThreadUpdate.cpp
UserAgent.cpp
AutoLock.cpp
FileOp.cpp
Ijduy.cpp
Iysd76.cpp
Log.cpp
Md5.cpp
Media.cpp
NetBase.cpp
ThreadCondition.cpp
Thread.cpp
ThreadMutex.cpp
Utility.cpp
WinDefSVC.cpp
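The same check can be reproduced without the strings utility. The sketch below extracts runs of printable ASCII (a simplification of what strings does by default) and keeps those containing ".cpp". The sample bytes are invented for illustration.

```python
import re


def ascii_strings(data: bytes, min_len: int = 4):
    """Yield runs of at least min_len printable ASCII characters."""
    pattern = rb"[\x20-\x7e]{%d,}" % min_len
    for match in re.finditer(pattern, data):
        yield match.group().decode("ascii")


def source_file_names(data: bytes):
    """Return extracted strings that mention a .cpp file."""
    return [s for s in ascii_strings(data) if ".cpp" in s]


# Invented binary blob (assumption), standing in for an unstripped ELF file.
blob = b"\x00\x01Attack.cpp\x00ab\x00Main.cpp\x00hello world\x00"
print(source_file_names(blob))
```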

The last file (a91261551c31a5d9eec87a8435d5d337) was a PE binary. DrWeb’s detection on VirusTotal claimed that it was BackDoor.Gates.8. I was not aware of Windows versions of the BillGates malware, but Stormshield’s blog post quickly got me back on the right track.

As described by Stormshield, the file contained multiple embedded PE binaries inside its resources section:

pestudio - resources section of a91261551c31a5d9eec87a8435d5d337

PDB paths in a91261551c31a5d9eec87a8435d5d337

$ strings winappes.exe  | grep pdb
\GatesInstall\Release\GatesInstall.pdb
\2003\i386\agony.pdb
\IECtrl\Release\IECtrl.pdb
e:\releases\winpcap_4_1_0_2001\winpcap\packetntx\driver\bin\amd64\npf.pdb
e:\releases\winpcap_4_1_0_2001\winpcap\packetNtx\Dll\Project\Release No NetMon\x64\Packet.pdb
\Gates\Release\Gates.pdb
\Gates\x64\Release\Gates.pdb

At that point I was confident that I could identify all these files as belonging to the BillGates malware family in my report. The last thing I needed were network indicators.

I followed the below process for ELF files in order to obtain C&C address, protocols, and ports used for C&C communications:

I configured my Linux VM to use my other Windows VM running in the same isolated network segment as a DNS server.
I configured FakeNet to respond with IP address of my Windows VM for every observed DNS request.
I ran Wireshark and FakeNet on my Windows VM.
I executed the sample and observed the DNS requests it issued.
I observed subsequent connection attempts and, when needed, adjusted FakeNet’s configuration so it listened on the specific port that the malware was trying to connect to.
I noted the C&C addresses and ports used, and saved the captured traffic to a PCAP file for further reference.

Below is sample analysis for the SYN binary (cd291abe2f5f9bc9bc63a189a68cac82):

DNS requests captured after executing the SYN binary (cd291abe2f5f9bc9bc63a189a68cac82)

# tcpdump -i eth0 -n 'port 53'
tcpdump: verbose output suppressed, use -v or -vv for full protocol decode
listening on eth0, link-type EN10MB (Ethernet), capture size 65535 bytes
25:14.794299 IP 172.16.250.133.41452 > 172.16.250.1.53: 27320+ A? top.t7ux.com. (30)
25:15.014225 IP 172.16.250.133.43559 > 172.16.250.1.53: 27576+ A? www.vnc8.com. (30)
25:20.799249 IP 172.16.250.133.48032 > 172.16.250.1.53: 28856+ A? top.t7ux.com. (30)
25:21.015454 IP 172.16.250.133.40558 > 172.16.250.1.53: 29112+ A? www.vnc8.com. (30)
25:26.798733 IP 172.16.250.133.53643 > 172.16.250.1.53: 30392+ A? top.t7ux.com. (30)
25:27.015546 IP 172.16.250.133.42158 > 172.16.250.1.53: 30648+ A? www.vnc8.com. (30)
25:32.799605 IP 172.16.250.133.33856 > 172.16.250.1.53: 31928+ A? top.t7ux.com. (30)
25:33.014514 IP 172.16.250.133.51408 > 172.16.250.1.53: 32184+ A? www.vnc8.com. (30)
25:38.799074 IP 172.16.250.133.53824 > 172.16.250.1.53: 33464+ A? top.t7ux.com. (30)
25:39.015045 IP 172.16.250.133.49985 > 172.16.250.1.53: 33720+ A? www.vnc8.com. (30)

FakeNet output for the SYN binary (cd291abe2f5f9bc9bc63a189a68cac82)

Sample network communication generated by the SYN binary (cd291abe2f5f9bc9bc63a189a68cac82) and captured by Wireshark

The process for the Windows version of the malware (a91261551c31a5d9eec87a8435d5d337) was much simpler as I just needed to execute it in my Windows VM and observe FakeNet’s output.

Next I updated my Excel spreadsheet with collected network indicators and proceeded to the next extracted file.

Excel - Exported files

or.bin

or.bin was an interesting file. The beginning of the file contained a simple Bash script that read and extracted a tar.gz archive appended to the end of the script. Once the archive was extracted, the script simply ran the install binary:

$ head -11 or.bin
#!/bin/bash
line=`wc -l $0|awk '{print $1}'`
line=`expr $line - 10`
mkdir /tmp/.tmp123 -p && tail -n $line $0 |tar zx -C /tmp/.tmp123
rm -rf *.bin && cd /tmp/.tmp123
./install
ret=$?
#
#
#
exit $ret
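The appended archive can be carved out without running the script. Instead of reproducing the line arithmetic, the sketch below looks for the first gzip header after the script stub. This is a swapped-in technique, not what the script itself does, and it assumes the stub does not happen to contain the same three bytes.

```python
import gzip

GZIP_MAGIC = b"\x1f\x8b\x08"  # gzip ID bytes followed by the deflate method byte


def carve_gzip(blob: bytes) -> bytes:
    """Return everything from the first gzip header to the end of the blob."""
    offset = blob.find(GZIP_MAGIC)
    if offset < 0:
        raise ValueError("no gzip header found")
    return blob[offset:]


# Invented stand-in for or.bin: a short script stub followed by gzip data.
stub = b"#!/bin/bash\n./install\nexit $ret\n"
demo = stub + gzip.compress(b"archive body")
print(gzip.decompress(carve_gzip(demo)))
```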

The install file seemed to be a stripped 64-bit ELF binary. Interestingly, the archive also contained a file named ooz.tgz, which was not a tar.gz archive as its extension suggested. The file started with the very specific header “Salted__”, indicating that it had been encrypted using OpenSSL.

$ file or.bin.tmp
or.bin.tmp: gzip compressed data, last modified: Tue Jun 23 05:44:34 2015, from Unix
$ tar -tvf or.bin.tmp
-rwx--x--x  0 root   root    12832 Jun 23  2015 install
-rw-r--r--  0 root   root  5672984 Jun 23  2015 ooz.tgz
$ file install
install: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 2.6.9, stripped
$ file ooz.tgz
ooz.tgz: data
$ xxd -l 16 ooz.tgz
00000000: 5361 6c74 6564 5f5f 2d91 d71e 0a4f 02f2  Salted__-....O..
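The header check is simple enough to script: files produced by OpenSSL's password-based encryption start with the 8-byte magic "Salted__" followed by an 8-byte salt. A minimal sketch, using the 16 bytes from the xxd output above; the function name is illustrative.

```python
def openssl_salt(header: bytes):
    """Return the 8-byte salt if the data starts with the "Salted__" magic, else None."""
    if len(header) >= 16 and header[:8] == b"Salted__":
        return header[8:16]
    return None


first_16 = bytes.fromhex("53616c7465645f5f2d91d71e0a4f02f2")
print(openssl_salt(first_16).hex())  # 2d91d71e0a4f02f2
```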

It looked like I would need to analyze the install file to learn how to decrypt ooz.tgz. Unfortunately, after an initial inspection I knew that it would not be that easy. The binary seemed to implement several anti-analysis techniques. All strings in the binary were obfuscated:

IDA Pro - Obfuscated strings

Basic anti-debugging was implemented by making one of the child processes attach to the main process using a ptrace() call, effectively preventing use of debuggers and tools like strace:

# strace -f ./install
(...)
[pid 38792] open("/proc/38791/as", O_RDWR|O_EXCL) = -1 ENOENT (No such file or directory)
[pid 38792] ptrace(PTRACE_ATTACH, 38791, 0, 0) = -1 EPERM (Operation not permitted)
[pid 38792] dup(2)                      = 0
[pid 38792] fcntl(0, F_GETFL)           = 0x8002 (flags O_RDWR|O_LARGEFILE)
[pid 38792] brk(0)                      = 0x1078000
[pid 38792] brk(0x1099000)              = 0x1099000
[pid 38792] fstat(0, {st_mode=S_IFCHR|0620, st_rdev=makedev(136, 0), ...}) = 0
[pid 38792] mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7f370a63a000
[pid 38792] lseek(0, 0, SEEK_CUR)       = -1 ESPIPE (Illegal seek)
[pid 38792] write(0, "./install: Operation not permitt"..., 35./install: Operation not permitted
) = 35
[pid 38792] close(0)                    = 0
[pid 38792] munmap(0x7f370a63a000, 4096) = 0
[pid 38792] kill(38791, SIGKILL)        = 0
[pid 38792] exit_group(0)               = ?
Process 38791 resumed
Process 38792 detached

When I placed the file in a separate directory (so it did not ‘see’ ooz.tgz) and executed it, I noticed some strange output - like it was trying to spawn system commands:

# ./install 
dd: opening `ooz.tgz': No such file or directory
error reading input file

gzip: stdin: unexpected end of file
tar: Child returned status 1
tar: Error is not recoverable: exiting now
tar (child): jack.tgz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
tar (child): openssl-1.0.0e.tar.gz: Cannot open: No such file or directory
tar (child): Error is not recoverable: exiting now
tar: Child returned status 2
tar: Error is not recoverable: exiting now
./install: 11: cd: can't cd to openssl-1.0.0e
(...)

If my suspicion was correct, the program was deobfuscating strings at runtime and then passing them as arguments to the execvp() function (which was visible when I opened the binary in IDA). I needed a way to get insight into what exactly was being passed to the execvp() calls without actually attaching a debugger to the process.

After some short research I found Snoopy, which seemed to do exactly what I needed. After enabling Snoopy and running the install binary again I found the following entries in a log file:

snoopy[47478]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:./install]: ./install 
snoopy[47483]: [uid:0 sid:6645 tty: cwd:/tmp filename:/bin/tar]: tar zxf - 
snoopy[47482]: [uid:0 sid:6645 tty: cwd:/tmp filename:/usr/bin/openssl]: openssl des3 -d -k buWwe9ei2fiNIewOhiuDi 
snoopy[47481]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:/bin/dd]: dd if=ooz.tgz 
snoopy[47486]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:/bin/tar]: tar zxvf jack.tgz 
snoopy[47488]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:/bin/rm]: rm -rf jack.tgz 
snoopy[47489]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:/bin/rm]: rm -rf ooz.tgz 
snoopy[47490]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:/bin/tar]: tar xzvf openssl-1.0.0e.tar.gz 
snoopy[47492]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:./config]: ./config --prefix=/usr/local/openssl 
snoopy[47493]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:/usr/bin/make]: make 
snoopy[47494]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:/usr/bin/make]: make install 
snoopy[47495]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/usr/local filename:/bin/ln]: ln -s openssl ssl 
snoopy[47496]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/usr/local filename:/sbin/ldconfig]: ldconfig 
snoopy[47497]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/usr/local filename:/usr/bin/ldd]: ldd /usr/local/openssl/bin/openssl 
snoopy[47499]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:/bin/rm]: rm -rf openssl-1.0.0e* 
snoopy[47500]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:/bin/tar]: tar xzvf zlib-1.2.3.tar.gz 
snoopy[47502]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:/usr/bin/make]: make clean 
snoopy[47503]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:/usr/bin/make]: make 
snoopy[47504]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/usr/local filename:/bin/rm]: rm -rf zlib-1.2.3* 
snoopy[47505]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/usr/local filename:/bin/tar]: tar zxvf openssh-5.9p1.tgz 
snoopy[47507]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/usr/local filename:./configure]: ./configure --prefix=/usr --sysconfdir=/etc/ssh 
snoopy[47508]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/usr/local filename:/usr/bin/make]: make 
snoopy[47509]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/usr/local filename:/etc/init.d/sshd]: /etc/init.d/sshd restart 
snoopy[47510]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:/bin/rm]: rm -rf openssh* 
snoopy[47511]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:/bin/rm]: rm -rf jack* 
snoopy[47512]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:/bin/rm]: rm -rf install.sh 
snoopy[47513]: [uid:0 sid:6645 tty:/dev/pts/0 cwd:/tmp filename:/bin/rm]: rm -rf /tmp/.tmp123

Bingo! It looked like the binary was decrypting the ooz.tgz file using Triple DES with the passphrase buWwe9ei2fiNIewOhiuDi, decompressing the archive and then compiling OpenSSL, zlib and OpenSSH from the resulting source code. That definitely looked suspicious!

$ openssl des3 -d -k buWwe9ei2fiNIewOhiuDi -in ooz.tgz -out ooz_decrypted.tgz
$ file ooz_decrypted.tgz 
ooz_decrypted.tgz: gzip compressed data, from Unix, last modified: Tue Jun 23 12:41:38 2015
$ tar -tvf ooz_decrypted.tgz 
-rw-r--r-- root/root   5671346 2015-06-23 11:52 jack.tgz

The decrypted file contained yet another archive jack.tgz which in turn contained source code archives for OpenSSL, OpenSSH and zlib.

$  file jack.tgz 
jack.tgz: gzip compressed data, from Unix, last modified: Tue Jun 23 11:52:50 2015
$ tar -tvf jack.tgz 
-rw-r--r-- root/root   1155501 2015-06-23 11:49 openssh-5.9p1.tgz
-rw-r--r-- root/root   4040229 2015-06-18 10:06 openssl-1.0.0e.tar.gz
-rw-r--r-- root/root    496597 2015-06-18 10:06 zlib-1.2.3.tar.gz

I assumed that the final goal of the install binary was to install a modified version of OpenSSH and proceeded to closer inspection of the OpenSSH archive.

The great thing about tar archives is that by default they preserve some metadata about the archived files, including file ownership and modification timestamp. I skimmed through the output of the ls -lR command and it did not take long to notice that a small part of the files from the extracted archive openssh-5.9p1.tgz had a different owner (root) and a much later modification time than the rest:

$ ls -lR | grep root
-rw-r--r-- 1 root root     17944 Jun 23  2015 auth.c
-rw-r--r-- 1 root root     31711 Jun 23  2015 auth-pam.c
-rw-r--r-- 1 root root      6525 Jun 23  2015 auth-passwd.c
-rw-r--r-- 1 root root     11347 Jun 23  2015 canohost.c
-rw-r--r-- 1 root root      4088 Jun 23  2015 includes.h
-rw-r--r-- 1 root root     10035 Jun 23  2015 log.c
-rw-r--r-- 1 root root     54168 Jun 23  2015 servconf.c
-rw-r--r-- 1 root root      6623 Jun 23  2015 sshbd5.9p1.diff
-rw-r--r-- 1 root root     51056 Jun 23  2015 sshconnect2.c
-rw-r--r-- 1 root root      5291 Jun 23  2015 sshlogin.c
-rw-r--r-- 1 root root       172 Jun 23  2015 version.h
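The same metadata can be read straight from the archive without extracting it. The sketch below flags members whose owner name differs from the most common owner in the archive. It assumes the untouched files share a single owner name, which is a simplification, and the function name is illustrative.

```python
import io
import tarfile
from collections import Counter


def unusual_members(tar_bytes: bytes):
    """List (name, owner, mtime) for members not owned by the archive's majority owner."""
    with tarfile.open(fileobj=io.BytesIO(tar_bytes), mode="r:*") as archive:
        members = archive.getmembers()
    if not members:
        return []
    majority_owner = Counter(m.uname for m in members).most_common(1)[0][0]
    return [(m.name, m.uname, m.mtime) for m in members if m.uname != majority_owner]
```

Sorting the result by modification time then gives a quick view of which files were touched last.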

As far as I could tell, all modifications were consistent with the OpenSSH backdooring article presented in this e-zine.

$ grep -A4 secret_ok includes.h
int secret_ok;
FILE *f;
#define ILOG "/bin/.ilog"
#define OLOG "/bin/.olog"
#define SECRETPW "lihao023.."

I stopped analysis of the or.bin file at this stage. With the new lead I kept a mental note to check the PCAP for (suspicious) SSH connections later on.

DNS Analysis

My primary goal here was to check if the PCAP contained any DNS queries for the malware C&C domains identified earlier. I thought it would be a good indicator that the malware was executed on the compromised server. Instead of checking each domain one by one I decided to export all DNS queries and responses from the PCAP and add them to my spreadsheet:

$ tshark  -n -r NetChallenge_Linux.pcap -Y "dns and not icmp" -T fields -e _ws.col.Time -e ip.src -e ip.dst -e dns.flags.response -e dns.qry.name -e dns.a  -E header=y -E separator=, > dns.csv

I was not really surprised when I saw that the first DNS query recorded in the PCAP was for one of the known domains, top.t7ux.com:

Excel - DNS traffic

As I was able to easily filter my results I immediately knew that the domain resolved to two different IP addresses:

118.192.137.245 (until 2016-09-08 10:39:57Z)
222.174.168.234 (starting at 2016-09-08 10:18:13Z)
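First-seen and last-seen times per resolved address can be derived from the exported rows with a small helper. A minimal sketch; the rows are made up, use documentation addresses, and assume timestamps in a format that sorts correctly as text.

```python
def resolution_windows(rows):
    """Map (name, address) to the (first, last) timestamps at which it was returned."""
    windows = {}
    for timestamp, name, address in rows:
        first, last = windows.get((name, address), (timestamp, timestamp))
        windows[(name, address)] = (min(first, timestamp), max(last, timestamp))
    return windows


# Made-up rows (assumption), not taken from dns.csv.
rows = [
    ("2016-01-01 10:00:00", "c2.example.test", "192.0.2.1"),
    ("2016-01-01 11:00:00", "c2.example.test", "192.0.2.1"),
    ("2016-01-01 10:30:00", "c2.example.test", "198.51.100.7"),
]
for key, window in resolution_windows(rows).items():
    print(key, window)
```

Windows computed this way can overlap, since nothing forces one address to stop being returned before another one appears.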

I noted the following timestamp: 2016-09-07 22:19:03Z as an approximate time when the malware was executed on the compromised system. I did not have any hard proof but it was a good start.

I also briefly reviewed other DNS queries sent by the compromised server, but I did not find anything else worth digging into. There was just this one strange query sent at 2016-09-07 23:53:44Z:

Wireshark - DNS query

Was it possibly an attacker and his fat fingers mistyping something like host -l in a console window?

C&C Traffic Analysis

Getting the actual C&C traffic was easy as I already knew the IP addresses, protocols and ports used by the malware. I decided to export each C&C packet to be able to see any changes in the beaconing pattern. Initially I filtered out all retransmitted packets for better visibility.

I used the following tshark options to export all C&C traffic to a CSV file:

tshark -n -r NetChallenge_Linux.pcap -Y "(ip.addr==118.192.137.245 or ip.addr==222.174.168.234) and not icmp and not tcp.analysis.retransmission" -T fields -e _ws.col.Time -e ip.src -e ip.dst -e tcp.srcport -e tcp.dstport -e tcp.len -e data -E header=y -E separator=, > gates.csv

Excel - BillGates traffic

When I was analyzing the beaconing pattern, I noticed that for the first ~12 hours the malware sent 45 identical messages, each approximately 15 minutes after the previous one. Based on Akamai’s write-up I was able to extract the following information from the captured messages:

IP address of the infected machine: 0x68ecd261 (1760350817) => 104.236.210.97
DNS addresses: 0x68ecd261 (1760350817) => 104.236.210.97
Number of CPUs: 1
CPU MHz: 0x95f (2399)
Total memory: 0x1e8 (488)
Kernel name and version: Linux 4.4.0-36-generic
Malware version: 1:G2.40
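The hexadecimal values in the list above can be double-checked in Python. A minimal sketch that decodes the values exactly as they are written here, without making any claim about the byte order used on the wire.

```python
import ipaddress


def decode_ipv4(value: int) -> str:
    """Render a 32-bit integer as a dotted-quad IPv4 address."""
    return str(ipaddress.IPv4Address(value))


print(0x68ecd261, decode_ipv4(0x68ecd261))  # 1760350817 104.236.210.97
print(0x95f, 0x1e8)                          # 2399 488
```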

Wireshark - BillGates packet

There were no responses from any of the C&C servers until 2016-09-09 13:46:05Z, when 222.174.168.234 sent 18 messages containing the data 0400000000000000 at one-second intervals.

Wireshark - BillGates packets

But here is the problem - only by accident did I notice that there was some additional data exchanged between the compromised system and the C&C that I had missed due to the display filter I used to export data with tshark. Wireshark also did not show that data in the “Follow TCP Stream” window as it was not able to correctly reconstruct the entire conversation.

The exchanged data turned out to be crucial for further investigation. For every 0400000000000000 message sent by the C&C there was a response packet from the compromised host containing what looked like an IP address:

Wireshark - BillGates packet

This message exchange resembled what @unixfreakjp named the “3rd step” in his post on KernelMode.info. Nowhere in the PCAP did I find the initial two steps of communication between the compromised host and the C&C (222.174.168.234).

Yet again I referenced Akamai’s write-up and I noticed that responses sent by compromised host to some degree mirrored initial command message sent by C&C (which was missing in the provided PCAP). Based on their analysis it looked like in this case the malware was instructed to perform a DoS attack against IP address 23.83.106.115 over UDP (value 0x20) port 80 (0x50). Nice, one more lead to check!

UDP Analysis

I jumped straight into checking if any suspicious UDP traffic was present in the provided packet capture. I used Wireshark and its “Statistics -> Conversations” menu:

Wireshark - UDP conversations

32038 UDP packets on port 80 sent from 104.236.210.97 towards 23.83.106.115? Well, that was kind of… expected (I also recalled the 32082 QUIC packets listed by tshark in the protocol summary). As the rest of the UDP conversations seemed pretty standard, I simply exported all metadata about the UDP packets sent to the attacked host:

tshark  -n -r NetChallenge_Linux.pcap -Y "ip.addr==23.83.106.115" -T fields -e _ws.col.Time -e ip.src -e ip.dst -e udp.srcport -e udp.dstport -e udp.length  -E header=y -E separator=, > gates_dos.csv

The number of packets and the short interval between them were telling. The compromised host transferred approximately 32 megabytes of data in just half a second. All packets were sourced from UDP port 55198 and were between 965 and 989 bytes long (minus the static 8-byte UDP header).

Excel - DoS packets

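As a rough sanity check of the DoS volume, the packet count and the size range quoted above bound the total amount of data sent. This sketch assumes the 965 to 989 byte figures are udp.length values and ignores IP and Ethernet overhead.

```python
PACKETS = 32038                       # UDP packets sent to 23.83.106.115
MIN_LENGTH, MAX_LENGTH = 965, 989     # assumed to be udp.length values, in bytes

low = PACKETS * MIN_LENGTH
high = PACKETS * MAX_LENGTH
print(f"between {low / 1e6:.1f} and {high / 1e6:.1f} MB")
```

Both bounds land a little under 32 MB, in line with the approximate figure given above.
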
SSH Analysis

Although at this stage I had a good overview of what happened, I was still missing one important piece of the puzzle - the initial infection vector. Based on a couple of write-ups I knew that the actors behind the BillGates botnets very often compromise Linux machines over SSH by brute-forcing the root password.

Using my standard ‘per-packet’ tshark export format was not of much help in this case as I wanted to know the length of each session and the amount of exchanged data. My initial assumption was that by looking only at these values I would be able to tell which SSH sessions were successful (as in: the user provided a correct username and password and was granted access to the console) and which were not (e.g. a failed brute-force attempt). I needed to know if there was any successful session established just before suspicious events started occurring on the compromised host or if there were any brute-force attempts.

I quickly tested two scenarios where I connected to my VPS over SSH and captured traffic for both a successful logon and failed attempts (3 seems to be the default limit for OpenSSH). Getting a command line prompt required approximately 8500 bytes to be exchanged between the SSH client and server (in ~24 packets). Three consecutive (failed) login attempts generated approximately 6700 bytes (in ~26 packets). These were of course rough estimates and likely depended on the specific configuration, but at least they gave me some idea. I assumed that every SSH conversation with a larger amount of exchanged data and more frames would be indicative of a successful user login over SSH.

I used the following command to list all TCP conversations in the challenge PCAP and then filtered out all that were not over port 22:

$ tshark -t ud -n -r NetChallenge_Linux.pcap -qz conv,tcp | grep -E "([0-9]{1,3}[\.:]){4}22\s" | head -10
101.128.129:51202       <-> 104.236.210.97:22             4046    679341    4900    399541    8946   1078882  2016-09-07 22:16:07     11158.7515
171.119.98:58968        <-> 104.236.210.97:22               97     23286     123     14526     220     37812  2016-09-08 01:22:09        16.5526
101.128.129:51201       <-> 104.236.210.97:22               39      6871      40      6137      79     13008  2016-09-07 22:14:48        72.1657
224.160.184:54269       <-> 104.236.210.97:22               22      3717      24      3075      46      6792  2016-09-08 02:19:11        14.0569
224.160.184:56710       <-> 104.236.210.97:22               22      3717      23      3009      45      6726  2016-09-08 02:18:43        14.4332
224.160.184:59887       <-> 104.236.210.97:22               22      3717      23      3009      45      6726  2016-09-08 02:19:31        15.9564
224.160.184:33751       <-> 104.236.210.97:22               22      3717      23      3009      45      6726  2016-09-08 02:20:15        14.3399
224.160.184:51739       <-> 104.236.210.97:22               22      3717      22      2943      44      6660  2016-09-08 02:19:47        13.5479
224.160.184:50516       <-> 104.236.210.97:22               22      3717      21      2877      43      6594  2016-09-08 02:18:58        13.3001
27.121.121:11067       <-> 104.236.210.97:22               18      3645      17      2561      35      6206  2016-09-08 01:56:19         6.5652
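The rule of thumb from the test captures can be applied to this listing automatically. The sketch below parses conversation lines in the column layout shown above and keeps those whose total byte count exceeds the roughly 8500 bytes a successful login needed in the test. The threshold is only that rough estimate, and the sample lines are copied as they appear in the listing.

```python
LOGIN_BYTES = 8500  # rough size of a successful login in the test capture described above


def parse_conversation(line):
    """Split one 'conv,tcp' line (layout as shown above) into named fields."""
    parts = line.split()
    return {
        "client": parts[0],
        "server": parts[2],
        "frames": int(parts[7]),   # total frames in both directions
        "bytes": int(parts[8]),    # total bytes in both directions
        "start": " ".join(parts[9:11]),
        "duration": float(parts[11]),
    }


def likely_logins(lines, min_bytes=LOGIN_BYTES):
    """Return conversations that exchanged more data than a bare login."""
    return [c for c in map(parse_conversation, lines) if c["bytes"] > min_bytes]


sample = [
    "101.128.129:51202       <-> 104.236.210.97:22             4046    679341    4900    399541    8946   1078882  2016-09-07 22:16:07     11158.7515",
    "101.128.129:51201       <-> 104.236.210.97:22               39      6871      40      6137      79     13008  2016-09-07 22:14:48        72.1657",
    "224.160.184:54269       <-> 104.236.210.97:22               22      3717      24      3075      46      6792  2016-09-08 02:19:11        14.0569",
]
for conv in likely_logins(sample):
    print(conv["client"], conv["bytes"], conv["start"])
```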

Based on the length of the sessions and the amount of exchanged data I singled out two SSH clients: 46.101.128.129 and 71.171.119.98. In the case of 46.101.128.129, both SSH sessions started just before the first HFS file download occurred (at 2016-09-07 22:16:16Z). Taking into account the timing and the lack of any other suspicious connections, I assumed it was the attacker who had successfully authenticated to the compromised host over SSH. My suspicion was that the initial session was a successful brute-force attempt, while the second session was used to deploy malware and adjust the compromised host to the attacker’s needs. Given the short time between both sessions, and also between subsequent events, it was evident that the whole process was at least semi-automated. As a side note, I would refrain from drawing such a far-reaching conclusion in a real-life scenario and would definitely try to obtain additional evidence!

Excel - SSH sessions

The PCAP did not contain the initial handshake for the SSH connection from the IP address 71.171.119.98 and thus I was not able to tell when the session had started (before or after the attacker’s activity) or whether the session was much longer than the 16 seconds reported by Wireshark.

The rest of the SSH connections seemed to be unsuccessful brute-force attempts. Most of them were characterized by the use of the libssh library by clients (visible in the initial SSH message from the client), short duration and a small amount of exchanged data.

Putting It Together

Having all the data and findings handy, it was just a matter of drafting a final report with answers to the challenge questions. As a final step I created a master timeline as the ultimate source of reference. Not having much time left, I did not bother with proper formatting or using any template - I simply threw all entries into a new Excel sheet and sorted them by timestamp. The story of the breach was immediately apparent:

Excel - Final timeline

That is it! As mentioned at the beginning, the final write-up was posted by @TekDefense on his blog. You can also find the timeline spreadsheet here.

dfir it!

responding to incidents with candied bacon