dfir it!

responding to incidents with candied bacon

Forensic Case Studies - Carving and Parsing Solaris WTMPX Files

A few weeks back I was analyzing a raw partition image from a Solaris 10 (SPARC) system, trying to determine from the wtmpx files who had logged into the system, from which remote IP addresses, and when. To be more precise, I was tracking the nagios account that was used to compromise this machine. The problem was that the file system had been completely wiped: all files were gone.

Fortunately, this was done at the filesystem level with the rm -rf / command, which removes directory entries but does not overwrite the underlying data blocks.

elceef@cerebellum:~$ sudo mount -o loop,ro -t ufs dd_nj090240-var /mnt
elceef@cerebellum:~$ ll -R /mnt
/mnt:
total 6
drwxr-xr-x  3 root sys  1024 jul  1 02:43 ./
drwxr-xr-x 30 root root 4096 aug  4 09:49 ../
drwxr-xr-x  2 root sys   512 jun 28 13:25 run/

/mnt/run:
total 2
drwxr-xr-x 2 root sys  512 jun 28 13:25 ./
drwxr-xr-x 3 root sys 1024 jul  1 02:43 ../

This means the data should still be there. But how to recover it?

Solaris wtmpx file format

Solaris keeps login records in /var/adm/wtmpx, which is similar in purpose to /var/log/wtmp on Linux but uses an incompatible format. The system is also based on the SPARC architecture, which is big-endian, so in contrast to Intel x86 (little-endian) multi-byte integers are stored most significant byte first. This means we cannot use native Linux tools such as last to parse a wtmpx file from Solaris. To recover the records we need to know their exact structure. The easiest way to understand the format is to look at the source code of programs that read and write wtmpx files. Since the target system is Solaris, the format is most likely defined in the /usr/include/utmpx.h C include file.
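To make the byte-order point concrete, here is a small Python sketch. The two bytes are the ut_syslen value (00 0e) visible at offset 0x70 of the hex dump further down; the variable names are purely illustrative.

```python
import struct

# Two raw bytes as they appear on disk in a SPARC (big-endian) wtmpx record.
raw = bytes([0x00, 0x0E])

# '>' selects big-endian, '<' little-endian; 'h' is a signed 16-bit integer.
as_big_endian = struct.unpack('>h', raw)[0]
as_little_endian = struct.unpack('<h', raw)[0]

print(as_big_endian)     # 14
print(as_little_endian)  # 3584
```

Read the right way round, the value is 14, which matches the length of the IP address string stored in ut_host. A little-endian reader would see 3584 instead.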

Here is an excerpt from Solaris 10’s utmpx.h:

struct timeval32
{
  int tv_sec, tv_usec;
};

struct futmpx {
  char    ut_user[32];      /* user login name */
  char    ut_id[4];         /* inittab id */
  char    ut_line[32];      /* device name (console, lnxx) */
  pid32_t ut_pid;           /* process id */
  int16_t ut_type;          /* type of entry */
  struct {
    int16_t e_termination;  /* process termination status */
    int16_t e_exit;         /* process exit status */
  } ut_exit;                /* exit status of a process */
  struct timeval32 ut_tv;   /* time entry was made */
  int32_t ut_session;       /* session ID, used for windowing */
  int32_t pad[5];           /* reserved for future use */
  int16_t ut_syslen;        /* significant length of ut_host */
  char    ut_host[257];     /* remote host name */
};
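
The on-disk size of this structure is not simply the sum of its members, because the compiler aligns each member. The following is a rough sketch of that calculation. It assumes natural alignment (each integer aligned to its own size, character arrays unaligned), which is an assumption about the SPARC ABI rather than something the header states. The FIELDS table and the helper names are made up for illustration.

```python
# (name, size in bytes, alignment in bytes) for each member of struct futmpx.
FIELDS = [
    ("ut_user", 32, 1),
    ("ut_id", 4, 1),
    ("ut_line", 32, 1),
    ("ut_pid", 4, 4),
    ("ut_type", 2, 2),
    ("ut_exit", 4, 2),      # two int16_t members
    ("ut_tv", 8, 4),        # struct timeval32: two 32-bit ints
    ("ut_session", 4, 4),
    ("pad", 20, 4),         # int32_t pad[5]
    ("ut_syslen", 2, 2),
    ("ut_host", 257, 1),
]

def align_up(offset, alignment):
    """Round offset up to the next multiple of alignment."""
    return (offset + alignment - 1) // alignment * alignment

def layout(fields):
    """Return ({name: offset}, total size) under natural alignment."""
    offsets = {}
    offset = 0
    widest = 1
    for name, size, alignment in fields:
        offset = align_up(offset, alignment)
        offsets[name] = offset
        offset += size
        widest = max(widest, alignment)
    # The whole struct is padded to a multiple of its widest alignment.
    return offsets, align_up(offset, widest)

OFFSETS, TOTAL_SIZE = layout(FIELDS)
for name, _, _ in FIELDS:
    print("%-10s 0x%02x" % (name, OFFSETS[name]))
print("total", TOTAL_SIZE)
```

Under these assumptions the total comes out at 372 bytes, with ut_tv at offset 0x50 and ut_host at 0x72, which agrees with the hex dump of a carved record shown later.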

Data carving

Each wtmpx entry is exactly 372 bytes long (aligned to 4 bytes!) and starts with a username stored in a 32-byte, NUL-padded field. Based on this information we can create a pattern for scalpel, a well-known file carving utility. In this case, we want scalpel to scan for a specified string of bytes (the header) and then save a 372-byte chunk of data starting at each match. The header used here is the username padded with NUL bytes to the full width of ut_user, followed by \x74\x73 ("ts"), the first two bytes of ut_id in these records. If you want to learn more about the configuration file syntax, I encourage you to review the manual page or the configuration file itself, where you will find many examples.

wtmpx y 372 nagios\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x74\x73
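
Typing that many \x00 escapes by hand is error-prone, so a rule like this can also be generated. This is only a sketch: scalpel_rule is a hypothetical helper, and the "ts" suffix assumes the sessions of interest have a ut_id starting with those two bytes, as the records in this case do.

```python
USER_FIELD_SIZE = 32   # sizeof(ut_user)
RECORD_SIZE = 372      # bytes to carve per match

def scalpel_rule(username, id_prefix="ts"):
    """Build a rule line: extension, case-sensitivity flag, size, header."""
    if len(username) > USER_FIELD_SIZE:
        raise ValueError("username is longer than the ut_user field")
    # Pad the username with \x00 escapes up to the width of ut_user.
    padding = "\\x00" * (USER_FIELD_SIZE - len(username))
    # Append the expected start of ut_id as hex escapes.
    suffix = "".join("\\x%02x" % ord(ch) for ch in id_prefix)
    return "wtmpx y %d %s%s%s" % (RECORD_SIZE, username, padding, suffix)

print(scalpel_rule("nagios"))
```

For "nagios" this prints the username, 26 \x00 escapes and \x74\x73, in the same layout as the rule above.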

Let’s run it on the partition image and see the results!

elceef@cerebellum:~$ scalpel -o scalpel_out/ -O dd_nj090240-var 
Scalpel version 1.60
Written by Golden G. Richard III, based on Foremost 0.69.

Opening target "/home/elceef/dd_nj090240-var"

Image file pass 1/2.
dd_nj090240-var: 100.0% |***************************************************************************************************|   20.0 GB    00:00 ETA
Allocating work queues...
Work queues allocation complete. Building carve lists...
Carve lists built.  Workload:
wtmpx with header 
"\x6e\x61\x67\x69\x6f\x73\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x00\x74\x73" and 
footer "" --> 8 files
Carving files from image.
Image file pass 2/2.
dd_nj090240-var: 100.0% |***************************************************************************************************|   20.0 GB    00:00 ETA
Processing of image file complete. Cleaning up...
Done.
Scalpel is done, files carved = 8, elapsed = 67 seconds.

After a minute the tool carved eight files out of the image.

elceef@cerebellum:~/scalpel_out$ ll
total 44
drwxrwxr-x 2 elceef elceef 4096 aug 16 05:39 ./
drwxrwxr-x 4 elceef elceef 4096 aug 16 05:36 ../
-rw-rw-r-- 1 elceef elceef  372 aug 16 05:39 00000000.wtmpx
-rw-rw-r-- 1 elceef elceef  372 aug 16 05:39 00000001.wtmpx
-rw-rw-r-- 1 elceef elceef  372 aug 16 05:39 00000002.wtmpx
-rw-rw-r-- 1 elceef elceef  372 aug 16 05:39 00000003.wtmpx
-rw-rw-r-- 1 elceef elceef  372 aug 16 05:39 00000004.wtmpx
-rw-rw-r-- 1 elceef elceef  372 aug 16 05:39 00000005.wtmpx
-rw-rw-r-- 1 elceef elceef  372 aug 16 05:39 00000006.wtmpx
-rw-rw-r-- 1 elceef elceef  372 aug 16 05:39 00000007.wtmpx
-rw-rw-r-- 1 elceef elceef  963 aug 16 05:39 audit.txt
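
If scalpel is not at hand, the same idea can be approximated in a few lines of Python. This is a minimal sketch, not a replacement for a real carver: carve_records and carve_file are hypothetical names, and the search matches only on the NUL-padded username, without the extra "ts" bytes, so it is slightly looser than the scalpel rule.

```python
import mmap

RECORD_SIZE = 372
USER_FIELD_SIZE = 32

def carve_records(image, username):
    """Yield (offset, record) for every match of the NUL-padded username."""
    header = username.encode("ascii").ljust(USER_FIELD_SIZE, b"\x00")
    position = image.find(header)
    while position != -1:
        record = image[position:position + RECORD_SIZE]
        if len(record) == RECORD_SIZE:   # skip a match cut off by end of image
            yield position, bytes(record)
        position = image.find(header, position + 1)

def carve_file(path, username):
    """Memory-map a (possibly very large) image and carve from it."""
    with open(path, "rb") as handle:
        with mmap.mmap(handle.fileno(), 0, access=mmap.ACCESS_READ) as image:
            return list(carve_records(image, username))

# Self-check on synthetic data instead of a real partition image.
fake_record = b"nagios".ljust(RECORD_SIZE, b"\x00")
fake_image = b"\xff" * 100 + fake_record + b"\xff" * 50 + fake_record
found = list(carve_records(fake_image, "nagios"))
print([offset for offset, _ in found])  # [100, 522]
```

Keeping the offset of each hit is useful later, because it shows whether two identical records came from different places on disk.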

Here is a hex dump of the first carved file:

elceef@cerebellum:~/scalpel_out$ hexdump -C 00000000.wtmpx 
00000000  6e 61 67 69 6f 73 00 00  00 00 00 00 00 00 00 00  |nagios..........|
00000010  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000020  74 73 2f 31 70 74 73 2f  31 00 00 00 00 00 00 00  |ts/1pts/1.......|
00000030  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000040  00 00 00 00 00 01 b9 00  00 07 00 00 00 00 00 00  |................|
00000050  55 95 ac bb 00 0a d1 5d  00 00 00 00 00 00 00 00  |U......]........|
00000060  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
00000070  00 0e 31 30 2e 32 30 30  2e 31 32 35 2e 31 34 32  |..10.200.125.142|
00000080  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
*
00000174

The entries look valid. We can easily spot the account name, the terminal line and the source IP address the session originated from. Other important pieces of the puzzle, the timestamp and the event type, are stored as binary integers and cannot be read directly from the dump. We need a parser that extracts the details of each event (entry), similar to what the last command does.

Parsing

I created a quick and dirty Python script that relies mostly on the struct module to handle binary data. This module has a function called unpack(), designed to parse structured binary data according to a given format. Format strings specify the expected layout when unpacking data. They are built up from format characters, which specify the type and size of the data being unpacked. I strongly encourage you to review the documentation for the struct module first to better understand the meaning of the format characters.

It is worth mentioning that I had to add pad bytes to the format string to reproduce the alignment of the futmpx struct. The members themselves add up to only 369 bytes: the compiler inserts two pad bytes before ut_tv and one after ut_host, giving 372. Don't be surprised if your own sum of the field sizes does not match sizeof(struct futmpx); this is simply how data structures are laid out in memory.
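
As a worked example of format characters and pad bytes, the sketch below decodes just the 20 bytes at offsets 0x44 to 0x57 of the carved record in the hex dump above. The field split follows the futmpx excerpt; the format string here is a shortened illustration, not the one used for full records.

```python
import struct
from datetime import datetime, timezone

# Bytes 0x44-0x57 of the first carved record, copied from the hex dump.
raw = bytes.fromhex(
    "0001b900"   # ut_pid
    "0007"       # ut_type
    "0000"       # ut_exit.e_termination
    "0000"       # ut_exit.e_exit
    "0000"       # two alignment pad bytes
    "5595acbb"   # ut_tv.tv_sec
    "000ad15d"   # ut_tv.tv_usec
)

# '>' = big-endian with no implicit alignment, 'i' = int32, 'h' = int16,
# '2x' = skip two pad bytes (they produce no output value).
pid, ut_type, e_termination, e_exit, tv_sec, tv_usec = struct.unpack(
    '>i h h h 2x i i', raw)

print(pid, ut_type, tv_sec, tv_usec)   # 112896 7 1435872443 708957
print(datetime.fromtimestamp(tv_sec, tz=timezone.utc))   # 2015-07-02 21:27:23+00:00
```

Type 7 is USER_PROCESS, and the timestamp is shown here in UTC, so it will differ from local-time output by the offset of the machine doing the analysis.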

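#!/usr/bin/env python

import datetime
import struct
import sys

# Big-endian layout of struct futmpx. The two 'b' fields after the exit status
# and the trailing 'b' are alignment padding; the record is 372 bytes in total.
FUTMPX = struct.Struct('>32s 4s 32s i H H H b b I I I 5I H 257s b')

UT_TYPES = {
  0: 'EMPTY',
  1: 'RUN_LVL',
  2: 'BOOT_TIME',
  3: 'NEW_TIME',
  4: 'OLD_TIME',
  5: 'INIT_PROCESS',
  6: 'LOGIN_PROCESS',
  7: 'USER_PROCESS',
  8: 'DEAD_PROCESS',
  9: 'ACCOUNTING'
}

def type_name(x):
  # Renamed from type() so the built-in is not shadowed.
  return UT_TYPES.get(x, 'UNKNOWN')

def text(raw):
  # Fixed-width fields are NUL padded; keep only the part before the first NUL.
  return raw.split(b'\x00', 1)[0].decode('ascii', 'replace')

with open(sys.argv[1], 'rb') as data:
  while True:
    chunk = data.read(FUTMPX.size)
    if len(chunk) < FUTMPX.size:
      break  # end of file, or a truncated trailing record
    unpacked = FUTMPX.unpack(chunk)
    #TODO: timezone (fromtimestamp() uses the local zone of the analysis machine)
    timestamp = datetime.datetime.fromtimestamp(unpacked[9]).strftime('%Y-%m-%d %H:%M:%S')
    # Columns: user, pid, line, timestamp, host, entry type
    print('\t'.join([text(unpacked[0]), str(unpacked[3]), text(unpacked[2]),
                     timestamp, text(unpacked[18]), type_name(unpacked[4])]))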

Now it’s time to see this code in action. My script takes only a single file as an argument, so I use the following command-line kung-fu to parse all files (in this case, single wtmpx entries) at once and sort them by timestamp:

elceef@cerebellum:~$ for i in $(ls *.wtmpx); do readutmpx.py $i; done | tr -d '\000' | sort -k 4
nagios    257138  pts/1   2015-02-14 04:12:35 10.212.6.160    USER_PROCESS
nagios    257138  pts/1   2015-02-14 04:12:35 10.212.6.160    USER_PROCESS
nagios    257138  pts/1   2015-02-14 04:15:35     DEAD_PROCESS
nagios    257138  pts/1   2015-02-14 04:15:35     DEAD_PROCESS
nagios    112896  pts/1   2015-07-02 17:27:23 10.200.125.142  USER_PROCESS
nagios    112896  pts/1   2015-07-02 17:27:23 10.200.125.142  USER_PROCESS
nagios    112896  pts/1   2015-07-02 17:32:32     DEAD_PROCESS
nagios    112896  pts/1   2015-07-02 17:32:32     DEAD_PROCESS

Works like a charm! But there is still room for improvement. The code prints timestamps in the local time zone of the analysis machine, not in the time zone of the compromised system. Take this into account before building a timeline.
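
One way to take the guesswork out of the time zone is to render every timestamp with an explicit offset. The sketch below is only an illustration: format_timestamp is a hypothetical helper, and it uses a fixed UTC offset with no daylight saving handling, so the offset of the compromised system at the time of each event still has to be established separately.

```python
from datetime import datetime, timedelta, timezone

def format_timestamp(tv_sec, utc_offset_hours=0):
    """Render an epoch value at a fixed UTC offset (no DST handling)."""
    zone = timezone(timedelta(hours=utc_offset_hours))
    moment = datetime.fromtimestamp(tv_sec, tz=zone)
    return moment.strftime('%Y-%m-%d %H:%M:%S %z')

# tv_sec of the first carved record (0x5595acbb in the hex dump).
print(format_timestamp(1435872443))       # 2015-07-02 21:27:23 +0000
print(format_timestamp(1435872443, -4))   # 2015-07-02 17:27:23 -0400
```

Printing the offset next to each value makes it obvious which zone a timeline entry is expressed in when records from several sources are merged.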

Happy end

Solving this case would not have been possible without this technique. The compromised system was configured to log only unsuccessful authentication attempts, which left the wtmpx records as the only reliable source of information about the origin of the attack. The person responsible for destroying this system was too confident: deleting all files is not enough to cover your tracks. The personal details of this individual are now known and the case is closed. Cheers!
