APIs Archives – unsafehex

20,000 Leagues Under The Sand, part 4

When running malware in a virtual machine sandbox, proper management of the VM is imperative to prevent (unwanted!) contamination. You may already know that it is good practice to establish a clean state with a snapshot prior to running a potential nasty, so that you can simply restore it to get back to a known good state. It’s generally pretty intuitive to do this in a hypervisor’s GUI. It’s also pretty obvious how to run the malware you’re interested in – double click and hey presto, malware happens. But what do you do if you want all that to take place without you in the driving seat?

Hypervisor APIs

There are three core elements to automating a sandbox:

Controlling the guest VM’s state
Interacting with the guest
Capturing information from the guest

Fortunately, automating virtual machines is a requirement for far more than just the niche world of malware analysts. For nearly every function you could imagine, there is a means of controlling it with code instead of a GUI. You may have already noted that for my sandbox I chose QEMU/LibVirt, and one of the core reasons was the extensive resources for controlling it in the language I am most comfortable with, Python. If you are more partial to other languages, you can also choose from C, C++, C#, Go, Java, OCaml, Perl, PHP and Ruby.

Other hypervisors also have decent APIs; VirtualBox supports C++, SOAP (yuck), Java, and Python. Hyper-V is (naturally) controlled with Powershell. And so on and so forth.

Hypervisor APIs are primarily designed around the first of the three core elements (control), though there are some aspects for interaction and information capture available also. So to begin with the VM state, let us consider what control we might need. Since we want to make sure our results are relevant to the particular malware we have selected, we must be able to place the VM into a clean state. It is also sensible to only have the VM active when we are actually using it, so pausing/unpausing is also desirable (a cold boot might work, but you would either have to devise a means of logging in, or configure the VM for automatic login; plus it wastes time). These options are both possible through the LibVirt APIs.

Guest interaction

Two items involving guest interaction are essential to automate the testing of malware:

Deliver the sample to the guest
Execute the sample

You must transfer the sample to the guest’s file system. This can either be done from the host, or from the guest. It is theoretically possible to write directly to the filesystem, though this is strongly advised against for running VMs as it can cause corruption. Exposing a share with write permissions to the host is another option. The reverse can be done from host to guest (also not recommended). In my case I have chosen to cause the guest to download the file from a HTTP server exposed on the host’s virtual network interface. This is done with a small service running on the guest¹.

Running the sample can be done in a few ways. One that I experimented with was via a command:

cmd /c start C:\Users\<user>\Desktop\malware.exe

This should cause the file to be started with its default program and parameters. However, my results with this method were extremely unreliable, particularly with Java .jar files. It may have been possible to find out what was breaking things and fix it, but after a few weeks I was just tired of it and decided to try something else. What I wanted instead was for something that I could guarantee would work without fail. Enter VNC.

VNC is a protocol for remotely interacting with the graphical interface of a system. LibVirt comes with VNC as one of the options for interacting with guests; and handily there is a python library with which you can control VNC. Using this allowed me to send mouse movements and clicks, launching the file just as a user would. I should note here that the default protocol for interacting in LibVirt, Spice, is also capable of automation with python; however all of the resources I was finding when starting out helped me to get VNC working and I have not investigated the alternative at this point.

What we are doing here is not just executing the malware – we are simulating a user interacting with the system. This is important, because there is plenty of malware around that pays attention to what the user input is doing and will decide not to play ball if, for example, the mouse is not moving. I have also seen examples in which the malware will check for noticeable changes in the display and hide if it does not change – so just clicking empty bits of desktop is not going to help. Other samples might only become active if you visit the website of a bank (or any site the author is interested in – but mainly I have heard this in relation to banking malware). Capturing the activity of malware that does these things make simulating a variety of actions important.

Python code to interact with VM using VNC

VNC interaction in python

When simulating activity it is important to be aware of the limitations. If you are driving a sandbox, looking at a screen, you can react to what you see and adapt your actions. If a program has not finished running or a website has not loaded, you know to wait. You know what part of the screen is a login button for you to click, you know if there is a pop-up message that you have to approve or deny before progressing. A script controlling a VNC mouse and keyboard – unless you do some extraordinarily ambitious work with image recognition – has no concept of these things; you must carefully tailor and test your scripted actions to take account of them. Even having considered these things, my sandbox sometimes has problems; I believe some of the time this is down to hardware resource limitations – although I have programmed pauses at moments I expect something to be loading, if something else on my host decides it needs CPU time and slows everything down, the pause I’ve created might not be enough. This is just one of the possible reasons but hopefully it illustrates that the issues can strike from unexpected directions.

I hope this has been informative; the next post discusses automatic collection of artifacts and evidence from the malware you have just executed.

20,000 Leagues Under The Sand – Part 2

read part 1

As a newbie sandboxer, the biggest obstacle for me was finding a way of getting in-depth information on what actions were being performed by malware I wanted to test. In particular, I wanted to be able to drop some samples, go away and make lunch, then come back and be looking at some results. That meant stepping through it in a debugger was out, or at least a lesson for another day. You’ve probably already seen that I ended up using Sysmon, but let’s have a look at the alternatives for a moment.

Built in Windows logging

Windows 10 and Server 2016 have the option of enabling process audit logging. This will capture the command line that caused a process to be launched which is a very useful bit of information.
Registry auditing is possible, and can be managed by group policy or manual editing of the registry.

Filesystem forensics

The files in C:\Windows\Prefetch\ can show if executables were run
The AppCompatCache registry key and AmCache.hve hive contain more detailed information on program execution, though neither logs individual execution instances or command line options
You can diff the filesystem – have a clean copy, either of the Master File Table or of the entire structure – and compare to see what’s changed; this is a fairly intensive operation, especially if you intend to see if a known good file has been replaced with a malicious version
There are tools for parsing registry hives so identifying new/modified keys is possible

Creating your own API call logging

If you’re a good enough programmer to write code that logs API calls, this is the gold standard. I am not (yet) up to this. It is possible to monitor for most of the interesting events such as process and file creation, registry modification etc. using filter drivers. If you want to go a step further and monitor (or even intercept and change) system calls, you need to be looking at DLL injection. This is the method used by Cuckoo sandbox, among many others.

Building monitoring in to the virtualisation

Technically this is all just code simulating hardware running other code. If you’re smart enough to modify a hypervisor so that it can recognise and log API calls within its guests, go for it. Please excuse me for thinking you’re a bit mad though!

Options #1, #2 and #4 hold an additional advantage of being difficult or impossible for sandbox evasion techniques to pick up on.

And then we get to Sysmon, which is in effect a version of #3, but it has a big advantage: somebody else did all the work for us! Hooray for Mark Russinovich and Thomas Garnier. Many sandboxes do API call monitoring; sometimes it can be a little bit excessively detailed (hello Cuckoo) but as far as understanding what malware is doing goes, it’s the bee’s knees. Let’s have a look at what you can get out of it.

Sysmon Process Created event

We’ll ignore for now how much my UI leaves to be desired. Here is perhaps the most commonly of interest event to you: Process Created. In this event you have a wealth of data including not only the location of the executable, launch command and parent processes, but the MD5 and SHA256 hashes of the file. You can also get the import hash here too – though I’d forgotten to turn it on for this run. You can see what ran, from where, by whom, and how it was run, in a glance.

Sysmon File Created event

Next up we can log the act of creating a file; in this case a trojan makes new copy of itself which is placed in C:\Users\<username>\System\Library\mshost.exe.

Sysmon Registry Value Set event

You can also monitor for interesting things happening in the registry. This is one of the primary methods by which malware achieves “persistence”, i.e. the ability to remain active on the system it infects. Here we can see a new entry being created in one of the user’s Run keys.

Sysmon Network Connection event

In a final example, Sysmon allows you to detect initiation of network connections; not only do we have the network level data of the destination IP and port captured, but the destination hostname is also identified.

In just four event types, Sysmon is able to record the malware starting, hiding itself, achieving persistence, and contacting its Command and Control server. This is the power of logging API calls. But wait – there’s more! This only scratches the surface of what Sysmon can do. It is also capable of identifying:

A process changing the creation time of a file
Process termination
Loading of drivers
Loading of additional modules in to existing processes
Creation of threads within other running processes
Raw access to the disk (as opposed to using the file system APIs)
Access to another process’s memory
Creation of alternate data streams
Use of named pipes (a method of communicating between processes)
Use of Windows Management Instrumentation

As you can see, it’s a fantastic tool which would be pretty hard to top if you decided to try doing this yourself. If you are thinking of experimenting with malware – or looking for something to help you keep a closer eye on your systems in general – I can’t recommend it enough.

In part 3 I will discuss the use of IDS and packet capture tools to get detailed information on the malware’s communication.

unsafehex

Tag Archives: APIs

Automating a sandbox: Guest VM control