
Angler knows when you’re fakin’ it

A brief introduction to Angler for those who are new to it: Angler is a framework of malicious code that criminals can purchase or rent and deploy on web servers they control (usually ones they have compromised rather than rented for themselves). It is typically made up of an initial component, injected into a site you are likely to visit, which redirects you to a second location known as the exploit kit landing page. The landing page has code which tests what features and capabilities your browser has, in order to identify whether you are vulnerable to any of the exploits the criminals have access to; this is known as “browser enumeration”. This page often contains the exploits themselves, but they can sometimes be called from another location. Finally, the exploit will contain commands that attempt to download and run a program, which could be anything: a remote access tool, ransomware, you name it. If you want to know a bit more about what comes out the business end of an Angler chain, there are many examples on places like Malware Don’t Need Coffee and Cisco Talos.

Angler has developed steadily, adding new features in its efforts to more effectively select vulnerable users, better evade detection, and increase the difficulty of analysing it. The creators have added polymorphism on all stages of the code so that it changes every time the page loads, making it much more difficult for antivirus and intrusion detection systems to recognise. There are up to three layers of obfuscation in the javascript. The most recent trick I have seen is a little call that doesn’t seem to do much in terms of sorting potential victims from the crowd, but made analysing it a bit more of a challenge.

1 raw page

The above image will be familiar to anyone who has investigated Angler. Dozens of variables with completely random names, and an eval to turn them into some valid code. Looks like this one has been inserted into a header PHP file of some sort as it’s appearing even before the opening HTML tags. This particular group becomes:

eval(String.fromCharCode.apply(null,document.getElementById("xndotnjmjwlj").innerHTML.split(" ")))

This gets the text from a hidden div, turns it from a series of character codes into text, and then evaluates it. The hidden div part is a bit new but nothing particularly special (previously it just put the text to decode directly in a string variable). That text becomes the following:

2 angler decoded div 1
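To reproduce that first decoding step offline rather than letting eval run it, the character codes from the hidden div can be converted by hand. The following is a minimal Python sketch of what String.fromCharCode.apply(null, text.split(" ")) does; the function name and the sample input are made up for illustration and are not taken from a real Angler page.

```python
def decode_char_codes(div_text: str) -> str:
    """Turn space-separated decimal character codes into a string.

    Rough equivalent of String.fromCharCode.apply(null, text.split(" ")).
    """
    # split() with no argument also tolerates newlines or repeated spaces
    return "".join(chr(int(code)) for code in div_text.split())


# Hypothetical sample input, standing in for the contents of the hidden div
sample = "97 108 101 114 116 40 49 41"
print(decode_char_codes(sample))  # alert(1)
```

The result is only ever printed, never evaluated, which is the point of doing this outside the browser.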

On line 14 it is getting the code from another div and then feeding it through a loop which does some maths on the characters and constructs another string. That doesn’t sound too complicated, but try as I might I just could not get this code to do its stuff, not until I started stepping through it and watching what values got assigned to the variables in each step; and as it turns out, the problem is on the very first line.

kaypkiafgibrwvp = (+[window.sidebar])

The lines that follow it create a loop with this value as the start, and test whether the strings “rv:11” and “MSIE” are present in the user agent, a fairly standard way of detecting what browser is coming to call. Why was this code not running properly? Presumably if it’s a for loop this code should start with kaypkiafgibrwvp being 0, so let’s check that bit; in fact you can try this out yourself. Bring up the developer console in your browser (Firefox: ctrl+shift+k; Chrome: ctrl+shift+i; IE: F12) and paste the expression above into the console. In IE, Chrome and Safari, you’ll get “0”. But in Firefox and some javascript debugging tools, you’ll get “NaN”. The difference comes down to whether window.sidebar exists: where it is undefined, the one-element array converts to an empty string and then to 0; where it is an object, as in Firefox, the array converts to a non-numeric string and then to NaN.
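For anyone who wants to see the two conversion steps behind that expression spelled out, here is a rough Python model of how javascript evaluates +[x] in the two cases that matter here. It is a sketch only: the function name is hypothetical, None stands in for undefined, and any other value stands in for an object whose string form is something like "[object Object]".

```python
import math


def js_unary_plus_on_array(element):
    """Rough model of javascript's +[element] for the two cases discussed.

    None models `undefined`; anything else models an object.
    """
    # Step 1: array to string. [undefined] becomes "", [object] becomes
    # a non-numeric string such as "[object Object]".
    as_string = "" if element is None else "[object Object]"
    # Step 2: string to number. An empty string becomes 0, and text that
    # is not a number becomes NaN.
    if as_string.strip() == "":
        return 0
    try:
        return float(as_string)
    except ValueError:
        return math.nan


print(js_unary_plus_on_array(None))      # window.sidebar undefined
print(js_unary_plus_on_array(object()))  # window.sidebar is an object
```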

That “NaN” breaks the code – it will never send your browser anywhere. So in order to get this code to turn out something useful, I’ll need to replace at least that value with 0 rather than the output of the expression. The other value that’s set here from browser variables, othddelxtnfae, is used as a modulus a bit further down so it’s necessary to make sure that one’s correct too; a little tinkering showed that the correct value for this variable was 2 – indicating the UA must contain “rv:11” or “MSIE10”, but not both.

3 final redirect formatted

And there you have it, an iframe hidden off the top of the page, going to the next part of Angler.

Given that the call to window.sidebar only eliminates one of the major browsers from contention it seemed unlikely to me that this would be the reason for including it – an anti-analysis technique seemed more likely. A bit of searching reveals that this is because some popular analysis tools depend on open source code from the Firefox javascript engine… and Trustwave basically wrote everything in this post two days ago. Lesson learned: analysis first, beauty sleep later.

Beginning your journey into javascript deobfuscation

Obfuscation of code is an interesting technique used by malware authors to conceal what their code is doing from anyone looking at it. Finding out what it does can be tricky, especially if you’re not sure how to start. So, when a wave of WordPress compromises broke out a few months back and Zscaler wrote a nice overview of what the compromised sites were trying to do, I thought it was an excellent opportunity to go through the javascript they quoted and show exactly how to unwrap it; especially given that it’s a relatively simple example.

The first thing you need to do is to find the page with the obfuscated javascript on it. I can’t tell you exactly how you should do this; it will depend on how you are encountering the issue – it could be that it’s your WordPress install that has been owned, or it could be that you’re seeing alerts from your intrusion detection or antivirus systems that you need to investigate – any number of reasons. Once you have identified the source of the alert (proceed very carefully from here on in!) you need to get the source code of the page that you’re investigating. Use something like curl or wget in Linux/OSX for safety. I would never suggest pointing a browser at it unless you really know what you’re doing.
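If Python is closer to hand than curl or wget, the same idea (download the bytes, render and execute nothing) can be sketched with the standard library. The function name, the User-Agent value and the injectable opener parameter are assumptions made for this example; the same caution applies, so run it from an isolated machine and only against a URL you have decided to investigate.

```python
import urllib.request


def fetch_raw_source(url, opener=urllib.request.urlopen, timeout=10):
    """Download a page as raw bytes without rendering or executing it.

    `opener` is injectable so the function can be exercised without
    touching the network.
    """
    # Hypothetical User-Agent; some malicious pages serve different
    # content depending on what the client claims to be.
    request = urllib.request.Request(url, headers={"User-Agent": "Mozilla/5.0"})
    with opener(request, timeout=timeout) as response:
        return response.read()
```

The returned bytes can then be written to a file and opened in a text editor, never in a browser.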

Obfuscated code tends to stand out to human eyes because it creates patterns that are very distinct from the code on the rest of the page. There are different styles, so this example won’t teach you to spot all of them but believe me when I say that once you’ve seen a few you will be able to rapidly scroll through a page and spot the blocks of evil code. In my example, the evil code looks like this:

1 Raw JS

Distinguishing features include:

  • large wall of text where code above and below is relatively short and spaced out
  • random looking letters that make no sense
  • lots of little groups of numbers
  • a “document.write” command
  • not seen here, but an “eval” command is often associated with malicious javascript – worth mentioning because this one is relatively rare in benign code whereas “document.write” is common.

None of these is evil in itself but they’re common things to see in evil javascript.
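Some of those features can be checked mechanically as a first pass. The sketch below is a hypothetical helper, not a detector: it just reports a few of the listed traits for a block of script so a human can decide where to look, and the regular expressions are deliberately simple.

```python
import re


def suspicious_features(script: str) -> dict:
    """Report a few of the traits listed above for one block of script."""
    lines = script.splitlines() or [""]
    return {
        # a very long line suggests a wall of text
        "longest_line": max(len(line) for line in lines),
        # count short standalone groups of digits
        "number_groups": len(re.findall(r"\b\d{1,3}\b", script)),
        "has_document_write": "document.write" in script,
        "has_eval": re.search(r"\beval\s*\(", script) is not None,
    }


# Made-up sample, not the code from the screenshot
sample = 'var a="1 2 3";document.write(unescape(a));'
print(suspicious_features(sample))
```

None of these values means anything on its own; they are only useful for ranking which blocks to read first.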

The next thing you need to do is break out the stuff between the script tags into a separate file so you can play around with it. I’m going to do my decoding using Python because that’s what I’m used to, but you can use pretty much any language for this. Find statements separated by semicolons and split them onto new lines so it’s easier to follow (for large, complicated blocks you might want to use a tool like JSPretty but this one is quite simple).

2 JS line by line
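Splitting on semicolons by hand gets tedious for anything longer than a few statements. Here is a minimal sketch of doing it in Python, assuming the script has no comments or regular expression literals; it leaves semicolons alone when they sit inside quotes or inside parentheses, so a for(...;...;...) header stays on one line. The function name is made up for this example.

```python
def split_statements(js: str) -> list:
    """Split javascript source into statements at top-level semicolons."""
    statements, current = [], []
    depth = 0        # parenthesis nesting, keeps for(;;) headers intact
    quote = None     # the quote character we are currently inside, if any
    escaped = False  # True when the previous character was a backslash
    for ch in js:
        current.append(ch)
        if quote:
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == quote:
                quote = None
        elif ch in "\"'":
            quote = ch
        elif ch == "(":
            depth += 1
        elif ch == ")":
            depth = max(depth - 1, 0)
        elif ch == ";" and depth == 0:
            statements.append("".join(current).strip())
            current = []
    tail = "".join(current).strip()
    if tail:
        statements.append(tail)
    return statements


# Made-up sample in the same general shape as the example
sample = 'var a="x;y";for(i=0;i<3;i++)b+=a;document.write(b);'
print("\n".join(split_statements(sample)))
```

For anything with nested functions or heavy formatting tricks, a proper beautifier is the better choice.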

Once you’ve done that you can start to get an idea of what is in there. In this example, there are variables ‘a’, ‘b’, ‘c’, ‘clen’, then a for loop, and finally an unescape statement. Breaking down the for loop further, it does the following:

  • iterates through every character in ‘a’
  • in each iteration, gets the numeric ASCII value of the character
  • bitwise XORs that value with 2
  • converts the resulting value into a new ASCII character
  • appends the new ASCII character to a string

This is something that converts very readily to Python, like so:

3 JS to python

Followed by the output:

4 decode stage 1 output
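Since the screenshots do not carry over as text, here is a minimal sketch of the loop described in the bullet points, written from that description rather than copied from the original script. The function name and the sample string are assumptions; because XOR with the same key is its own inverse, the sample is produced by encoding a known string with the same function.

```python
def xor_decode(text: str, key: int = 2) -> str:
    """XOR every character code with `key` and rebuild the string."""
    decoded = ""
    for ch in text:
        # character -> numeric value -> XOR with key -> character
        decoded += chr(ord(ch) ^ key)
    return decoded


# Hypothetical test data: encode a known URL-encoded fragment, then decode it
plain = "%3Ciframe%3E"
obfuscated = xor_decode(plain)
print(obfuscated)              # the scrambled form
print(xor_decode(obfuscated))  # %3Ciframe%3E
```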

This gives us a URL-encoded string (character codes preceded by % symbols). Python’s standard library includes urllib, which can decode this for us (in Python 3 the relevant function is urllib.parse.unquote).

5 url encoding handling
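As a sketch of that step in Python 3, the percent-encoded text can be passed to urllib.parse.unquote. The encoded string below is a made-up stand-in that points at a placeholder host, not the real output from this sample.

```python
from urllib.parse import unquote

# Hypothetical URL-encoded input standing in for the stage 1 output
encoded = "%3Ciframe%20src%3D%22http%3A%2F%2Fexample.invalid%2F%22%3E%3C%2Fiframe%3E"

print(unquote(encoded))
# <iframe src="http://example.invalid/"></iframe>
```

One difference worth knowing: javascript’s unescape also understands %uXXXX sequences, which unquote leaves untouched, so samples that use them need an extra step.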

Finally, we get a human-readable version of the content.

6 final output

The “document.write” command would have written the decoded content to the browser which, as you can now see, would have created a 0x0 sized iframe calling a resource from teaserguide[.]com. If you haven’t already read Zscaler’s article showing what happens after that, I highly recommend that you do so now.
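When the decoded content is longer than a single tag, pulling out the iframe attributes programmatically saves some squinting. This is a minimal sketch using Python’s html.parser; the class and function names are made up for the example and the sample HTML uses a placeholder host rather than the real one.

```python
from html.parser import HTMLParser


class IframeFinder(HTMLParser):
    """Collect the attributes of every iframe tag in a document."""

    def __init__(self):
        super().__init__()
        self.iframes = []

    def handle_starttag(self, tag, attrs):
        if tag == "iframe":
            # attrs arrives as a list of (name, value) pairs
            self.iframes.append(dict(attrs))


def find_iframes(html: str) -> list:
    finder = IframeFinder()
    finder.feed(html)
    finder.close()
    return finder.iframes


# Hypothetical decoded content
sample = '<iframe src="http://example.invalid/" width="0" height="0"></iframe>'
print(find_iframes(sample))
```

The parser only reads the markup; nothing is fetched from the src address.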

This was a pretty low-level example; some malicious code has obfuscation three or more levels deep. Others use weird tricks in javascript to produce code that is mind-bendingly hard to pick apart such as the JJEncode technique (warning: you might need a stiff drink and a lie down after looking at that one – I did, and I take my hat off to whoever wrote the script to reverse it). I hope this was helpful to get you started.