Sooner or later we all get spam or phishing emails. For enterprises, phishing emails are the most common vector by which adversaries gain a foothold in the network. In the last 12 months, Microsoft reported a 250% increase in phishing email detections, and phishing that targeted SaaS and webmail services doubled in the previous quarter.
We’ve discussed elsewhere tactics for resisting phishing attempts, but in this post we’ll take a deeper look at how a phishing email works, revealing just how easily victims can give away their PayPal credentials with this kind of social engineering attack.
Behind the Link: HTML File Properties
For this walk-through, we’ll use a phishing email that I received recently through an alias that was set up to collect malware samples. Let’s start by seeing the file hash for the HTML file they sent me:
$: shasum PayPal_Document916.html
948fa2be822a9320f6f17599bc2066b2919ff255 PayPal_Document916.html
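If shasum is not available on your analysis box, the same digest can be reproduced in a few lines of Python. This is a minimal sketch: shasum defaults to SHA-1, which matches the 40-character hash above, and the helper name sha1_of_file is my own rather than anything from the sample.

```python
import hashlib


def sha1_of_file(path, chunk_size=65536):
    """Return the hex SHA-1 digest of a file, reading it in chunks."""
    digest = hashlib.sha1()
    with open(path, "rb") as handle:
        # Read in chunks so large samples never need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Usage (not executed here):
#   print(sha1_of_file("PayPal_Document916.html"))
```

The file is opened in binary mode and only hashed, so nothing in it is rendered or executed.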
Let’s take a look on VirusTotal and see what comes back.
So no one knows what this is. Let’s take a closer look at the file with the Detect-It-Easy tool:
From DIE we can confirm the following file properties:
- File type: Plain Text HTML
- Entropy: 6.056
- Packed: No
- File Size: ~34k
Great! The file isn’t packed and the entropy indicates that the contents are probably only lightly obfuscated. Let’s take a look inside.
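To get a feel for what an entropy figure means, here is a minimal sketch of Shannon entropy measured in bits per byte. I am assuming DIE reports its value on the same 0 to 8 scale, and its exact calculation may differ from this one. The function name shannon_entropy is hypothetical.

```python
import math
from collections import Counter


def shannon_entropy(data):
    """Shannon entropy of a bytes object, in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)  # how often each byte value occurs
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Usage (not executed here):
#   with open("PayPal_Document916.html", "rb") as handle:
#       print(round(shannon_entropy(handle.read()), 3))
```

As a rough rule of thumb, values close to 8 suggest compressed or encrypted content. Base64 text uses a 64-symbol alphabet and so tops out at about 6 bits per character, which means a value near 6 is at least consistent with a file dominated by a Base64 blob. Treat this as a heuristic only.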
Inside the Phish: Exploring the Content
Now it’s probably not ideal to just double-click the file to peek inside, since HTML files can run on any OS and we still do not know what OS this sample is targeting. It is important to view the contents, but we want to do so using a very basic text editor. Use what you like, but make sure that you open it as a regular text file so that nothing is executed.
As beautiful as this is, it is kind of hard to read. Fortunately there are things that we can do to make this easier to read, but before we make alterations we are going to remove all the HTML code tags:
<!DOCTYPE html><html><head><script>
and
</script></head><body></body></html>
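If you would rather not delete the wrapper tags by hand, the same step can be scripted. The sketch below assumes the sample keeps its JavaScript in inline script elements, as the tags above suggest. The regular expression and the name extract_inline_scripts are my own, and a regex is only a rough substitute for a real HTML parser.

```python
import re

# Capture everything between <script ...> and </script>, across newlines.
SCRIPT_RE = re.compile(r"<script\b[^>]*>(.*?)</script\s*>",
                       re.IGNORECASE | re.DOTALL)


def extract_inline_scripts(html):
    """Return the bodies of all non-empty inline <script> blocks."""
    return [body for body in SCRIPT_RE.findall(html) if body.strip()]


# Usage (not executed here):
#   with open("PayPal_Document916.html", encoding="utf-8", errors="replace") as f:
#       scripts = extract_inline_scripts(f.read())
```

Because the file is only read as text, this is as safe as opening it in a plain text editor.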
All that is left should be the JavaScript code, and this is where SublimeText has some nice features that can really help us out:
I will use a “beautifier” plugin to clean up the code and make it easier to read:
You can do this via the CLI in macOS/Linux but it is not as nice looking:
$: awk -v RS=';' -v ORS=';\n' 'NF' PayPal_Document916.html
I am going to save this reformatted version of the JavaScript to a second file called:
decoded_PayPal_Document916.js
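For readers who prefer Python to awk, the line-splitting step can be sketched as follows. Like the awk one-liner, it naively breaks the code at every semicolon, so semicolons inside string literals or for(;;) headers are split too. It is a reading aid, not a real JavaScript formatter, and split_statements is a hypothetical name.

```python
def split_statements(source):
    """Put each semicolon-terminated chunk on its own line, dropping empties."""
    parts = (part.strip() for part in source.split(";"))
    return "\n".join(part + ";" for part in parts if part)


# Usage (not executed here):
#   with open("PayPal_Document916.html", encoding="utf-8", errors="replace") as f:
#       pretty = split_statements(f.read())
#   with open("decoded_PayPal_Document916.js", "w", encoding="utf-8") as f:
#       f.write(pretty)
```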
Murky Waters: Decoding the Text
Let’s clean up some of the variables and start trying to understand what’s going on here.
First we know that:
var nxjCDAXFwFEX=
holds the raw Base64 code block (naturally assumed to be the embedded payload).
Let’s try to write this to a file and see what we get:
$: grep nxjCDAXFwFEX= decoded_PayPal_Document916.js|awk -F '"' '{print$6}'|base64 --decode >> payload
Note that this command is used to isolate the Base64 encoded string, but there are two important things to note about the line that contains our string:
1) At the start of the string:
return g=x.join(""),g.replace(/+$/,"")}var nxjCDAXFwFEX="
2) At the end of the string: ";
We need to remove everything before and after the Base64 encoded string in order to try and decode it. To do this we will use:
awk -F '"' '{print$6}'
By default, awk splits each line into fields on whitespace. The -F '"' switch changes the field separator to a double quote, which makes the Base64 string the sixth field ($6) on this line.
Now that we have an isolated string to work with, we can decode it. Keep in mind that it’s not always as easy as decoding the Base64 string. In this case, the decoded output looks odd when written to a file:
$: file payload
payload: data
A data file is not necessarily an indicator that a string failed to decode, but it does mean that there is a possibility the decode failed. Either way, we should review the code to look for logic that we may have overlooked.
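The extract-and-decode step can also be sketched in Python, which makes it easier to tell a failed decode from a successful decode of non-text data. The helper names are hypothetical, and the printable-ratio check is only a rough stand-in for what the file command does. The field number 6 relies on the same assumption about the line layout as the awk command.

```python
import base64
import binascii


def extract_quoted_field(line, field=6):
    """Mimic awk -F '"' '{print $N}': split on double quotes, 1-based field."""
    parts = line.split('"')
    return parts[field - 1] if len(parts) >= field else ""


def decode_base64(blob):
    """Return the decoded bytes, or None if the blob is not valid Base64."""
    try:
        return base64.b64decode(blob, validate=True)
    except (binascii.Error, ValueError):
        return None


def looks_like_text(data, threshold=0.9):
    """Rough check: is most of the decoded output printable ASCII?"""
    if not data:
        return False
    printable = sum(1 for b in data if 32 <= b < 127 or b in (9, 10, 13))
    return printable / len(data) >= threshold
```

If decode_base64 returns bytes but looks_like_text is False, the Base64 layer decoded cleanly and the payload is probably transformed again by the script before use. That would be one more reason to go back and read the code.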
A Clearer View: Isolating Variables
When renaming variables, everyone has their own methods. Personally, I start easy with the ones that are obvious. We already know that nxjCDAXFwFEX contains the Base64 code string, so I am going to change all occurrences of nxjCDAXFwFEX to raw_base64 and see what else we can find.
This can take a good deal of time. I have a lab that I use to run samples in without fear of infecting everyone in the office! So to speed things up, I simply copy decoded_PayPal_Document916.js to a VM. For this I am going to use Linux since nearly everything wants to kill Windows dead.
To further de-obfuscate the variables in the script code block, we can do a few things. It’s a bit tedious, but we can put print statements for each variable to see what output they give and then name them accordingly. For now, there are some obvious ones that we can replace:
So let’s do some renaming:
xoCisgpExGEs –> function_01
- This is the first function that we can see in the script: function xoCisgpExGEs(rr,oo)
sCmCMuMlIZJy –> function_02
- This is the second function that we see in the script: function sCmCMuMlIZJy(rr)
nxjCDAXFwFEX –> raw_base64
- The variable that only contains our base64 encoded string.
lSiYOlcTTfmR –> call_array
- The primary list of arguments.
TZGYADnjYnzp –> function_02_call
- This simply calls the second function in the script:
lSiYOlcTTfmR:
- lSiYOlcTTfmR[0] –> cyQvdxDbHhpBfpCX
- This is just the first value in the “lSiYOlcTTfmR” (renamed to “call_array”) array.
- lSiYOlcTTfmR[1] –> write
- This is just the second value in the “lSiYOlcTTfmR” (renamed to “call_array”) array.
Note: We will actually remove this whole variable since we know where these values are used.
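Renaming by hand in an editor works, but the mapping above can also be applied in one pass. This is a minimal sketch using whole-word matching. The RENAMES table simply restates the list above, and rename_identifiers is a hypothetical helper. It will also rewrite matches inside string literals, which is usually harmless with names as random as these.

```python
import re

# Obfuscated name -> readable name, taken from the renaming list above.
RENAMES = {
    "xoCisgpExGEs": "function_01",
    "sCmCMuMlIZJy": "function_02",
    "nxjCDAXFwFEX": "raw_base64",
    "lSiYOlcTTfmR": "call_array",
    "TZGYADnjYnzp": "function_02_call",
}


def rename_identifiers(source, renames=RENAMES):
    """Replace every whole-word occurrence of each obfuscated name."""
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(name) for name in renames) + r")\b"
    )
    return pattern.sub(lambda match: renames[match.group(1)], source)
```

Keeping the mapping in one table makes it easy to add entries as more variables are identified and to re-run the rename against the untouched original.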
In this case the very last line of the script is the execution statement, so we will simply comment it out and apply new code to dump the output to 2 files:
Post execution, it’s plain to see that the “function_02_call” still looks like gibberish, so we will ignore it for now. The “function_01_call”, however, looks like it gave us a lot of great new code to review:
The output file contained code that was all on a single line and yet again a plugin was used to beautify the code (get used to cleaning up code!).
Suspicious Domains
Right off the bat, since we already suspect this to be a phishing attack, it might be nice to see all the domains that are coded in the page.
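One way to pull out every hard-coded domain is to scan the decoded code for URLs and collect their hostnames. The sketch below rests on a few assumptions: the URL pattern is deliberately loose, the function names are my own, and the trusted list (paypal.com and paypalobjects.com) is only an illustrative allowlist that you should adjust for your own analysis.

```python
import re
from urllib.parse import urlsplit

# Loose URL matcher: stop at whitespace, quotes, angle brackets, ')' or '\'.
URL_RE = re.compile(r"""https?://[^\s"'<>)\\]+""", re.IGNORECASE)


def extract_hosts(source):
    """Return the sorted, de-duplicated hostnames of all URLs in the text."""
    hosts = set()
    for url in URL_RE.findall(source):
        try:
            host = urlsplit(url).hostname
        except ValueError:  # malformed URL, e.g. a broken IPv6 literal
            continue
        if host:
            hosts.add(host.lower())
    return sorted(hosts)


def is_trusted(host, trusted=("paypal.com", "paypalobjects.com")):
    """True if host is a trusted domain or a subdomain of one."""
    return any(host == d or host.endswith("." + d) for d in trusted)
```

Filtering with is_trusted leaves only the hosts that deserve a closer look. Matching on the dot-prefixed suffix matters here, because a lookalike such as paypal.com.evil.example would pass a plain substring check.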
The PayPal domains for the most part are not that interesting because those domains are known to be legit, and it would be hard (though not impossible) to spoof those. These look more interesting:
Let’s run some cURLs to check a few things:
While we’re cURLing things, we might as well download the PNG files and check their hash reputations:
$: shasum *.png
f18a83299a9dbf4905e27548c13c9ceb8fb5687d AM_mc_vs_ms_ae_UK.png
53b7e80a8a19959894af795969c2ff2e8589e4f0 bdg_secured_by_pp_2line.png
b311f639f1de20d7c70f321b90c71993aca60a44 pp-logo-200px.png
These files appear to have a low chance of being malicious:
Let’s focus on a domain that does not belong to PayPal. There’s a variable in the code that I came across that I want to take a closer look at, _0x78eb7f:
The logic is fairly simple: when the user clicks the submit button after entering their credit card information, the page script changes the destination to which the submitted data is sent.
In this case, rather than PayPal, the user’s information is sent to the attacker’s web server.
Conclusion
From this basic analysis we’ve identified exactly how the attack works and the domain of the attacker’s server. If it were still live, we would now be able to ensure it’s blocked across all our endpoints by using, for example, the SentinelOne Firewall Control.
While this phishing email is not as complex as a lot of attacks that we have read about and experienced to date, it is very easy to overlook. This type of attack can often be even more effective than more modern attacks. As security professionals, we can use this example as a small reminder that we should be educating our friends, family, and end users about questioning the validity of emails at all times.