A Study on ConfuserEx Control Flow Flattening Technique

ZYWU
Jan 16, 2019

Recently, I came across an info-stealing malware family called HawkEye, which is written in C#. In this post, I’m going to share the control flow obfuscation technique used in the HawkEye sample I have.

The sample and its analysis can be found here: https://cloudblogs.microsoft.com/microsoftsecure/2018/07/11/hawkeye-keylogger-reborn-v8-an-in-depth-campaign-analysis/

Here are some tools I’ve used for analyzing C# malware:

  • dnSpy: .NET debugger and assembly editor
  • de4dot: .NET deobfuscator and unpacker
  • dnlib: Reads and writes .NET assemblies and modules
  • IDA Pro

After going through the unpacking procedure, we can find two important PEs in the resources: the injector and the payload. After dumping the payload, it appears to be named “Reborn Stub” and to be obfuscated by ConfuserEx v1.0.0.

The module information of HawkEye’s payload in dnSpy view.

The code has been obfuscated by ConfuserEx’s SwitchMangler. The technique is also called control flow flattening.

The obfuscated (left) vs original (right) control flow.

I’m going to study how ConfuserEx implements this obfuscation. Note that the tool offers many different kinds of obfuscation, and control flow flattening is just one of them.

At a high level, the obfuscator performs these steps:

  1. Split the code trace into blocks
  2. Prepare the keys and arithmetic instructions for chaining blocks
  3. Install the chaining instructions into each block and perform different actions depending on the control flow type

The source code can be found here.

Split the code trace into blocks

First, the obfuscator splits the trace into small blocks in two situations:

  1. At the end of a basic block (after a branch, conditional branch, return, or throw)
LinkedList<Instruction[]> SpiltStatements(InstrBlock block, Trace trace, CFContext ctx)
{
    [...Eliminated]
    // Any instruction that ends a basic block forces a split.
    switch (instr.OpCode.FlowControl) {
        case FlowControl.Branch:
        case FlowControl.Cond_Branch:
        case FlowControl.Return:
        case FlowControl.Throw:
            shouldSpilt = true;
    [...Eliminated]
}

  2. A given intensity with a random factor

LinkedList<Instruction[]> SpiltStatements(InstrBlock block, Trace trace, CFContext ctx)
{
    [...Eliminated]
    // Split only where the evaluation stack is empty, either because a split
    // is required or because the random draw falls below the intensity.
    if ((instr.OpCode.OpCodeType != OpCodeType.Prefix && trace.AfterStack[instr.Offset] == 0 &&
         requiredInstr.Count == 0) &&
        (shouldSpilt || ctx.Intensity > ctx.Random.NextDouble())) {
        statements.AddLast(currentStatement.ToArray());
        currentStatement.Clear();
    [...Eliminated]
}

The reason for chopping up the instructions inside a basic block is to make the control flow look nasty (imagine each block holding only a few instructions).
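The two splitting rules can be sketched in isolation. This is a simplified model, not ConfuserEx's code: `Instr`, `Flow`, `Splitter.Split`, and `SplitDemo` are hypothetical names, the stack depth is supplied by hand instead of being computed by a trace, and the prefix-opcode and `requiredInstr` checks are left out.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-ins for dnlib's instruction and flow-control types.
public enum Flow { Next, Branch, CondBranch, Return, Throw }

public sealed class Instr
{
    public string Text;
    public Flow Flow;
    public int StackAfter;   // evaluation stack depth after this instruction

    public Instr(string text, Flow flow, int stackAfter)
    {
        Text = text; Flow = flow; StackAfter = stackAfter;
    }
}

public static class Splitter
{
    // Cuts the instruction list into statements. A cut is only allowed where the
    // stack is empty; it happens after a flow-control instruction, or at random
    // with probability `intensity`.
    public static List<List<Instr>> Split(IEnumerable<Instr> instrs, double intensity, Random rng)
    {
        var statements = new List<List<Instr>>();
        var current = new List<Instr>();
        foreach (var instr in instrs)
        {
            current.Add(instr);
            bool shouldSplit = instr.Flow != Flow.Next;
            if (instr.StackAfter == 0 && (shouldSplit || intensity > rng.NextDouble()))
            {
                statements.Add(current);
                current = new List<Instr>();
            }
        }
        if (current.Count > 0)
            statements.Add(current);
        return statements;
    }
}

public static class SplitDemo
{
    // Hand-written IL-like listing for: num = 2; WriteLine(num); return 0;
    public static List<Instr> Sample()
    {
        return new List<Instr>
        {
            new Instr("ldc.i4.2", Flow.Next, 1),
            new Instr("stloc.0", Flow.Next, 0),
            new Instr("ldloc.0", Flow.Next, 1),
            new Instr("call WriteLine", Flow.Next, 0),
            new Instr("ldc.i4.0", Flow.Next, 1),
            new Instr("ret", Flow.Return, 0),
        };
    }

    public static void Main()
    {
        // Intensity 0: only the flow-control rule fires, so everything stays in one statement.
        Console.WriteLine(Splitter.Split(Sample(), 0.0, new Random(1)).Count);   // 1
        // Intensity 1: every empty-stack point becomes a cut.
        Console.WriteLine(Splitter.Split(Sample(), 1.0, new Random(1)).Count);   // 3
    }
}
```

With an intensity between 0 and 1 the number of statements varies from run to run, which is what makes two obfuscated builds of the same method look different.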

Prepare the keys and arithmetic instructions for chaining the blocks

In this step, the obfuscator builds a control flow that first enters the switch block, which then runs a specific block. After a block finishes its instructions, it sets which block executes next and goes back to the switch block. For example:

// Original
int num = 2;
System.Console.WriteLine(num);
return 0;

// Flattened
int num = 0;
int v = 2;                      // state variable: which block runs next
while (true)
{
    switch (v)
    {
        case 1:
            System.Console.WriteLine(num);
            v = 3;
            break;
        case 2:
            num = 2;
            v = 1;
            break;
        case 3:
            return 0;
    }
}

ConfuserEx uses arithmetic computation to make the analysis more difficult. For example, the code above is compiled into something like:

private static int Main(string[] args)
{
    int result;
    for (;;)
    {
        IL_01:
        uint num = 2254568472u;
        for (;;)
        {
            uint num2;
            switch ((num2 = (num ^ 3037644272u)) % 6u)
            {
            case 1u:
                num = (num2 * 223771410u ^ 648192441u);
                continue;
            case 2u:
                num = (num2 * 909441097u ^ 4218481382u);
                continue;
            case 3u:
                goto IL_01;
            case 4u:
            {
                int value = 2;
                Console.WriteLine(value);
                num = (num2 * 2137906813u ^ 2826615287u);
                continue;
            }
            case 5u:
                result = 0;
                num = (num2 * 3850219098u ^ 1166622248u);
                continue;
            }
            return result;
        }
    }
}
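The dispatch order of the listing above can be checked without a debugger by replaying only the state arithmetic. The sketch below copies the constants from the decompiled `Main` and drops the real block bodies; `DispatchTrace` and `Trace` are names made up for this illustration.

```csharp
using System;
using System.Collections.Generic;

public static class DispatchTrace
{
    // Replays the switch selector of the decompiled Main above, using the same
    // constants, and records which case is selected at each step.
    public static List<uint> Trace()
    {
        var visited = new List<uint>();
        uint num = 2254568472u;
        while (true)
        {
            uint num2 = num ^ 3037644272u;
            uint selector = num2 % 6u;
            visited.Add(selector);
            unchecked   // the multiplications are meant to wrap around at 32 bits
            {
                switch (selector)
                {
                    case 1u: num = num2 * 223771410u ^ 648192441u; break;
                    case 2u: num = num2 * 909441097u ^ 4218481382u; break;
                    case 3u: num = 2254568472u; break;   // goto IL_01 resets the state
                    case 4u: num = num2 * 2137906813u ^ 2826615287u; break;
                    case 5u: num = num2 * 3850219098u ^ 1166622248u; break;
                    default: return visited;             // no matching case: leave the switch
                }
            }
        }
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(" -> ", Trace()));   // prints "4 -> 1 -> 5 -> 2 -> 0"
    }
}
```

The final 0 is the selector value that matches no case label, which is how the original code reaches `return result`. Case 3 is never selected on this path.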

Tracing it carefully, the control flow is case 4 -> case 1 -> case 5 -> case 2; the selector then becomes 0, which matches no case, so execution leaves the switch and returns. To build these chained blocks, the obfuscator uses the following code snippet to generate the keys.

var keyId = Enumerable.Range(0, statements.Count).ToArray();
ctx.Random.Shuffle(keyId);
var key = new int[keyId.Length];
for (i = 0; i < key.Length; i++) {
    var q = ctx.Random.NextInt32() & 0x7fffffff;
    key[i] = q - q % statements.Count + keyId[i];
}
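The point of `q - q % statements.Count + keyId[i]` is that the key looks random but its remainder modulo the number of blocks is exactly the case index. A minimal sketch of that property follows; it uses `System.Random` as a stand-in for `ctx.Random`, and `KeyDemo` and `MakeKeys` are hypothetical names.

```csharp
using System;

public static class KeyDemo
{
    // Mirrors the key generation above: every key is a random non-negative int
    // whose remainder modulo the block count is forced to the block's case index.
    public static int[] MakeKeys(int[] keyId, Random rng)
    {
        int count = keyId.Length;
        var key = new int[count];
        for (int i = 0; i < count; i++)
        {
            int q = rng.Next() & 0x7fffffff;      // stand-in for ctx.Random.NextInt32()
            key[i] = q - q % count + keyId[i];    // multiple of count, plus the case index
        }
        return key;
    }

    public static void Main()
    {
        int[] keyId = { 2, 0, 3, 1 };             // a shuffled 0..3, fixed here for clarity
        int[] key = MakeKeys(keyId, new Random(42));
        for (int i = 0; i < key.Length; i++)
        {
            // The remainder is taken unsigned, as in the `% 6u` of the decompiled switch.
            uint selector = (uint)key[i] % (uint)key.Length;
            Console.WriteLine("block " + i + ": key " + key[i] + " -> case " + selector);
        }
    }
}
```

Each printed case number equals the corresponding entry of `keyId`, while the keys themselves change with the random seed.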

Install the chaining instructions into each block and perform different actions depending on the control flow type

A block may end in one of three ways: an unconditional branch, a conditional branch, or a general (fall-through) instruction. Consider the following demonstration code, which contains all three:

using System;

public class Test
{
    static long plus(long a, long b)
    {
        return a + b;
    }

    static int Main(string[] args)
    {
        long num = Int64.Parse(args[0]);
        if (num != 0)
        {
            System.Console.WriteLine("arg not 0");
            if (num + 4 > 30) // conditional branch
            {
                System.Console.WriteLine("arg > 26");
                switch (num)
                {
                    case 1:
                        System.Console.WriteLine("num is 1");
                        break; // unconditional branch
                    case 2:
                        System.Console.WriteLine("num is 2");
                        break;
                }
            }
            else
            {
                plus(num, (long) 10);
            }
        }
        // Placed after the outer if so that every code path returns a value.
        System.Console.ReadKey();
        return 0;
    }
}

The first condition encountered is if(num != 0), which is a conditional branch. The comparison is performed, and the key value for the correct control flow is then selected.

This can be described as:

case 15:
{
    long num3;
    bool flag = num3 == 0;
    num = ((flag ? 0xA1EB95FD : 0xC7D152B6) ^ num2 * 0x7E488667);
    continue;
}

From the source code:

newStatement.Add(Instruction.Create(condBr, brKeyInstr));
newStatement.Add(nextKeyInstr);
newStatement.Add(Instruction.Create(OpCodes.Dup));
newStatement.Add(Instruction.Create(OpCodes.Br, pop));
newStatement.Add(brKeyInstr);
newStatement.Add(Instruction.Create(OpCodes.Dup));
newStatement.Add(pop);
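The conditional case can be modelled as picking one of two pre-computed constants and then cancelling the current key out of it. The sketch below assumes both constants have the form `(thisKey * r) ^ targetKey`, the same form the general-block snippet uses; `CondChainDemo`, `Next`, and the numeric keys are made up for the illustration.

```csharp
using System;

public static class CondChainDemo
{
    // Next state for a block that ends in a conditional branch. `local` is the
    // value the dispatcher stored for the current block. Both constants would be
    // baked in by the obfuscator; they are computed here only to show their shape.
    public static uint Next(uint local, bool flag, uint r,
                            uint thisKey, uint takenKey, uint fallKey)
    {
        unchecked   // 32-bit wrap-around is intended
        {
            uint takenConst = (thisKey * r) ^ takenKey;
            uint fallConst = (thisKey * r) ^ fallKey;
            // Same shape as: (flag ? A : B) ^ num2 * C
            return (flag ? takenConst : fallConst) ^ (local * r);
        }
    }

    public static void Main()
    {
        uint thisKey = 1000003u, takenKey = 2000005u, fallKey = 3000001u;
        uint r = 0x7E488667u;
        Console.WriteLine(Next(thisKey, true, r, thisKey, takenKey, fallKey));    // 2000005
        Console.WriteLine(Next(thisKey, false, r, thisKey, takenKey, fallKey));   // 3000001
    }
}
```

Because `local * r` equals `thisKey * r` whenever the block was reached through the dispatcher, the XOR leaves only the key of the chosen successor. A reader looking at the two constants alone cannot tell which blocks they lead to without also knowing the incoming state.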

For general blocks, the obfuscator installs the chaining operation:

newStatement.Add(Instruction.Create(OpCodes.Ldloc, local));
newStatement.Add(Instruction.CreateLdcI4(r));
newStatement.Add(Instruction.Create(OpCodes.Mul));
newStatement.Add(Instruction.Create(OpCodes.Ldc_I4, (thisKey * r) ^ targetKey));
newStatement.Add(Instruction.Create(OpCodes.Xor));

For example, in decompiled pseudocode:

case 3:
{
    long num3;
    bool flag = num3 + 4 < 30;
    num = (num2 * 0xB7206979 ^ 0x6879278B);
    continue;
}
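Putting the keys and the multiply-xor together, a whole chain of general blocks can be simulated in a few lines. This is a sketch under simplifying assumptions: the extra XOR that the decompiler shows in the switch header is omitted, the keys and multipliers are arbitrary hand-picked values, and `ChainDemo` and `Run` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public static class ChainDemo
{
    // Runs a flattened chain of blocks. key[i] is the state value for block i,
    // and key[i] % key.Length is its switch case. Block i hands over to block
    // i + 1 with the multiply-xor from the snippet above.
    public static List<uint> Run(uint[] key, uint[] r)
    {
        uint n = (uint)key.Length;
        var cases = new List<uint>();
        uint state = key[0];                       // the entry code loads the first key
        for (int i = 0; i < key.Length; i++)
        {
            cases.Add(state % n);                  // what the switch would dispatch on
            if (i + 1 == key.Length) break;        // the last block returns instead of chaining
            unchecked
            {
                uint constant = (key[i] * r[i]) ^ key[i + 1];   // baked in at obfuscation time
                state = (state * r[i]) ^ constant;              // executed at run time
            }
        }
        return cases;
    }

    public static void Main()
    {
        uint[] key = { 602u, 1200u, 33u, 7u, 4001u, 94u };   // remainders mod 6: 2, 0, 3, 1, 5, 4
        uint[] r = { 0xB7206979u, 0x7E488667u, 3u, 12345u, 0xDEADBEEFu };
        Console.WriteLine(string.Join(", ", Run(key, r)));   // prints "2, 0, 3, 1, 5, 4"
    }
}
```

The multipliers in `r` never influence which case runs next; they only change what the constants look like, which is why each block can use a fresh random one.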

Among the three types of control flow, the obfuscator always checks whether a block has unknown references. It treats three cases as an unknown reference/source: (1) the block is the target of a switch, (2) the reference does not come from within the current instruction block, and (3) the reference is not the last of the statements. If a block has unknown references/sources, execution may arrive from a so-called unknown trace, and the obfuscator cannot know the value that the control flow arithmetic would be applied to. So whenever this situation exists, the obfuscator simply loads the value onto the stack and jumps back to the switch block.

It is always fun to study obfuscation techniques. See you next time.
