EMOTET is a crimeware loader. Its affiliates include TrickBot, Zeus Panda, IcedID, and others. US-CERT published an alert on the malware in July 2018 [1].
In this post, I’ll focus on automatically decrypting EMOTET’s strings with IDAPython. Most malware encrypts its strings to hinder static analysis, and EMOTET is no exception.
After unpacking the sample and loading it into IDA Pro, we see many similar code snippets across many functions, like the following:
push 0A7DD855
mov ecx, offset unk_411FC0
call sub_401D10
Looking into sub_401D10 suggests that it is a decryption function. At its core is a simple XOR routine:

The function is used to decrypt all the strings, so there are a lot of cross-references to it.

I’m too lazy to step through the sample in a debugger to recover the decrypted strings dynamically, so I’m going to reverse the decryption routine and let IDAPython do the job.
A generic, simple decryption routine needs a few things: a ciphertext, a key, and maybe a size. As an educated guess, the pointer into the .data section is probably the ciphertext, the odd-looking constant pushed onto the stack (for example, .text:0x401058 push 46DA9F26h) is probably the key, and the remaining argument is the size (0x14).

Reversing the function shows that the resource strings (the ciphertext) are stored in the following format:
struct resource_strings {
    unsigned int size;
    DWORD buf[];
};
The decryption proceeds block by block, with one DWORD per block. The real string size is embedded, XORed with the key, in the first element of the resource_strings data structure, so the size argument is actually useless.
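def decrypt_function(ciphertext, key, size):
    # Pseudocode: ciphertext is a resource_strings structure
    real_size = ciphertext.size ^ key    # real length is stored XORed with the key
    block_num = (4*((size>>2)-1)+3)>>2   # number of DWORD blocks
    for b in range(block_num):
        # XOR each DWORD block with the key, in place
        ciphertext.buf[4*b:4*(b+1)] ^= key
    ciphertext.buf[real_size] = "\x00"   # NUL-terminate the plaintext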
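To sanity-check the layout outside IDA, here is a minimal, self-contained Python sketch of the same idea working on raw bytes. It is only a sketch under a few assumptions that are mine, not confirmed by the sample: the structure is little-endian, the block count can be derived from the embedded size instead of the size argument, and any further post-processing the real routine may do is ignored. decrypt_string and encrypt_string are hypothetical names; encrypt_string exists only to build test data, and the key is the constant from the first snippet.

```python
import struct

def decrypt_string(blob, key):
    """Decrypt a resource_strings blob: a DWORD size followed by DWORD blocks.

    Assumption: little-endian layout, every DWORD (size included) XORed with key.
    """
    real_size = struct.unpack_from('<I', blob, 0)[0] ^ key
    block_num = (real_size + 3) // 4          # DWORD blocks, rounded up
    out = bytearray()
    for b in range(block_num):
        block = struct.unpack_from('<I', blob, 4 + 4 * b)[0]
        out += struct.pack('<I', block ^ key)
    return bytes(out[:real_size])             # drop the padding bytes

def encrypt_string(plaintext, key):
    """Hypothetical inverse, used only to build test data for this sketch."""
    padded = plaintext + b'\x00' * (-len(plaintext) % 4)
    blob = struct.pack('<I', len(plaintext) ^ key)
    for i in range(0, len(padded), 4):
        block = struct.unpack_from('<I', padded, i)[0]
        blob += struct.pack('<I', block ^ key)
    return blob

key = 0x0A7DD855
blob = encrypt_string(b'example string', key)
print(decrypt_string(blob, key))
```

Because XOR is its own inverse, encrypting and then decrypting with the same key should give back the original bytes, which makes the round trip a convenient check before pointing the logic at real data.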
So, let’s build our IDAPython script. It needs to do three things: first, identify the decryption function; second, decrypt the strings; finally, name the decryption function and add each decrypted string as a comment wherever the code references the function.
To identify the decryption function, I use a byte signature taken from the core decryption routine.
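from idc import *
from idaapi import get_func

def find_decrypt_function():
    '''
    decryption routine
        inc edi
        mov [ecx-8], ax
        mov eax, edx
        shr eax, 8 ...
    '''
    pattern = '47 66 89 41 F8 8B C2 C1 E8 08 0F B6 C0 66 89 41 FA C1 EA 10'
    # search the whole database, starting from its lowest address
    min_ea = get_inf_attr(INF_MIN_EA)
    decrypt_function_ea = find_binary(min_ea, SEARCH_DOWN, pattern)
    # move from the match to the start of the enclosing function
    # (start_ea is spelled startEA in older IDAPython versions)
    decrypt_function_ea = get_func(decrypt_function_ea).start_ea
    # decrypt_function_name is a global string defined elsewhere in the script
    set_name(decrypt_function_ea, decrypt_function_name, SN_CHECK)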
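For readers without IDA at hand, the idea behind the signature search can be sketched in plain Python: convert the hex pattern to bytes and look for it in a buffer. This only illustrates the concept and is not how find_binary is implemented; find_signature and the data buffer are hypothetical names made up for this example.

```python
def find_signature(data, pattern, start=0):
    """Return the offset of the first match of a space-separated hex pattern, or -1."""
    needle = bytes.fromhex(pattern)   # spaces between hex pairs are ignored
    return data.find(needle, start)

pattern = '47 66 89 41 F8 8B C2 C1 E8 08 0F B6 C0 66 89 41 FA C1 EA 10'
# Hypothetical buffer: 16 filler bytes, then the signature, then one more byte
data = b'\x90' * 16 + bytes.fromhex(pattern) + b'\xc3'
print(hex(find_signature(data, pattern)))
```

A longer signature is less likely to match unrelated code, but also more likely to break when the routine is recompiled, so the pattern length is a trade-off.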
Then, get all code references to the function. Starting from each referencing address, I step backward through the instructions to find the correct arguments.
from idautils import CodeRefsTo

def get_ciphertext(ea):
    # mov ecx, offset <ciphertext>
    if print_insn_mnem(ea) == 'mov' and\
            'ecx' in print_operand(ea, 0):
        return get_operand_value(ea, 1)

def get_key(ea):
    # push <imm32>; operand type 5 is an immediate value
    if print_insn_mnem(ea) == 'push' and\
            get_operand_type(ea, 0) == 5:
        return get_operand_value(ea, 0)

def get_size(ea):
    # Actually the size argument does not matter
    pass

for ea in CodeRefsTo(get_name_ea_simple(decrypt_function_name), 0):
    ciphertext = None
    key = None
    # walk backward from the call, at most 0x100 instructions
    for _ in range(0, 0x100):
        ea = prev_head(ea)
        if not ciphertext:
            ciphertext = get_ciphertext(ea)
        if not key:
            key = get_key(ea)
        if key and ciphertext:
            break
    # either argument may still be None if it was never found
    if not key or not ciphertext:
        print("Something wrong!!!")
        continue
Alright, that’s all. Now I can use the decrypted strings in IDA Pro to keep doing static analysis.

The script: https://gist.github.com/levwu/41a1fc608147e01ff620c3ff71b753d8