Sunday, April 13, 2025

XZ backdoor: hiding the key in opcodes

The XZ backdoor uses many interesting techniques, some already known and some new. While reading Kaspersky's analysis, I was intrigued by the steganographic function that reconstructs the author's public key (required for signature verification) from the compiled code. The set of instructions in which the key is hidden (ADD, OR, ADC, SBB, AND, SUB, XOR, CMP and MOV) looked familiar to me. The Mistfall viruses (and, as a tribute to Mistfall, Lacrimae) use invariants of these opcodes for mutation.

Let's try to store something in the opcodes.

Bit 1 of the opcode (the second-lowest bit) in this group of instructions is called the direction bit, or D-bit. In a register-to-register instruction (mod = 11 in the ModR/M byte), flipping the D-bit swaps the roles of the reg and r/m fields, so the same instruction has two fully equivalent encodings:

09 c1	or     %eax,%ecx # D=0 reg = 0 (eax), rm = 1 (ecx)
0b c8	or     %eax,%ecx # D=1 reg = 1 (ecx), rm = 0 (eax)

The same pairing is visible in an x86 opcode table, where each operation is listed with both operand orders (E is the r/m operand, G is the reg operand):

/* MAIN TABLE */
opcode_t main_table[512] = {
// ...
/* 08 */
{ "OR",         Eb,Gb,__ },
{ "OR",         Ev,Gv,__ },
{ "OR",         Gb,Eb,__ },
{ "OR",         Gv,Ev,__ },
{ "OR",         AL,Ib,__ },
{ "OR",         rAX,Iz,__ },
{ "PUSH",       CS,__,__ },
{ "(bad)",      __,__,__ },
// ...
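
To make the equivalence concrete, here is a minimal sketch of the rewrite for a 2-byte reg/reg instruction. The helper name `flip_direction` is hypothetical (it is not taken from the original source), and the sketch assumes the caller has already checked that the opcode belongs to the group above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only: rewrite a 2-byte reg/reg
   instruction into its twin encoding by toggling the D-bit and swapping
   the reg and r/m fields of the ModR/M byte. */
static void flip_direction(uint8_t insn[2])
{
	uint8_t reg = (insn[1] >> 3) & 7;
	uint8_t rm  = insn[1] & 7;

	/* only valid for register-to-register forms (mod = 11) */
	assert((insn[1] >> 6) == 3);

	insn[0] ^= 0x02;                              /* toggle D (bit 1 of the opcode) */
	insn[1] = (uint8_t)(0xC0 | (rm << 3) | reg);  /* mod = 11, fields swapped */
}
```

Applied to `09 c1` it yields `0b c8`, and applying it a second time restores `09 c1`.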

So, all we need to do is find the code section, disassemble it, locate the instructions by a mask, compare the bit we want to store with the second bit of the opcode, and, if they differ, flip the D-bit and swap the reg and rm fields in the ModR/M byte:

for (Elf64_Addr i = init; i < fini; i += len) {
	hde64s hs;
	len = hde64_disasm(m + i, &hs);

	/* reg/reg 2-byte opcodes */
	if (hs.modrm_mod != 3 || len != 2)
		continue;

	/* ALU and MOV (w = operand-size bit)
	00ttt00w 11xxxyyy	; ttt r1,r2 (ADD,OR,ADC,SBB,AND,SUB,XOR,CMP)
	00ttt01w 11yyyxxx
	1000100w 11xxxyyy	; mov r1,r2
	1000101w 11yyyxxx	*/
	if ((m[i] & 0xC4) == 0x00 || (m[i] & 0xFC) == 0x88) {
		unsigned b = (m[i] >> 1) & 1;
		if (argc > 1) {
			/* set bit in insn */
			unsigned w = (argv[1][count / 8] >> (count % 8)) & 1;
			if (w != b) {
				/* toggle D-bit, swap reg and r/m */
				m[i] ^= 2;
				m[i + 1] = 0xc0 |
					(hs.modrm_rm << 3) | hs.modrm_reg;
			}
		} else {
			/* read bit from insn */
			buf[count / 8] |= b << (count % 8);
		}
		count++;
	}
}
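
The mask test and the bit order can also be checked in isolation. The sketch below is only an illustration under the same assumptions as the loop above. The names `is_carrier_opcode`, `get_bit`, `put_bit` and `count_carrier_opcodes` are hypothetical and do not come from the original source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Same mask test as in the loop above: ALU ops 00ttt0dw and MOV 100010dw. */
static int is_carrier_opcode(uint8_t op)
{
	return (op & 0xC4) == 0x00 || (op & 0xFC) == 0x88;
}

/* Bit n of the message, least significant bit of each byte first. */
static unsigned get_bit(const uint8_t *msg, size_t n)
{
	return (msg[n / 8] >> (n % 8)) & 1;
}

/* Store bit n into a zero-initialized buffer, in the same order as get_bit. */
static void put_bit(uint8_t *buf, size_t n, unsigned bit)
{
	assert(bit <= 1);
	buf[n / 8] |= (uint8_t)(bit << (n % 8));
}

/* Count how many of the 256 one-byte opcode values pass the mask. */
static unsigned count_carrier_opcodes(void)
{
	unsigned n = 0;

	for (unsigned op = 0; op < 256; op++)
		n += is_carrier_opcode((uint8_t)op) ? 1 : 0;
	return n;
}
```

The mask accepts 36 opcode values: 32 ALU forms (bits 7, 6 and 2 clear) plus the four MOV forms 88 to 8b. Each matching reg/reg instruction carries one bit, so the capacity in bytes is the instruction count divided by 8, rounded down.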

Let's try it. The program reads its own code, embeds a message from the command line into it, saves a copy of itself, and then runs this copy without any parameters, which extracts the message from the opcodes:

$ gcc ops.c -o ops # full source on github
$ ./ops SECRET
66 bits (8 bytes)
Running modified executable
66 bits (8 bytes)
'SECRET'
$ objdump -d ./ops > 1
$ objdump -d ./read > 2
$ diff 1 2 | grep '^[<>]' | head
< ./ops:     file format elf64-x86-64
> ./read:     file format elf64-x86-64
<     11e4:	31 ed                	xor    %ebp,%ebp
>     11e4:	33 ed                	xor    %ebp,%ebp
<     1475:	89 c2                	mov    %eax,%edx
>     1475:	8b d0                	mov    %eax,%edx
<     14b9:	89 c2                	mov    %eax,%edx
>     14b9:	8b d0                	mov    %eax,%edx
<     150a:	89 c2                	mov    %eax,%edx
>     150a:	8b d0                	mov    %eax,%edx

2 comments:

  1. Some random notes on the backdoor. One could use __builtin_return_address(0) from the ifunc constructor to get the text of ld-linux, with no need to import anything.

    Replies
    1. The import stuff is terribly bad, with nested range checks, trie traversals, a loop over the gnuhash chains without the hash, and doubtful verdef/verneed fuckery; if you got your stuff from link_maps, it ought to be valid.
