A long time ago, I read an article by Z0mbie about disassembling and bit masks. Since many instruction encodings have regularities, instead of looking them up in a table, they can be replaced with a simple logical formula. For example, before disassembling the instruction code, we need to check the segment register prefixes (this refers to 32-bit code). Their codes are 26, 2E, 36, 3E, 64, 65. Z0mbie pointed out that six comparisons can be replaced with two comparisons using a bit mask:
and al, 11111110 ; 64/65
cmp al, 01100100
je __prefix_seg
...
and al, 11100111b ; 26/2E/36/3E
cmp al, 00100110b
je __prefix_seg
Is it possible to completely get rid of the tables and replace them with a logical formula?
First, let's generate a truth table for the flags, with eight inputs and one output, and run it through the Espresso optimization program:
#include <stdio.h>
int main(int argc, char **argv)
{
printf(".i 8\n.o 1\n");
for (unsigned i = 0; i < 256; i++) {
for (int j = 7; j >= 0; j--)
printf("%d", ((i >> j) & 1) ? 1 : 0);
printf("|%d\n", (i == 0x26 || i == 0x2e || i == 0x36 ||
i == 0x3e || i == 0x64 || i == 0x65) ? 1 : 0);
}
}
/*
$ gcc m0.c -o m0
$ ./m0 | ./espresso
.i 8
.o 1
.p 2
0110010- 1
001--110 1
.e
*/
Let's convert the resulting masks into hex
#include <stdio.h>
#include <stdlib.h>
int main(int argc, char **argv)
{
char s[32];
int f = 0;
while (gets(s)) {
if (s[0] != '-' && s[0] != '1' && s[0] != '0')
continue;
if (f == 0)
f = 1;
else
printf(" ||\n");
unsigned m = 0, v = 0;
for (int i = 0; i < 8; i++) {
m <<= 1;
v <<= 1;
if (s[i] != '-') {
m |= 1;
v |= (s[i] - '0') & 1;
}
}
printf("(i & 0x%02x) == 0x%02x", m, v);
}
puts("");
}
/*
./m0 | ./espresso | ./m2
(i & 0xfe) == 0x64 ||
(i & 0xe7) == 0x26
*/
Now, let's return to replacing the tables with formulas and do the same for the MODRM flag. I took the table from my LDE YAD.
/* MAIN TABLE */
opcode_t main_table[512] = {
/* 00 */
{ "ADD", Eb,Gb,__ },
{ "ADD", Ev,Gv,__ },
{ "ADD", Gb,Eb,__ },
{ "ADD", Gv,Ev,__ },
{ "ADD", AL,Ib,__ },
// ...
int arg(uint32_t type)
{
// ...
switch (a) {
case OP_M:
case OP_R: case OP_PR: case OP_VR:
case OP_C: case OP_D: case OP_G:
case OP_P: case OP_S: case OP_V:.
case OP_E: case OP_Q: case OP_W:
flags |= C_MODRM; aa++;
break;
// ...
int main(int argc, char **argv)
{
printf(".i 8\n.o 1\n");
for (int i = 0; i < 256; i++) {
unsigned f = arg(main_table[i].op1) |
arg(main_table[i].op2) | arg(main_table[i].op3);
for (int j = 7; j >= 0; j--)
printf("%d", ((i >> j) & 1) ? 1 : 0);
printf("|%d\n", f & C_MODRM ? 1 : 0);
}
}
$ ./m1 | ./espresso | ./m2 (i & 0xba) == 0x80 || (i & 0xf8) == 0xd8 || (i & 0xbc) == 0x84 || (i & 0xbd) == 0x29 || (i & 0xbe) == 0x22 || (i & 0xc4) == 0x00 || (i & 0xf1) == 0x80 || (i & 0xf2) == 0x80 || (i & 0xf4) == 0xd0 || (i & 0x74) == 0x00
And that's it, what were you expecting?
No comments:
Post a Comment