Friday, February 20, 2026

Not so perfect hash

The API hashing technique is used in malware so often that it has earned its own MITRE technique number. Yet all we really need to do is find a known subset of strings within a known set. Ideally, the stored data should look "decent" - unlike the familiar four-byte hashes - meaning it should consist of small numbers.
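
For context, those four-byte hashes are typically produced by something like the classic ROR-13 string hash popularized by shellcode loaders; a minimal sketch (the exact rotation and mixing vary from sample to sample, and nothing here is used by the code below):

        #include <stdint.h>

        /* Rotate-right-by-13 hash over the export name: the import is later
         * identified by this 32-bit value instead of by the string itself. */
        static uint32_t ror13_hash(const char *name)
        {
                uint32_t h = 0;
                while (*name) {
                        h = (h >> 13) | (h << 19);   /* ror h, 13 */
                        h += (uint8_t)*name++;
                }
                return h;
        }

Four high-entropy bytes per import are exactly the kind of artifact that stands out, which is the point of preferring small, bland numbers instead.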

The first idea that came to mind was to take the list of all glibc exports and build a minimal perfect hash - that is, a hash function that maps n values to the numbers from 0 to n-1. One of the simplest and most efficient approaches is CHD (Compress, Hash, and Displace). First, we distribute the keys into buckets using one hash $h_1$, and then we pick a second hash $h_2$ and, for each bucket, an offset $d$ such that $(h_2(x) + d) \bmod n$ produces values that have not been taken yet, roughly like this:

        /* Place buckets from largest to smallest: the big ones are the
         * hardest to fit, so try them while the table is still empty. */
        qsort(buckets, m, sizeof(struct bucket), cmp_bucket);

        /* Slots tentatively claimed by the bucket currently being placed. */
        uint32_t temp_used[32];

        int placed;
        for (uint32_t i = 0; i < m; i++) {
                struct bucket *bk = &buckets[i];
                if (bk->size < 1)
                        continue;

                placed = 0;
                /* Try displacements until every key of the bucket lands in
                 * a free slot. */
                for (uint32_t d = 0; d < n; d++) {
                        int ok = 1;

                        for (uint32_t j = 0; j < bk->size; j++) {
                                uint32_t k = bk->keys[j];
                                uint32_t h = (hashes[k] + d) % n;
                                /* Slot already taken by an earlier bucket? */
                                if (used[h] != 0) {
                                        ok = 0;
                                        break;
                                }
                                /* Collision inside the current bucket? */
                                for (uint32_t t = 0; t < j; t++)
                                        if (temp_used[t] == h) {
                                                ok = 0;
                                                break;
                                        }
                                if (!ok)
                                        break;
                                temp_used[j] = h;
                        }

                        if (ok) {
                                /* Commit: record the displacement and mark
                                 * the slots as occupied. */
                                disp[bk->index] = d;
                                for (uint32_t j = 0; j < bk->size; j++)
                                        used[temp_used[j]]++;
                                placed = 1;
                                break;
                        }
                }
                /* No displacement worked for this bucket - give up. */
                if (placed == 0)
                        break;
        }
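
The part that comes before this loop is not shown above: it just distributes every key into one of m buckets with $h_1$ and precomputes $h_2$ for each key; the qsort() at the top then puts the largest buckets first because they are the hardest to place. A rough sketch of that setup, where the struct layout and the hash1()/hash2() helper names are assumptions rather than the actual nspf.c code:

        #define MAX_BUCKET 32

        struct bucket {
                uint32_t index;              /* bucket id = h1(key) % m    */
                uint32_t size;               /* number of keys inside      */
                uint32_t keys[MAX_BUCKET];   /* indices into the key list  */
        };

        /* Largest buckets first, for the qsort() above. */
        static int cmp_bucket(const void *a, const void *b)
        {
                const struct bucket *x = a, *y = b;
                return (int)y->size - (int)x->size;
        }

        /* Assumes buckets[] is zero-initialized. */
        static void fill_buckets(char **keys, uint32_t n,
                                 struct bucket *buckets, uint32_t m,
                                 uint32_t *hashes)
        {
                /* h1 picks the bucket, h2 is precomputed so the displacement
                 * search only has to add d and reduce modulo n. */
                for (uint32_t k = 0; k < n; k++) {
                        uint32_t b = hash1(keys[k]) % m;
                        buckets[b].index = b;
                        buckets[b].keys[buckets[b].size++] = k;
                        hashes[k] = hash2(keys[k]);
                }
        }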

The resulting values can be compressed efficiently with Elias-Fano encoding. But we don't need hashes for the entire dictionary - only for a small subset of it - so we can keep just the buckets that contain the values we care about. The first hash essentially turns into a Bloom filter (with k = 1), and since filtering leaves far fewer keys, we can brute-force $h_2$ until we get a perfect hash without any offsets at all (for the entire dictionary such a brute force is infeasible, which is why CHD needs displacements in the first place). Like this:

$ gcc nspf.c -O3 -o nspf # source on github
$ ./nspf
Read 2727 keys, 681 buckets, 16 needed
97: open, __libc_ns_makecanon, closelog,
151: fork, __assert_fail, __nss_configure_lookup, __res_context_query, pkey_get,
216: mmap, __cyg_profile_func_enter,
256: read, _nss_files_parse_servent, pthread_rwlock_tryrdlock, stderr,
287: wait, __libc_dn_expand, __pthread_mutex_lock, _obstack_newchunk, _thread_db_pthread_nextevent, svc_run,
291: free, _thread_db_pthread_cancelhandling, pthread_attr_getaffinity_np,
318: accept, _nss_files_gethostbyname_r, authunix_create, gettimeofday, pthread_attr_getschedpolicy, sgetsgent_r,
374: connect, __isoc99_vfscanf, _nss_files_gethostbyname3_r, getcwd,
424: bind, sched_setaffinity,
441: socket, __call_tls_dtors, __fxstat, _IO_feof, _nss_files_setsgent, pthread_mutex_clocklock,
457: dup2, __ns_name_ntop, _IO_do_write, argz_add, lgetxattr, mq_open, strfromf32,
566: write, __res_nquerydomain, isastream, wcstof32,
576: listen, timer_getoverrun, vscanf, wcsstr,
585: malloc, __isascii_l, __res_mailok,
615: close, __wcsncat_chk, locs, xdr_int16_t,
626: execve, __getdelim, endrpcent, fstatvfs64, iscntrl_l, ispunct_l, moncontrol, setpwent,
71 keys in 16 buckets
Found mod = 218 after 717 attempts
Filter:
97, 54, 65, 40, 31, 4, 27, 56, 50, 17, 16, 109, 10, 9, 30, 11,
Unpacking...
97, 151, 216, 256, 287, 291, 318, 374, 424, 441, 457, 566, 576, 585, 615, 626,
Checking...
129 accept
152 bind
37 close
15 connect
32 dup2
155 execve
93 fork
121 free
94 listen
159 malloc
172 mmap
117 open
9 read
160 socket
62 wait
26 write
h1 = 726f3963, h2 = 042e7e3f

So: 16 bytes of auxiliary data plus 16 values for the functions we need - all stored as single bytes - with an acceptable probability of collisions. And the hash functions themselves, as well as the values they produce, are random.
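
The lookup side stays just as small. Here is a sketch of how a resolver could use those two byte arrays; the constants (681 buckets, modulus 218, the seeds and the filter/value bytes) are taken from the run above, while str_hash() is a hypothetical stand-in for whatever seeded string hash nspf.c actually uses:

        #include <stdint.h>

        /* Hypothetical stand-in for the seeded string hash used by nspf.c. */
        extern uint32_t str_hash(uint32_t seed, const char *name);

        #define NBUCKETS 681   /* buckets over the full export list */
        #define MOD      218   /* brute-forced modulus for h2       */
        #define NWANTED   16

        /* Delta-encoded ids of the buckets holding the wanted functions. */
        static const uint8_t filter[NWANTED] = {
                97, 54, 65, 40, 31, 4, 27, 56, 50, 17, 16, 109, 10, 9, 30, 11,
        };

        /* h2 % MOD per wanted function: accept, bind, close, connect, dup2,
         * execve, fork, free, listen, malloc, mmap, open, read, socket,
         * wait, write (the "Checking..." values above). */
        static const uint8_t value[NWANTED] = {
                129, 152, 37, 15, 32, 155, 93, 121, 94, 159, 172, 117, 9,
                160, 62, 26,
        };

        /* Returns the wanted-function index for an export name, or -1. */
        static int lookup(const char *name)
        {
                uint32_t bucket = str_hash(0x726f3963, name) % NBUCKETS;
                uint32_t id = 0, i;

                /* Prefix sums of the filter restore the bucket ids; acting
                 * as a k = 1 Bloom filter, they reject most names outright. */
                for (i = 0; i < NWANTED; i++) {
                        id += filter[i];
                        if (id == bucket)
                                break;
                }
                if (i == NWANTED)
                        return -1;

                uint8_t v = str_hash(0x042e7e3f, name) % MOD;
                for (i = 0; i < NWANTED; i++)
                        if (value[i] == v)
                                return i;
                return -1;
        }

One pass over the export table with lookup() is then enough to fill the 16-entry function table.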
