The Modexp blog has a great collection of compression algorithms from 8-bit computers, demos, and viruses. I noticed that most of them are variations on the Lempel-Ziv theme. This raised a few questions for me: Is it possible to make compression "polymorphic" so that it would be impossible to create a signature for the compressed data itself? And another question: Can the same algorithms be used for the opposite task — entropy normalization? (Compressed text has high entropy, and antivirus software often uses entropy as an indicator that the file is compressed and requires deeper analysis).
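For reference, the entropy measure in question is just the Shannon entropy of the byte histogram; a minimal sketch of the computation (mine, not from either blog) in C:

```c
/* A minimal sketch (not from the Modexp post or this blog): Shannon entropy
 * of a byte buffer, the metric AV engines commonly use to flag packed or
 * compressed sections. */
#include <math.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

static double shannon_entropy(const uint8_t *buf, size_t len)
{
    size_t count[256] = {0};
    for (size_t i = 0; i < len; i++)
        count[buf[i]]++;

    double h = 0.0;                          /* result in bits per byte, 0.0 .. 8.0 */
    for (int b = 0; b < 256; b++) {
        if (!count[b])
            continue;
        double p = (double)count[b] / (double)len;
        h -= p * log2(p);
    }
    return h;
}

int main(void)
{
    const char *text = "the quick brown fox jumps over the lazy dog";
    printf("entropy: %.2f bits/byte\n",
           shannon_entropy((const uint8_t *)text, strlen(text)));
    return 0;
}
```

Plain text typically lands around 4 to 5 bits per byte, while compressed or encrypted data approaches 8, which is exactly what makes it stand out to a scanner.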
herm1t
Wednesday, April 2, 2025
Sunday, March 30, 2025
Random hashes and bloom filters
To do something useful, a virus or a beacon needs system functions. The obvious way is simply to keep a list of the required function names, which also makes life easier for antivirus and reverse engineers. To keep those names from appearing as plain strings in the code, each name can be replaced with its hash. The first person to come up with this technique seems to have been StarZero of the IKX group: in his virus Voodoo (autumn 1998), he used CRC32 to look up the required APIs. More than twenty-five years later, it is still used in products such as Cobalt Strike and Metasploit.
But why CRC32? In 2004, Z0mbie suggested generating hash functions randomly, using the template H = ROL(H, x) + C. Since the full list of function names is known in advance, a candidate hash function can be checked for collisions. A hash of exactly this kind (ROR-13/ADD) is used in Cobalt Strike, and merely changing the constant in the ROR instruction significantly reduces the number of detections.
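As a rough illustration (my sketch, not code from Voodoo, Cobalt Strike, or Z0mbie's paper), here is the classic ROR/ADD per-byte step with the rotation count as a parameter, plus the collision check that lets a randomly chosen rotation be validated against the known export names; the names in the list are just hypothetical stand-ins:

```c
/* A minimal sketch: ROR/ADD API hash with a parameterized rotation count,
 * plus a collision check over a list of export names so that a randomly
 * chosen rotation can be accepted or rejected. */
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Per-byte step: H = ROR(H, r) + byte; r = 13 gives the well-known ROR-13/ADD. */
static uint32_t api_hash(const char *name, unsigned r)
{
    uint32_t h = 0;
    for (const unsigned char *p = (const unsigned char *)name; *p; p++)
        h = ((h >> r) | (h << (32 - r))) + *p;
    return h;
}

/* With the full export list known in advance, any candidate rotation can be
 * rejected if it produces colliding hashes. */
static int has_collision(const char **names, size_t n, unsigned r)
{
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++)
            if (api_hash(names[i], r) == api_hash(names[j], r))
                return 1;
    return 0;
}

int main(void)
{
    /* Hypothetical stand-in for the exports the loader actually needs. */
    const char *names[] = { "LoadLibraryA", "GetProcAddress",
                            "VirtualAlloc", "CreateThread" };
    size_t n = sizeof(names) / sizeof(names[0]);

    srand((unsigned)time(NULL));
    unsigned r;
    do
        r = 1 + rand() % 31;          /* random rotation, never 0 or 32 */
    while (has_collision(names, n, r));

    for (size_t i = 0; i < n; i++)
        printf("%-16s -> %08x  (ROR-%u/ADD)\n",
               names[i], (unsigned)api_hash(names[i], r), r);
    return 0;
}
```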
Revisiting pseudo-random index decryption
This article was published in tmp.out #4
Almost twenty-five years ago, Mental Driller discovered an interesting technique that allows a decryption loop to be constructed so that the elements are accessed in a pseudo-random order [1]. The algorithm looks quite simple, but even after it caught the attention of well-known AV researchers [2], they could not explain why it works correctly. In the conclusion of his article, Frederic Perriot wrote:
I suspect the bijectivity of the family of endomorphisms studied here is a general property of some families of functions over well-known groups or fields. If you have a mathematical background and would like to share arithmetic insights on this problem, I'd be grateful if you can send me the explanation...
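Before getting to the math, it may help to see what such a loop looks like in practice. The sketch below is mine and deliberately uses the most familiar full-cycle update, a linear congruential step modulo a power of two, rather than Mental Driller's exact construction; the point is only that the index visits every element exactly once while appearing scrambled:

```c
/* Toy "pseudo-random index" decryptor: the index is advanced by a full-period
 * LCG over 0..255, so each element of the buffer is decrypted exactly once,
 * just not in sequential order. Not Mental Driller's actual construction. */
#include <stdint.h>
#include <stdio.h>

#define N 256                              /* buffer size, a power of two */

int main(void)
{
    uint8_t buf[N];
    for (int i = 0; i < N; i++)            /* "encrypted" data: i XOR key */
        buf[i] = (uint8_t)(i ^ 0xA5);

    uint8_t idx = 0;                       /* index walks a single cycle mod 256 */
    const uint8_t a = 0x4D, c = 0x3B;      /* a % 4 == 1 and c odd -> full period */
    for (int step = 0; step < N; step++) {
        buf[idx] ^= 0xA5;                  /* decrypt one element, out of order */
        idx = (uint8_t)(a * idx + c);      /* next pseudo-random index */
    }

    for (int i = 0; i < N; i++)            /* verify: buffer is back to 0..255 */
        if (buf[i] != (uint8_t)i) { puts("FAIL"); return 1; }
    puts("OK");
    return 0;
}
```

With a congruent to 1 modulo 4 and c odd, the update x -> a*x + c mod 2^n is a permutation consisting of a single cycle of length 2^n, which is what guarantees the whole buffer gets decrypted.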