Abstract

With the rapid development of Internet of Things (IoT) technology, attacks against IoT devices have become more frequent, and IoT security has attracted considerable attention. Because different IoT firmware images commonly reuse the same code, code similarity comparison is an important technique for IoT security analysis. Fuzzy hashing generates a fingerprint of a file by converting the file into an intermediate representation and hashing that representation; comparing fingerprints then yields a similarity estimate for files that are similar but not identical. In this article, we analyze and compare today's most widely used fuzzy hashing tools and classify them in detail. We also analyze the advantages and disadvantages of the algorithms these tools use. Finally, we select several representative fuzzy hashing tools, such as ssdeep and TLSH, and compare their performance experimentally on real firmware datasets.