Perceptual hashing

From Wikipedia, the free encyclopedia
This is an old revision of this page, as edited by Armduino at 20:43, 24 May 2022 (Development: Marr–Hildreth algorithm). It may differ significantly from the current revision.

Perceptual hashing is the use of a fingerprinting algorithm to produce a snippet, or fingerprint, of various forms of multimedia.[1][2] A perceptual hash is a type of locality-sensitive hash: the hashes of two items are similar if the features of the multimedia are similar. This is not to be confused with cryptographic hashing, which relies on the avalanche effect, whereby a small change in the input value produces a drastic change in the output value. Perceptual hash functions are widely used to find cases of online copyright infringement and in digital forensics, because hashes can be correlated with one another, so similar data can be found (for instance, copies that differ only in a watermark).
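The contrast with cryptographic hashing can be illustrated with a minimal sketch. It assumes a toy "average hash" (one bit per pixel, set when the pixel is brighter than the image mean), which is only one simple perceptual hash and is not taken from the cited sources. The 8×8 pixel grids and the function names `average_hash` and `hamming` are hypothetical and chosen for illustration.

```python
import hashlib


def average_hash(pixels):
    """Return a bit string: '1' where a pixel is brighter than the mean, else '0'."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return "".join("1" if p > mean else "0" for p in flat)


def hamming(a, b):
    """Count the positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


# Hypothetical 8x8 grayscale image: a horizontal brightness gradient (0..224).
original = [[x * 32 for x in range(8)] for _ in range(8)]

# Copy with a bright 2x2 "watermark" stamped into the top-left corner.
watermarked = [row[:] for row in original]
for y in range(2):
    for x in range(2):
        watermarked[y][x] = 255

# Perceptual hashes of the two images differ in only a few of the 64 bits.
perceptual_distance = hamming(average_hash(original), average_hash(watermarked))

# Cryptographic digests of the same pixel data are unrelated (avalanche effect).
sha_original = hashlib.sha256(bytes(p for row in original for p in row)).hexdigest()
sha_watermarked = hashlib.sha256(bytes(p for row in watermarked for p in row)).hexdigest()
sha_distance = hamming(sha_original, sha_watermarked)

print("perceptual hash bits changed:", perceptual_distance, "of 64")
print("SHA-256 hex digits changed:", sha_distance, "of 64")
```

In this toy case only the four watermarked pixels flip their bits, so the two perceptual hashes remain close, while the SHA-256 digests give no indication that the inputs are related.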

Development

The 1980 work of Marr and Hildreth is a seminal paper in this field.[3]

The 2010 thesis of Zauner is a well-written introduction to the topic.[4]

As early as 2016, Asgari published work on robust image hash spoofing. Asgari notes that a perceptual hash function, like any other algorithm, is prone to errors.[5]

Characteristics

Research reported in January 2019 at Northumbria University showed that, for video, perceptual hashing can be used simultaneously to identify similar content for video copy detection and to detect malicious manipulations for video authentication. The proposed system performs better than existing video hashing techniques in terms of both identification and authentication.[6]

Research reported in May 2020 by the University of Houston on deep-learning-based perceptual hashing for audio showed better performance than traditional audio fingerprinting methods in detecting similar or copied audio subject to transformations.[7]

Beyond digital forensics, perceptual hashing can be applied in a wide variety of situations, according to research by a Russian group reported in 2019. Much as images are compared to detect copyright infringement, the group found that perceptual hashes could be used to compare and match images in a database. Their proposed algorithm proved not only effective but also more efficient than the standard means of database image searching.[8]
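Matching an image against a database of hashes can be sketched as a nearest-neighbour search under Hamming distance. The example below is a minimal linear scan and makes no attempt to reproduce the software-hardware architecture of the cited work. The 64-bit hash values, the file names, the `find_matches` helper and the distance threshold of 5 are all hypothetical.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def find_matches(query: int, database: dict, max_distance: int = 5) -> list:
    """Return (distance, name) pairs within max_distance of the query, closest first."""
    hits = [(hamming(query, h), name) for name, h in database.items()]
    return sorted((d, name) for d, name in hits if d <= max_distance)


# Hypothetical database mapping image names to precomputed 64-bit perceptual hashes.
database = {
    "sunset.jpg": 0xF0F0F0F0F0F0F0F0,
    "portrait.jpg": 0x00FF00FF00FF00FF,
    "diagram.png": 0x123456789ABCDEF0,
}

# Hash of a query image that differs from sunset.jpg in a single bit.
query = 0xF0F0F0F0F0F0F0F1

matches = find_matches(query, database)
print(matches)
```

A linear scan is adequate for small collections. Larger databases typically need an index designed for Hamming-space search, which is outside the scope of this sketch.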

A Chinese team reported in July 2019 that they had developed a perceptual hashing scheme for the retrieval of encrypted speech, which proved to be effective. Their system was not only more accurate but also more compact.[9]

Apple Inc. reported, as early as August 2021, a child sexual abuse material (CSAM) detection system known as NeuralHash. A technical summary document, which explains the system with copious diagrams and example photographs, states that "Instead of scanning images [on corporate] iCloud [servers], the system performs on-device matching using a database of known CSAM image hashes provided by [the National Center for Missing and Exploited Children] (NCMEC) and other child-safety organizations. Apple further transforms this database into an unreadable set of hashes, which is securely stored on users’ devices."[10]

In an essay entitled "The Problem With Perceptual Hashes", Oliver Kuederle presents a startling collision produced by a piece of commercial neural-network software of the NeuralHash type. A photographic portrait of a real woman (Adobe Stock #221271979) reduces, through the test algorithm, to the same hash as a photograph of a piece of abstract art (from the "deposit photos" database). Both sample images are in commercial databases. Kuederle is concerned about collisions like this: "These cases will be manually reviewed. That is, according to Apple, an Apple employee will then look at your (flagged) pictures... Perceptual hashes are messy. When such algorithms are used to detect criminal activities, especially at Apple scale, many innocent people can potentially face serious problems... Needless to say, I’m quite worried about this."[11]
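How two unrelated images can collide is easy to see with a very simple perceptual hash. The sketch below assumes a toy "difference hash" (one bit per pair of horizontally adjacent pixels, set when brightness increases from left to right). It is not NeuralHash and does not reproduce Kuederle's example. The pixel grids and the name `difference_hash` are hypothetical.

```python
def difference_hash(pixels):
    """One bit per horizontally adjacent pixel pair: '1' if brightness increases."""
    return "".join(
        "1" if right > left else "0"
        for row in pixels
        for left, right in zip(row, row[1:])
    )


# Image A: a steep ramp from black (0) to bright (224) in every row.
ramp = [[x * 28 for x in range(9)] for _ in range(8)]

# Image B: nearly flat rows that brighten by one level per column,
# with a different base brightness in each row.
flat = [[100 + 10 * y + x for x in range(9)] for y in range(8)]

hash_a = difference_hash(ramp)
hash_b = difference_hash(flat)

print("pixel data identical:", ramp == flat)
print("hashes identical:", hash_a == hash_b)
```

The hash keeps only the sign of each brightness change, so any two images that agree on those signs map to the same value, however different they look. Hashes based on neural networks discard information in less obvious ways, but the same principle makes collisions possible.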

References

  1. ^ Buldas, Ahto; Kroonmaa, Andres; Laanoja, Risto (2013). "Keyless Signatures' Infrastructure: How to Build Global Distributed Hash-Trees". In Riis, Nielson H.; Gollmann, D. (eds.). Secure IT Systems. NordSec 2013. Lecture Notes in Computer Science. Vol. 8208. Berlin, Heidelberg: Springer. doi:10.1007/978-3-642-41488-6_21. ISBN 978-3-642-41487-9. ISSN 0302-9743. Keyless Signatures Infrastructure (KSI) is a globally distributed system for providing time-stamping and server-supported digital signature services. Global per-second hash trees are created and their root hash values published. We discuss some service quality issues that arise in practical implementation of the service and present solutions for avoiding single points of failure and guaranteeing a service with reasonable and stable delay. Guardtime AS has been operating a KSI Infrastructure for 5 years. We summarize how the KSI Infrastructure is built, and the lessons learned during the operational period of the service.
  2. ^ Klinger, Evan; Starkweather, David. "pHash.org: Home of pHash, the open source perceptual hash library". pHash.org. Retrieved 2018-07-05. pHash is an open source software library released under the GPLv3 license that implements several perceptual hashing algorithms, and provides a C-like API to use those functions in your own programs. pHash itself is written in C++.
  3. ^ Marr, D.; Hildreth, E. (29 Feb 1980). "Theory of Edge Detection". Proceedings of the Royal Society of London. Series B, Biological Sciences. 207 (1167): 187–217. doi:10.1098/rspb.1980.0020. PMID 6102765.
  4. ^ Zauner, Christoph (July 2010). Implementation and Benchmarking of Perceptual Image Hash Functions (PDF). Upper Austria University of Applied Sciences, Hagenberg Campus.
  5. ^ Asgari, Azadeh Amir (June 2016). Robust image hash spoofing (PDF). Blekinge Institute of Technology.
  6. ^ Khelifi, Fouad; Bouridane, Ahmed (January 2019). "Perceptual Video Hashing for Content Identification and Authentication" (PDF). IEEE Transactions on Circuits and Systems for Video Technology. 29 (1): 50–67. doi:10.1109/TCSVT.2017.2776159. S2CID 55725934.
  7. ^ Báez-Suárez, Abraham; Shah, Nolan; Nolazco-Flores, Juan Arturo; Huang, Shou-Hsuan S.; Gnawali, Omprakash; Shi, Weidong (2020-05-19). "SAMAF: Sequence-to-sequence Autoencoder Model for Audio Fingerprinting". ACM Transactions on Multimedia Computing, Communications, and Applications. 16 (2): 43:1–43:23. doi:10.1145/3380828. ISSN 1551-6857.
  8. ^ Zakharov, Victor; Kirikova, Anastasia; Munerman, Victor; Samoilova, Tatyana (2019). "Architecture of Software-Hardware Complex for Searching Images in Database". 2019 IEEE Conference of Russian Young Researchers in Electrical and Electronic Engineering (EICon Rus). IEEE. pp. 1735–1739. doi:10.1109/EIConRus.2019.8657241. ISBN 978-1-7281-0339-6. S2CID 71152337.
  9. ^ Zhang, Qiu-yu; Zhou, Liang; Zhang, Tao; Zhang, Deng-hai (July 2019). "A retrieval algorithm of encrypted speech based on short-term cross-correlation and perceptual hashing". Multimedia Tools and Applications. 78 (13): 17825–17846. doi:10.1007/s11042-019-7180-9. S2CID 58010160.
  10. ^ "CSAM Detection - Technical Summary" (PDF). Apple Inc. August 2021.
  11. ^ Kuederle, Oliver (n.d.). "THE PROBLEM WITH PERCEPTUAL HASHES". rentafounder.com. Retrieved 23 May 2022.