[pHash-support] Interpretation of hamming distance results

Evan Klinger eklinger at gmail.com
Fri Apr 26 10:42:29 PDT 2013


Nothing in particular makes 0.4 special. It was chosen arbitrarily, based on a test set of images.
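
To make "chosen based on a test set" concrete, here is a minimal sketch of how such a cut-off could be picked empirically. It is not pHash code. It assumes the distance is a normalised bit-difference fraction (the share of hash bits that differ, which says nothing about how much of the image differs), and the names normalized_hamming and pick_threshold are hypothetical helpers invented for this illustration.

```python
def normalized_hamming(hash_a: bytes, hash_b: bytes) -> float:
    """Fraction of bit positions at which two equal-length hashes differ.

    Assumption: this mirrors the general idea of a normalised Hamming
    distance; it is not taken from the pHash source.
    """
    if len(hash_a) != len(hash_b) or not hash_a:
        raise ValueError("hashes must be non-empty and of equal length")
    # XOR leaves a 1 bit wherever the two bytes disagree; count those bits.
    differing_bits = sum(bin(a ^ b).count("1") for a, b in zip(hash_a, hash_b))
    return differing_bits / (8 * len(hash_a))


def pick_threshold(similar, dissimilar):
    """Return the cut-off that misclassifies the fewest labelled pairs.

    `similar` and `dissimilar` are distances measured on a hand-labelled
    test set. A pair is called "similar" when its distance is strictly
    below the cut-off. Ties are resolved in favour of the lowest cut-off.
    """
    candidates = sorted(set(similar) | set(dissimilar))
    best_cutoff, best_errors = None, None
    for cutoff in candidates:
        # Similar pairs at or above the cut-off are missed matches;
        # dissimilar pairs below it are false matches.
        errors = sum(d >= cutoff for d in similar) + sum(d < cutoff for d in dissimilar)
        if best_errors is None or errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff
```

A cut-off found this way only reflects the images in that particular test set, so a different collection could reasonably lead to a different value.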


On Apr 26, 2013, at 10:35 AM, Ricky Huang <rhuang.work at gmail.com> wrote:

> Thanks for the answer, Evan.
> 
> On the website, there's a threshold value (0.4 for the MH algorithm). How is that determined? What makes 0.4 the cut-off between similar and dissimilar?
> 
> 
> 
> On Apr 25, 2013, at 11:03 AM, Evan Klinger <eklinger at phash.org> wrote:
> 
>> Ricky,
>> You cannot read too much into the Hamming distance values. As you said, the lower the number, the higher the likelihood that the two images are the same or similar. You cannot determine a percentage difference from the Hamming distance, nor derive any meaningful contextual information from it.
>> 
>> Thanks
>> Evan
>> 
>> 
>> On Wed, Apr 24, 2013 at 6:58 PM, Ricky Huang <rhuang.work at gmail.com> wrote:
>>> Hello pHash team,
>>> 
>>> I am wondering how one should read the return value of the hamming distance function. While I understand that the smaller the value, the more similar the two hashes (and consequently the images), is there a rule of thumb for interpreting the actual value? E.g., "if hamming distance > 0.50, do not bother with the image".
>>> 
>>> Also on the same topic, what does the ph_hammingdistance2() value actually mean?  E.g., "hamming distance of 0.4 means the hashes are 40% dissimilar", etc.
>>> 
>>> 
>>> Thanks!
>>> 
>>> 
>>> _______________________________________________
>>> pHash-support mailing list
>>> pHash-support at lists.phash.org
>>> http://lists.phash.org/listinfo.cgi/phash-support-phash.org
> 