Computing Canberra distance using Python | Pythontic.com

Computing Canberra distance using Python

Overview:

  • The Canberra distance gives the dissimilarity between two sequences.

  • A given difference between two corresponding elements is weighted more heavily when both elements are small.

    The same difference between larger elements is weighted less. This makes the measure most sensitive when both corresponding elements are close to zero.

  • As the elements move away from zero, the sensitivity to the same difference decreases.

  • The Canberra distance between two sequences is given by

d(x, y) = sum(|xi - yi| / (|xi| + |yi|)), where i runs from 1 to n

    A term in which both xi and yi are zero is taken as 0.

  • Canberra distance is used in object recognition, bioinformatics, clustering, classification and computer security applications.
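The formula in the overview can be sketched directly in plain Python, which helps in seeing what each term contributes. This is a minimal illustration and not the scipy implementation: the function name canberra_distance is hypothetical, and treating a 0/0 term as 0 is an assumption chosen to match the worked example on this page.

```python
def canberra_distance(x, y):
    """Sum of |xi - yi| / (|xi| + |yi|) over corresponding elements."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")

    total = 0.0
    for xi, yi in zip(x, y):
        denominator = abs(xi) + abs(yi)
        # Assumption: when both elements are zero, the 0/0 term counts as 0.
        if denominator != 0:
            total += abs(xi - yi) / denominator
    return total


print("{:.2f}".format(canberra_distance([0, 1, 1, 2, 3, 3],
                                        [0, 1, 2, 3, 2, 4])))  # prints 0.88
```

The explicit loop keeps the zero-denominator check visible; for real workloads a library routine is likely the better choice.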

Example:

For the first pair of sequences in the example program, the Canberra distance is computed as

distance = 0 + 0 + 1/3 + 1/5 + 1/5 + 1/7 ≈ 0.88

The fraction 1/3 comes from the pair (1, 2), whose elements are close to zero. The fraction 1/7 comes from the pair (3, 4), whose elements are further from zero, even though both pairs differ by 1.
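The individual terms of that sum can be reproduced exactly with Python's fractions module. The sketch below is only illustrative: the helper name canberra_terms is hypothetical, it assumes integer inputs (so that Fraction can represent each term exactly), and it assumes a 0/0 term counts as 0.

```python
from fractions import Fraction


def canberra_terms(x, y):
    """Return each |xi - yi| / (|xi| + |yi|) term as an exact fraction."""
    terms = []
    for xi, yi in zip(x, y):
        denominator = abs(xi) + abs(yi)
        # Assumption: a pair of zeros contributes a term of 0.
        if denominator == 0:
            terms.append(Fraction(0))
        else:
            terms.append(Fraction(abs(xi - yi), denominator))
    return terms


terms = canberra_terms([0, 1, 1, 2, 3, 3], [0, 1, 2, 3, 2, 4])
print(" + ".join(str(t) for t in terms))  # prints 0 + 0 + 1/3 + 1/5 + 1/5 + 1/7
print(sum(terms))                         # prints 92/105
```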

# Example Python program that finds the Canberra distance between
# two sequences using the scipy function canberra()
import scipy.spatial.distance as dist

sequence1 = [0, 1, 1, 2, 3, 3]
sequence2 = [0, 1, 2, 3, 2, 4]

# Find the dissimilarity between two sequences
# with many elements closer to zero
distance = dist.canberra(sequence1, sequence2)
print("Canberra distance:{:.2f}".format(distance))

sequence1 = [0, 4, 5, 6, 7, 7]
sequence2 = [0, 4, 6, 7, 6, 8]

# Find the dissimilarity between two sequences
# with most elements away from zero
distance = dist.canberra(sequence1, sequence2)
print("Canberra distance:{:.2f}".format(distance))

Output:

Canberra distance:0.88
Canberra distance:0.31
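The two outputs differ because the second pair of sequences lies further from zero, while the element-wise differences are the same. One way to see this effect in isolation is to add a constant offset to both sequences and watch the distance shrink. The sketch below uses a plain-Python helper rather than scipy; the helper name and the chosen offsets are hypothetical, and a 0/0 term is assumed to count as 0.

```python
def canberra_distance(x, y):
    """Plain-Python Canberra distance; a 0/0 term is assumed to be 0."""
    total = 0.0
    for xi, yi in zip(x, y):
        denominator = abs(xi) + abs(yi)
        if denominator != 0:
            total += abs(xi - yi) / denominator
    return total


base1 = [0, 1, 1, 2, 3, 3]
base2 = [0, 1, 2, 3, 2, 4]

# Shift both sequences by the same offset; the differences stay unchanged.
distances = []
for offset in (0, 4, 40, 400):
    shifted1 = [value + offset for value in base1]
    shifted2 = [value + offset for value in base2]
    distance = canberra_distance(shifted1, shifted2)
    distances.append(distance)
    print("offset {:>3}: {:.4f}".format(offset, distance))
```

Each run should print a smaller distance than the one before it, since the denominators grow with the offset while the numerators do not.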

 


Copyright 2025 © pythontic.com