
Computing Jaccard dissimilarity using Python

Overview:

  • Two objects that are being analysed and grouped may each have several binary attributes. For example: is the fruit colour yellow (Yes or No)? Does the tree grow in rain forests (Yes or No)? Each object under consideration therefore has many properties that it either possesses or does not possess.

  • Comparing the Boolean properties or traits of two objects yields four possible combinations per attribute: 11, 10, 01 and 00. The counts of these combinations are captured in a table called a contingency table, also known as an association table.

  • Several dissimilarity metrics are defined by working with the counts of these combinations. Metrics such as the Dice dissimilarity give more weight to the case where an attribute is True for both objects, i.e., the binary combination '11' in the contingency table.

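  • The Jaccard dissimilarity is given by

    • J(i, j) = (b + c) / (a + b + c)

  • Here a = count(11), b = count(10) and c = count(01). In other words, J(i, j) = (count(10) + count(01)) / (count(11) + count(10) + count(01)). The 00 combinations, where neither object has the attribute, do not enter the formula.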
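
The counting described in the bullets above can be sketched in plain Python. This is a minimal illustration, not the article's own implementation: the function names and the two sample vectors `u` and `v` are hypothetical, and returning 0.0 for two all-zero vectors is an assumption made here to avoid dividing by zero. The Dice variant, (b + c) / (2a + b + c), is included only to show how doubling the weight of the 11 count lowers the result for the same data.

```python
def contingency_counts(u, v):
    """Count the 11, 10, 01 and 00 combinations of two binary vectors."""
    if len(u) != len(v):
        raise ValueError("both feature vectors must have the same length")
    a = b = c = d = 0
    for x, y in zip(u, v):
        if x and y:
            a += 1  # 11: both objects have the trait
        elif x:
            b += 1  # 10: only the first object has the trait
        elif y:
            c += 1  # 01: only the second object has the trait
        else:
            d += 1  # 00: neither object has the trait
    return a, b, c, d


def jaccard_dissimilarity(u, v):
    """(b + c) / (a + b + c); the 00 count is ignored."""
    a, b, c, _ = contingency_counts(u, v)
    if a + b + c == 0:
        return 0.0  # assumption: two all-zero vectors count as identical
    return (b + c) / (a + b + c)


def dice_dissimilarity(u, v):
    """(b + c) / (2a + b + c); the 11 count is weighted twice."""
    a, b, c, _ = contingency_counts(u, v)
    if a + b + c == 0:
        return 0.0  # same assumption as above
    return (b + c) / (2 * a + b + c)


# Hypothetical sample vectors, not taken from the article.
u = [1, 1, 0, 1, 0, 0]
v = [1, 0, 0, 1, 1, 0]

print(contingency_counts(u, v))     # (a, b, c, d)
print(jaccard_dissimilarity(u, v))
print(dice_dissimilarity(u, v))
```

For these sample vectors the counts are a = 2, b = 1, c = 1 and d = 2, so the Jaccard dissimilarity is 2/4 = 0.5 while the Dice dissimilarity is 2/6, roughly 0.33.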

Example:

Species      White   Feeds on Fish   Flies   Pouched Beak   Two legs
Crane        1       1               1       0              1
Pelican      1       1               1       1              1
Snow Bear    1       1               0       0              0

J(Crane, Pelican) = (0 + 1) / (4 + 0 + 1)

J(Crane, Pelican) = 0.2

The Jaccard dissimilarity between Crane and Pelican is 0.2, meaning the two species differ in 20% of the traits that at least one of them possesses.

# Example Python program that finds the Jaccard dissimilarity
# between two species. The example uses the features
# and traits of the species in binary form.
import scipy.spatial.distance as dist

features_crane   = [1, 1, 1, 0, 1]
features_pelican = [1, 1, 1, 1, 1]

jaccard_dissimilarity = dist.jaccard(features_crane, features_pelican)
print("Jaccard dissimilarity between the species Crane and Pelican:")
print(jaccard_dissimilarity)

Output:

Jaccard dissimilarity between the species Crane and Pelican:
0.2
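
The same measure can also be read in terms of sets: if each species is represented by the set of traits it possesses, the Jaccard dissimilarity is the size of the symmetric difference divided by the size of the union. The sketch below applies that reading to all three species from the table, using only the standard library. The names `TRAITS`, `SPECIES`, `trait_set` and `jaccard_from_sets` are made up for this illustration, and the handling of two empty sets is an assumption.

```python
from itertools import combinations

# Column order follows the example table.
TRAITS = ["White", "Feeds on Fish", "Flies", "Pouched Beak", "Two legs"]

SPECIES = {
    "Crane":     [1, 1, 1, 0, 1],
    "Pelican":   [1, 1, 1, 1, 1],
    "Snow Bear": [1, 1, 0, 0, 0],
}


def trait_set(flags):
    """Return the names of the traits whose flag is 1."""
    return {name for name, flag in zip(TRAITS, flags) if flag}


def jaccard_from_sets(s, t):
    """|s symmetric-difference t| / |s union t|."""
    union = s | t
    if not union:
        return 0.0  # assumption: two empty trait sets count as identical
    return len(s ^ t) / len(union)


# Dissimilarity for every unordered pair of species.
pairwise = {
    (x, y): jaccard_from_sets(trait_set(SPECIES[x]), trait_set(SPECIES[y]))
    for x, y in combinations(SPECIES, 2)
}

for (x, y), value in pairwise.items():
    print(f"{x} vs {y}: {value}")
```

With the table values this gives 0.2 for Crane and Pelican, matching the SciPy result, 0.5 for Crane and Snow Bear, and 0.6 for Pelican and Snow Bear.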

 


Copyright 2025 © pythontic.com