string comparison in python

reference Find the similarity percent between two strings

question

 similar("Apple","Appel") => 80%
 similar("Apple","Mango") =>  0%

answer

 from difflib import SequenceMatcher

 def similar(a, b):
     return SequenceMatcher(None, a, b).ratio()

 >>> similar("Apple","Appel")
 0.8
 >>> similar("Apple","Mango")
 0.0
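The question asks for a percentage, while ratio() returns a float between 0 and 1. A minimal sketch of the conversion, assuming one decimal place is enough (the name similar_percent is made up for these notes):

```python
from difflib import SequenceMatcher

def similar_percent(a, b):
    """Similarity of a and b as a percentage between 0.0 and 100.0."""
    # ratio() is a float in [0, 1]; scale it to the "80%" form used in the question.
    return round(SequenceMatcher(None, a, b).ratio() * 100, 1)

print(similar_percent("Apple", "Appel"))
print(similar_percent("Apple", "Mango"))
```

The comparison is case-sensitive, so lowercasing both inputs first may be wanted for user-facing matching.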

reference Fuzzy string comparison in Python, confused with which library to use [closed]

question

 import Levenshtein
 Levenshtein.ratio('hello world', 'hello')

 Result: 0.625

 import difflib
 difflib.SequenceMatcher(None, 'hello world', 'hello').ratio()

 Result: 0.625

answer

 difflib.SequenceMatcher => Ratcliff/Obershelp algorithm
 Levenshtein             => Levenshtein algorithm
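To see why the two libraries can agree on one input and differ on another, here is a rough pure-Python sketch of both ideas. It is not the code of either library: matching_ratio rebuilds the 2*M/T formula from the difflib documentation (M = matched characters, T = total length of both strings), and edit_distance is the textbook dynamic-programming Levenshtein distance. Both function names are made up for these notes, and the normalisation that Levenshtein.ratio applies on top of the distance is not reproduced here.

```python
from difflib import SequenceMatcher

def matching_ratio(a, b):
    """2*M/T, where M is the number of matched characters and T = len(a) + len(b)."""
    matcher = SequenceMatcher(None, a, b)
    # The final matching block is a zero-size sentinel, so it adds nothing to the sum.
    matched = sum(block.size for block in matcher.get_matching_blocks())
    total = len(a) + len(b)
    return 2.0 * matched / total if total else 1.0

def edit_distance(a, b):
    """Minimum number of single-character insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(previous[j] + 1,          # delete ca
                               current[j - 1] + 1,       # insert cb
                               previous[j - 1] + cost))  # substitute or keep
        previous = current
    return previous[-1]

print(matching_ratio('hello world', 'hello'))
print(edit_distance('hello world', 'hello'))
```

The first function rewards long common blocks, the second counts edits, which is why the scores are not guaranteed to match in general.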

FuzzyWuzzy: Fuzzy String Matching in Python

string similarity

 from difflib import SequenceMatcher
 from fuzzywuzzy import fuzz

 m = SequenceMatcher(None, 'new york mets', 'new york meats')
 m.ratio() => 0.9629...

 fuzz.ratio('new york mets', 'new york meats') => 96

partial string similarity

 fuzz.ratio('yankees', 'new york yankees')       => 60
 fuzz.ratio('new york mets', 'new york yankees') => 75

 fuzz.partial_ratio('yankees', 'new york yankees')       => 100
 fuzz.partial_ratio('new york mets', 'new york yankees') => 69
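The idea behind partial matching is to score the shorter string against the best-matching substring of the longer one. A simplified sketch using a plain sliding window follows; this is an assumption made for illustration, and fuzzywuzzy may pick its candidate substrings differently, so scores for inexact matches can differ from the library's. The name partial_ratio_sketch is made up for these notes.

```python
from difflib import SequenceMatcher

def partial_ratio_sketch(a, b):
    """Best 0-100 score of the shorter string against any same-length window of the longer."""
    shorter, longer = (a, b) if len(a) <= len(b) else (b, a)
    if not shorter:
        return 0
    n = len(shorter)
    best = 0.0
    # Slide a window of len(shorter) across the longer string and keep the best ratio.
    for start in range(len(longer) - n + 1):
        window = longer[start:start + n]
        best = max(best, SequenceMatcher(None, shorter, window).ratio())
    return round(100 * best)

print(partial_ratio_sketch('yankees', 'new york yankees'))
```

Because 'yankees' appears verbatim inside the longer string, one window matches it exactly and the score is 100.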

out of order

 fuzz.ratio('new york mets vs atlanta braves', 'atlanta braves vs new york mets')          => 45
 fuzz.partial_ratio('new york mets vs atlanta braves', 'atlanta braves vs new york mets') => 45

 # token sort
 'new york mets vs atlanta braves' --> 'atlanta braves mets new vs york'
 fuzz.token_sort_ratio('new york mets vs atlanta braves', 'atlanta braves vs new york mets') => 100
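The token sort step can be sketched with the standard library alone: split on whitespace, sort the tokens, rejoin, then compare. This is a simplified version; the real library also normalises the strings before tokenising, which is only approximated here with lower(). The names sort_tokens and token_sort_ratio_sketch are made up for these notes.

```python
from difflib import SequenceMatcher

def sort_tokens(s):
    """Lowercase, split on whitespace, sort the tokens alphabetically and rejoin."""
    return ' '.join(sorted(s.lower().split()))

def token_sort_ratio_sketch(a, b):
    """0-100 similarity of the two strings after their tokens have been sorted."""
    return round(100 * SequenceMatcher(None, sort_tokens(a), sort_tokens(b)).ratio())

print(sort_tokens('new york mets vs atlanta braves'))
print(token_sort_ratio_sketch('new york mets vs atlanta braves',
                              'atlanta braves vs new york mets'))
```

Both inputs sort to the same string, so word order no longer affects the score.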

 # token set
 s1 = 'mariners vs angels'
 s2 = 'los angeles angels of anaheim at seattle mariners'
 # t0 = sorted intersection; t1, t2 = t0 + sorted remainder of s1, s2
 t0 = 'angels mariners'
 t1 = 'angels mariners vs'
 t2 = 'angels mariners anaheim angeles at los of seattle'
 fuzz.token_set_ratio('mariners vs angels', 'los angeles angels of anaheim at seattle mariners') => 90

 fuzz.token_set_ratio('sirhan, sirhan', 'sirhan') => 100
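A rough sketch of the token set idea: build the sorted intersection of the two token sets, append each string's leftover tokens to it, and take the best pairwise score. Assumptions here: tokens are extracted with the regex \w+ after lowercasing, and scores are rounded, so results for non-identical strings may be off by a point from the library's. The function names are made up for these notes.

```python
import re
from difflib import SequenceMatcher

def _ratio(a, b):
    """0-100 similarity from difflib, rounded to the nearest integer."""
    return round(100 * SequenceMatcher(None, a, b).ratio())

def token_set_strings(s1, s2):
    """Return (t0, t1, t2): sorted intersection, then intersection + each remainder."""
    tokens1 = set(re.findall(r"\w+", s1.lower()))
    tokens2 = set(re.findall(r"\w+", s2.lower()))
    common = ' '.join(sorted(tokens1 & tokens2))
    rest1 = ' '.join(sorted(tokens1 - tokens2))
    rest2 = ' '.join(sorted(tokens2 - tokens1))
    # strip() drops the stray space when the intersection or a remainder is empty.
    return common, (common + ' ' + rest1).strip(), (common + ' ' + rest2).strip()

def token_set_ratio_sketch(s1, s2):
    """Best score among the three pairwise comparisons of t0, t1 and t2."""
    t0, t1, t2 = token_set_strings(s1, s2)
    return max(_ratio(t0, t1), _ratio(t0, t2), _ratio(t1, t2))

print(token_set_strings('mariners vs angels',
                        'los angeles angels of anaheim at seattle mariners'))
print(token_set_ratio_sketch('sirhan, sirhan', 'sirhan'))
```

Duplicated tokens collapse in the set, which is why 'sirhan, sirhan' and 'sirhan' come out identical.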

references

distance
source code
1. Levenshtein.c
2. fuzzywuzzy
doc
1. difflib