R: regex to exclude 2 consecutive variations
I am trying to do fuzzy matching (in R), and I want to make rules about how many consecutive variations are allowed. For example, if I use the Levenshtein distance and the distance is no greater than 2, I want to exclude matches where these 2 variations happen next to each other.
An example:
If I am trying to match against the string "james madison":
- "jame madisan" should produce a match, with distance = 2
- "jans madison" also has distance = 2 but should not produce a hit, because of the 2 consecutive variations ("n" needs to be changed to "m", and "e" must be inserted before "s" in "james")
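One possible way to express this rule, as a rough sketch rather than a definitive answer: base R's `adist()` can return the edit sequence for each comparison in its "trafos" attribute when called with `counts = TRUE`, and a regex can then be run on that sequence instead of on the names themselves. The variable names and the `max_dist` threshold below are assumptions made for illustration.

```r
# Target string and candidate matches taken from the example above.
target <- "james madison"
candidates <- c("jame madisan", "jans madison")

# adist() ships with base R (utils). With counts = TRUE it also returns,
# in the "trafos" attribute, one edit sequence per (candidate, target) pair.
d <- adist(candidates, target, counts = TRUE)
trafos <- attr(d, "trafos")[, 1]

# Each edit sequence is a string of M (match), I (insertion),
# D (deletion) and S (substitution). Two or more adjacent non-M
# letters mean that two variations sit next to each other.
has_consecutive <- grepl("[IDS]{2,}", trafos)

# Hypothetical threshold: keep matches within distance 2 that have
# no consecutive variations.
max_dist <- 2
keep <- d[, 1] <= max_dist & !has_consecutive

result <- data.frame(candidate = candidates,
                     distance = d[, 1],
                     trafos = trafos,
                     keep = keep)
print(result)
```

With these inputs both candidates have distance 2, but only "jame madisan" is kept. Note that `adist()` reports a single optimal alignment per pair, so when several alignments tie, the reported edit sequence may not be the one you have in mind. Changing the quantifier (for example `{3,}`) would allow two adjacent variations but reject three.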