Fuzzy string matching

The big picture: Fuzzy string matching

text data
fuzzy matching
stringdist

An overview of fuzzy string matching problems and solutions. After reading this article you will know in which situations fuzzy string matching can be helpful, and you will be familiar with variations such as fuzzy deduplication and record linkage.

Fuzzy matching packages

text data
fuzzy matching
stringdist

Which packages help us with fuzzy matching? We are going to explore stringdist, tidystringdist, fuzzyjoin, inexact, refinr, fuzzywuzzyR, and lingmatch.

Fuzzy matching example with company names

text data
fuzzy matching
stringdist

Whenever you have text data that was entered manually by a human, there is a chance that it contains errors: typos, abbreviations, or different spellings can all be challenges for your analysis. Fuzzy matching is a way to find inexact matches that mean the same thing, like mcdonalds, McDonalds and McDonald's Company.
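As a rough illustration of the idea, here is a minimal Python sketch that scores how similar two company names are. It is not the stringdist-based approach the article uses; the helper names `normalize`, `levenshtein` and `similarity` are hypothetical, and the choice of a normalized edit distance is an assumption made only for this example.

```python
import re


def normalize(name: str) -> str:
    """Lowercase and drop everything except letters, digits and spaces."""
    return re.sub(r"[^a-z0-9 ]", "", name.lower()).strip()


def levenshtein(a: str, b: str) -> int:
    """Minimum number of insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(min(
                previous[j] + 1,         # delete ca
                current[j - 1] + 1,      # insert cb
                previous[j - 1] + cost,  # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def similarity(a: str, b: str) -> float:
    """Score in [0, 1]; 1.0 means identical after normalization."""
    a, b = normalize(a), normalize(b)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


if __name__ == "__main__":
    for candidate in ["McDonalds", "McDonald's Company", "Burger King"]:
        print(candidate, round(similarity("mcdonalds", candidate), 2))
```

In a sketch like this, a pair would count as a match when its score exceeds a threshold. That threshold has to be tuned on the data at hand, since longer variants such as McDonald's Company score noticeably lower than pure spelling variants.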

Tracking COVID in Germany

covid
germany
fuzzy matching
spatial

Animations can help to show events over time. I found data from the RKI about daily COVID cases in Germany and describe the process of creating the animation. It involves fuzzy matching, because the names of the counties (Landkreise) are not identical in the RKI data and in the shapefile I used.
