All matching types are case-insensitive, as we lowercase both your input and the list records before matching.
You can learn more about the data pre-processing we do here.
Fuzzy matching
Fuzzy matching allows for slight variations and misspellings in names. This matching generates results with a fuzziness percentage indicating how closely the names match. You can see an example hit here.
Several different algorithms are used at different times to achieve the best matching results. You can read more about the algorithms below.
The fuzziness threshold can be any value between 80 and 100; the default and recommended value is 95.
In the search step of the screening process, we use the following algorithms:
Levenshtein Distance: This algorithm allows for controlled fuzziness in matches, adjusting the permissible edit distances based on word length.
Soundex: This algorithm focuses on phonetic similarities, matching names that sound alike even if they are spelled differently.
Elastic's Relevance Algorithm: This algorithm evaluates the relevance of matches, considering factors such as name length, uniqueness, and the number of matches.
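As a rough illustration of the first two search-step algorithms, the sketch below implements a plain Levenshtein distance with a length-dependent edit budget, and American Soundex. The edit-budget tiers in `allowed_edits` are an assumption modeled on Elasticsearch's `AUTO` fuzziness setting, not the product's actual configuration, and all function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]


def allowed_edits(word_length: int) -> int:
    """Assumed tiers (as in Elasticsearch AUTO): longer words tolerate more edits."""
    if word_length <= 2:
        return 0
    if word_length <= 5:
        return 1
    return 2


def fuzzy_token_match(input_token: str, list_token: str) -> bool:
    """True if the tokens are within the edit budget for the input token's length."""
    a, b = input_token.lower(), list_token.lower()
    return levenshtein(a, b) <= allowed_edits(len(a))


# Soundex digit for each coded consonant; vowels, y, h and w carry no code.
_SOUNDEX_CODES = {
    letter: digit
    for digit, letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                           "4": "l", "5": "mn", "6": "r"}.items()
    for letter in letters
}


def soundex(name: str) -> str:
    """American Soundex: first letter plus three digits, zero-padded."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    encoded = letters[0].upper()
    last_code = _SOUNDEX_CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w do not separate consonants with the same code
        code = _SOUNDEX_CODES.get(c, "")
        if code and code != last_code:
            encoded += code
        last_code = code  # a vowel resets the code, so repeats are coded again
    return (encoded + "000")[:4]


print(fuzzy_token_match("Salvadore", "Salvatore"))  # one substitution, within budget
print(soundex("Robert"), soundex("Rupert"))         # both encode to R163
```

Names that share a Soundex code, such as "Robert" and "Rupert", would be treated as phonetic candidates even though their spelling differs.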
In the scoring step, we use the following algorithm:
Jaro-Winkler: This scoring algorithm quantifies the similarity between data entries.
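To make the scoring step more concrete, here is a minimal sketch of the textbook Jaro-Winkler similarity, scaled to a 0-100 percentage so it can be compared with a fuzziness threshold. The prefix scale of 0.1 and the maximum prefix length of 4 are the commonly used defaults; whether the product uses these exact parameters is an assumption, and the function names are hypothetical.

```python
def jaro(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Characters count as matching only within this distance of each other.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        for j in range(max(0, i - window), min(len2, i + window + 1)):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Count matched characters that appear in a different order.
    k = 0
    out_of_order = 0
    for i in range(len1):
        if matched1[i]:
            while not matched2[k]:
                k += 1
            if s1[i] != s2[k]:
                out_of_order += 1
            k += 1
    transpositions = out_of_order // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler(s1: str, s2: str, prefix_scale: float = 0.1) -> float:
    """Jaro similarity boosted for a shared prefix of up to 4 characters."""
    score = jaro(s1, s2)
    prefix = 0
    for a, b in zip(s1[:4], s2[:4]):
        if a != b:
            break
        prefix += 1
    return score + prefix * prefix_scale * (1 - score)


def similarity_percent(input_value: str, list_value: str) -> float:
    """Case-insensitive similarity on a 0-100 scale."""
    return 100 * jaro_winkler(input_value.lower(), list_value.lower())


score = similarity_percent("Martha", "Marhta")
print(round(score, 1), score >= 95)  # compare against a threshold of 95
```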
Fuzzy matching is ideal for screening names, such as company_name, sender_name, and similar fields where minor discrepancies are common.
Prefix matching
This matching logic looks for an exact match of a prefix (from the start of a token/namepart). For example, selecting a length of 8 means that the first 8 characters must match.
NB! When using Dow Jones lists, the prefix matching type only checks bank BIC codes. This means that prefix matching is currently limited to Dow Jones BIC information. However, with Custom lists, there are no such limitations.
Prefix matching is best suited for scenarios like screening bank BIC codes, where initial characters need to match precisely.
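The prefix rule can be sketched as a comparison of the first N characters of two lowercased values. The function name, the made-up BIC-like codes, and the choice to reject values shorter than the configured length are all assumptions for illustration.

```python
def prefix_match(input_value: str, list_value: str, length: int = 8) -> bool:
    """True if the first `length` characters of both values are identical."""
    a, b = input_value.lower(), list_value.lower()
    # Assumption: a value shorter than the configured length cannot match.
    if len(a) < length or len(b) < length:
        return False
    return a[:length] == b[:length]


# Made-up codes: the first 8 characters agree, only the branch suffix differs.
print(prefix_match("EXMPGB2LXXX", "exmpgb2l123", length=8))
```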
Exact matching
Every word from your input (the screened field) needs to match a word from the target (list record), for example:
Example: target - Salvadore Criminale, input - Salvadore von Maria de Criminale - no match, as "von Maria de" are not in the target.
Example: target - Salvadore von Maria de Criminale, input - Salvadore Criminale - match, as all input words are in the target and match exactly.
As the name suggests, this matching logic looks for exact matches, i.e. no fuzziness.
The order of words in the screened field does not matter, i.e. John Smith will match with Smith John.
NB! When using Dow Jones lists, the exact matching type only checks bank names. This means that exact matching is currently limited to Dow Jones bank name information. However, with Custom lists, there are no such limitations.
Exact matching is useful when absolute precision is required, such as when dealing with exact identifiers or cases where any variation is unacceptable.
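The exact matching rule described above behaves like a subset test on lowercased word sets. The following is a minimal sketch under that reading; the function name and the whitespace-based tokenization are assumptions.

```python
def exact_match(input_value: str, target_value: str) -> bool:
    """True if every input word appears, unchanged, among the target words."""
    input_tokens = set(input_value.lower().split())
    target_tokens = set(target_value.lower().split())
    # Sets ignore word order, so "John Smith" and "Smith John" are equivalent.
    return bool(input_tokens) and input_tokens <= target_tokens


print(exact_match("Salvadore von Maria de Criminale", "Salvadore Criminale"))  # False
print(exact_match("Salvadore Criminale", "Salvadore von Maria de Criminale"))  # True
print(exact_match("John Smith", "Smith John"))                                 # True
```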
Contains matching
Binary Relevance: This algorithm ensures that every token from the list is present in the input. However, some variations are allowed using:
Levenshtein Distance: Allows for controlled fuzziness, similar to what is used in fuzzy matching, but with different settings.
Example:
Target: Salvadore Criminale
Input: Salvadore von Maria de Criminale → match, as all target words are in the input.
Input: Salvadore von Maria de Criminalist → no match, as "Criminale" is not in the input.
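The token-presence check can be sketched as follows: every target token must have some input token within a small edit distance. The budget of one edit per token is an assumption chosen for illustration (the actual settings are not stated here), and the function names are hypothetical.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(previous[j] + 1,
                               current[j - 1] + 1,
                               previous[j - 1] + cost))
        previous = current
    return previous[-1]


def contains_match(input_value: str, target_value: str, max_edits: int = 1) -> bool:
    """True if each target token is present in the input, allowing small edits."""
    input_tokens = input_value.lower().split()
    target_tokens = target_value.lower().split()
    if not target_tokens:
        return False
    return all(
        any(levenshtein(target, candidate) <= max_edits for candidate in input_tokens)
        for target in target_tokens
    )


target = "Salvadore Criminale"
print(contains_match("Salvadore von Maria de Criminale", target))     # True
print(contains_match("Salvadore von Maria de Criminalist", target))   # False
```

"Criminalist" is three edits away from "Criminale", so it falls outside the assumed one-edit budget and the second input does not match.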
Once the matching tokens (words) are identified, the Contains logic applies a similarity score (using the same algorithm as fuzzy matching) to assess how closely the input matches the target. If the similarity score falls below the defined threshold, the match is filtered out.
Example:
Target: Abu Hali
Input: Anu von Maria Tali
The matched portion is extracted → Abu Hali vs. Anu Tali
A similarity score is calculated (e.g., 82).
If the score is below the threshold (e.g., 95), no alert is generated, as the names are not sufficiently similar.
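The extract-then-score flow in this example can be sketched with Python's standard `difflib` module. Note that `difflib` is a stand-in here: it is neither the Levenshtein token matcher nor the Jaro-Winkler scorer described above, so the score it produces will differ from the 82 quoted in the example. The function names and the 0.6 token cutoff are assumptions.

```python
from difflib import SequenceMatcher, get_close_matches
from typing import Optional


def extract_matched_portion(input_value: str, target_value: str,
                            cutoff: float = 0.6) -> Optional[str]:
    """Pick, for each target token, the closest input token (stand-in matcher)."""
    input_tokens = input_value.lower().split()
    matched = []
    for target_token in target_value.lower().split():
        closest = get_close_matches(target_token, input_tokens, n=1, cutoff=cutoff)
        if not closest:
            return None  # a target token has no counterpart in the input
        matched.append(closest[0])
    return " ".join(matched) if matched else None


def contains_alert(input_value: str, target_value: str,
                   threshold: float = 95.0) -> Optional[float]:
    """Return the similarity score if it reaches the threshold, else None."""
    portion = extract_matched_portion(input_value, target_value)
    if portion is None:
        return None
    # Stand-in similarity on a 0-100 scale (not Jaro-Winkler).
    score = 100 * SequenceMatcher(None, portion, target_value.lower()).ratio()
    return score if score >= threshold else None


print(extract_matched_portion("Anu von Maria Tali", "Abu Hali"))  # anu tali
print(contains_alert("Anu von Maria Tali", "Abu Hali"))           # None: below 95
```

With a threshold of 95, the extracted portion is too dissimilar from the target, so no alert would be produced, which mirrors the outcome in the example.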
Contains matching is usually used for reference field screening in Transaction Screening Flows. Using Contains matching ensures that important list record names (e.g. name of a sanctioned person) are included in the reference field (or other similar fields), even if the field includes extraneous or unrelated information.
Fuzzy contains matching
Fuzzy contains combines the logic of both fuzzy matching and contains matching, meaning both methods are applied to see if either yields a match. This approach is particularly effective for handling imperfect or 'dirty' data. However, since it uses two matching methods, it may generate a higher number of alerts compared to other types of matching.
If results are produced from both matching methods, we consolidate those hits into a single alert.
The same fuzzy matching threshold applies to both the fuzzy and contains parts of the matching logic.
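The combination logic can be sketched as running two scorers against the same threshold and merging whatever they return into one alert. The `Alert` structure and the scorer signature are hypothetical; the two scorers are passed in as plain functions so that any fuzzy or contains implementation can be plugged in.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional

# A scorer returns a 0-100 similarity, or None when it finds no candidate.
Scorer = Callable[[str, str], Optional[float]]


@dataclass
class Alert:
    input_value: str
    target_value: str
    hits: Dict[str, float] = field(default_factory=dict)  # method name -> score


def fuzzy_contains(input_value: str, target_value: str,
                   fuzzy: Scorer, contains: Scorer,
                   threshold: float = 95.0) -> Optional[Alert]:
    """Apply both methods with one shared threshold; merge hits into one alert."""
    alert = Alert(input_value, target_value)
    for method, scorer in (("fuzzy", fuzzy), ("contains", contains)):
        score = scorer(input_value, target_value)
        if score is not None and score >= threshold:
            alert.hits[method] = score
    # A single alert is produced whether one method or both methods matched.
    return alert if alert.hits else None


# Stub scorers for illustration only.
alert = fuzzy_contains("Payment to Salvadore Criminale", "Salvadore Criminale",
                       fuzzy=lambda i, t: 71.0, contains=lambda i, t: 100.0)
print(alert.hits if alert else None)  # only the contains method reached 95
```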
Fuzzy contains matching is recommended for name screening when data quality is not optimal and may include additional or extraneous words. This matching type is ideal for scenarios where names might be surrounded by extra text or inconsistently formatted, ensuring that key names from the list are accurately detected even with minor variations or added words.