How Contact Name Normalization Works
Normalization Strategies
Name normalization is modular and configurable. Each strategy can be set independently, and multiple strategies can be combined for robust normalization. The main strategies are:
Casing Strategy
Character Strategy
Spacing Strategy
Block Removal Strategy
Each strategy is described below, with possible values and examples.
Casing Strategy (default: "identity")
Specifies how the casing of the name should be normalized.

| Value | Description | Input | Output |
| --- | --- | --- | --- |
| identity | No changes to the casing. | Jean | Jean |
| uppercase | Converts all characters to uppercase. | jean | JEAN |
| lowercase | Converts all characters to lowercase. | Jean | jean |
| capitalize | Capitalizes the first letter of each word. | jean smith | Jean Smith |
| name | Capitalizes the first letter of each word and handles name particles. | jean mcdoNALD | Jean McDonald |
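The casing values can be illustrated with a minimal Python sketch. The helper names `apply_casing` and `_name_case` are hypothetical and not part of the product. The sketch assumes that "capitalize" leaves the remaining letters untouched and that "name" lowercases each word first and treats "Mc" as the only particle; the actual particle rules are not documented here.

```python
def _name_case(word: str) -> str:
    """Capitalize one word, with a special case for the "Mc" prefix."""
    word = word.lower()
    cased = word[:1].upper() + word[1:]
    # Assumption: "Mc" is the only name particle handled in this sketch.
    if cased.startswith("Mc") and len(cased) > 2:
        cased = "Mc" + cased[2].upper() + cased[3:]
    return cased


def apply_casing(name: str, strategy: str = "identity") -> str:
    """Hypothetical helper mirroring the casing strategy values."""
    if strategy == "identity":
        return name
    if strategy == "uppercase":
        return name.upper()
    if strategy == "lowercase":
        return name.lower()
    if strategy == "capitalize":
        # Only the first letter of each space-separated word is changed.
        return " ".join(w[:1].upper() + w[1:] for w in name.split(" "))
    if strategy == "name":
        return " ".join(_name_case(w) for w in name.split(" "))
    raise ValueError(f"unknown casing strategy: {strategy!r}")


print(apply_casing("jean mcdoNALD", "name"))  # Jean McDonald
```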
Character Strategy (default: "identity")
Defines the strategy for character validation and filtering.

| Value | Description | Input | Output |
| --- | --- | --- | --- |
| identity | No changes to the characters. | Je1an-朙-李@#. | Je1an-朙-李@#. |
| latin-name | Keeps only Latin alphabet (with accents) and valid name punctuation. Removes symbols, numbers, emojis, etc. | Je1an-朙.#️⃣ | Jean- |
| universal-name | Keeps all alphabets (Latin, Cyrillic, Chinese, etc.) and valid name punctuation. Removes symbols, numbers, emojis. | Je1an-朙-李.#️⃣ | Jean-朙-李 |
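One way to approximate the character filters is with Python's standard `unicodedata` module. This is a sketch under assumptions: `filter_characters` and `NAME_PUNCTUATION` are hypothetical names, and the set of "valid name punctuation" (hyphen, apostrophe, space) is a guess based on the examples, where the hyphen is kept and the period is removed.

```python
import unicodedata

# Assumption: hyphen, apostrophe and space count as valid name punctuation.
NAME_PUNCTUATION = {"-", "'", " "}


def _is_latin_letter(ch: str) -> bool:
    """True for letters whose Unicode name starts with LATIN (accents included)."""
    return ch.isalpha() and unicodedata.name(ch, "").startswith("LATIN")


def filter_characters(name: str, strategy: str = "identity") -> str:
    """Hypothetical helper mirroring the character strategy values."""
    if strategy == "identity":
        return name
    # Compose accents so that "e" + combining acute becomes one letter.
    name = unicodedata.normalize("NFC", name)
    if strategy == "latin-name":
        keep = _is_latin_letter
    elif strategy == "universal-name":
        keep = str.isalpha  # letters from any alphabet
    else:
        raise ValueError(f"unknown character strategy: {strategy!r}")
    return "".join(ch for ch in name if keep(ch) or ch in NAME_PUNCTUATION)
```

Digits, symbols, and the code points that make up emoji are not letters, so both filters drop them; only the "latin-name" filter additionally drops non-Latin letters.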
Spacing Strategy (default: "identity")
Specifies how spacing should be normalized in the name.

| Value | Description | Input | Output |
| --- | --- | --- | --- |
| identity | No changes to spacing. | Jean | Jean |
| trim | Trims leading and trailing spaces. | " Jean " | "Jean" |
| normalize | Replaces multiple spaces with a single space. | "  Jean   Smith  " | " Jean Smith " |
| clean | Combines both trim and normalize. | "  Jean   Smith  " | "Jean Smith" |
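The spacing values map onto a few lines of Python. The sketch below is illustrative only: `apply_spacing` is a hypothetical name, and it assumes that "spaces" means the plain space character rather than all whitespace.

```python
import re


def apply_spacing(name: str, strategy: str = "identity") -> str:
    """Hypothetical helper mirroring the spacing strategy values."""
    if strategy == "identity":
        return name
    if strategy == "trim":
        # Remove leading and trailing spaces only.
        return name.strip(" ")
    if strategy == "normalize":
        # Collapse each run of two or more spaces into one space.
        return re.sub(r" {2,}", " ", name)
    if strategy == "clean":
        # normalize followed by trim.
        return re.sub(r" {2,}", " ", name).strip(" ")
    raise ValueError(f"unknown spacing strategy: {strategy!r}")


print(repr(apply_spacing("  Jean   Smith  ", "normalize")))  # ' Jean Smith '
print(repr(apply_spacing("  Jean   Smith  ", "clean")))      # 'Jean Smith'
```

Note that "normalize" alone still leaves one leading and one trailing space, which is why "clean" exists as a separate value.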
Block Removal Strategy (default: "identity")
Defines how blocks of text (such as titles, text within parentheses, or text after a comma) should be removed.

| Value | Description | Input | Output |
| --- | --- | --- | --- |
| identity | No block removal. | Dr. Jean Smith (Ph.D), France | Dr. Jean Smith (Ph.D), France |
| remove | Removes titles, text within parentheses, and text after a comma. | Dr. Jean Smith (Ph.D), France | Jean Smith |
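Block removal can be sketched with regular expressions. Everything named here is hypothetical: `remove_blocks`, the short title list in `TITLE_PATTERN`, and the choice to strip the whitespace that removed blocks leave behind are assumptions made so the sketch reproduces the documented example.

```python
import re

# Assumption: a small illustrative title list; the real list is not documented here.
TITLE_PATTERN = re.compile(r"^\s*(?:dr|mr|mrs|ms|prof)\.?\s+", re.IGNORECASE)
PARENTHESES_PATTERN = re.compile(r"\([^)]*\)")


def remove_blocks(name: str, strategy: str = "identity") -> str:
    """Hypothetical helper mirroring the block removal strategy values."""
    if strategy == "identity":
        return name
    if strategy != "remove":
        raise ValueError(f"unknown block removal strategy: {strategy!r}")
    name = PARENTHESES_PATTERN.sub("", name)  # drop "(Ph.D)"
    name = name.split(",", 1)[0]              # drop everything after the first comma
    name = TITLE_PATTERN.sub("", name)        # drop a leading title such as "Dr."
    # Assumption: whitespace left behind by the removed blocks is stripped.
    return name.strip()


print(remove_blocks("Dr. Jean Smith (Ph.D), France", "remove"))  # Jean Smith
```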
Combining Strategies
You can combine any of the above strategies to achieve the desired normalization. For example, to strongly normalize a name, you might use:
casing_strategy = "name"
char_strategy = "latin-name"
spacing_strategy = "clean"
block_removal_strategy = "remove"
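If these settings are handled from Python code, one option is to group them in a small validated container. This is a sketch rather than the product's API: `NormalizationConfig` and `ALLOWED_VALUES` are hypothetical names, while the field names and permitted values are taken from this page.

```python
from dataclasses import dataclass

# Permitted values per setting, as listed on this page.
ALLOWED_VALUES = {
    "casing_strategy": {"identity", "uppercase", "lowercase", "capitalize", "name"},
    "char_strategy": {"identity", "latin-name", "universal-name"},
    "spacing_strategy": {"identity", "trim", "normalize", "clean"},
    "block_removal_strategy": {"identity", "remove"},
}


@dataclass(frozen=True)
class NormalizationConfig:
    """Hypothetical container; every strategy defaults to "identity"."""

    casing_strategy: str = "identity"
    char_strategy: str = "identity"
    spacing_strategy: str = "identity"
    block_removal_strategy: str = "identity"

    def __post_init__(self) -> None:
        # Reject values that are not documented for a given strategy.
        for field_name, allowed in ALLOWED_VALUES.items():
            value = getattr(self, field_name)
            if value not in allowed:
                raise ValueError(f"{field_name}={value!r} is not one of {sorted(allowed)}")


strong = NormalizationConfig(
    casing_strategy="name",
    char_strategy="latin-name",
    spacing_strategy="clean",
    block_removal_strategy="remove",
)
print(strong)
```

Validating at construction time surfaces a misspelled value such as "latin_name" immediately instead of silently falling back to no normalization.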
Examples
Casing Strategy
uppercase: jean → JEAN
capitalize: jean smith → Jean Smith
name: jean mcdoNALD → Jean McDonald
Character Strategy
latin-name: Je1an-朙.#️⃣ → Jean-
universal-name: Je1an-朙-李.#️⃣ → Jean-朙-李
Spacing Strategy
trim: " Jean " → "Jean"
normalize: "  Jean   Smith  " → " Jean Smith "
clean: "  Jean   Smith  " → "Jean Smith"
Block Removal Strategy
remove: Dr. Jean Smith (Ph.D), France → Jean Smith