What are the Data Quality dimensions for Contact Name?
Learn how Delpha evaluates Name fields using six data quality dimensions: completeness, validity, uniqueness, consistency, accuracy, and timeliness.
Understanding Delpha’s Data Quality Dimensions for Name
Delpha evaluates each field using key data quality dimensions. Below is how each dimension applies to the Name field, helping users interpret scores and drive cleanup efforts.
Data Quality Dimensions Explained
Completeness
Question: Are both first and last names present?
Computation: 1 if both are present, 0 if either is missing or empty.
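The completeness rule above can be sketched in a few lines. This is only an illustration, not Delpha's implementation: the function name `completeness` is hypothetical, and treating whitespace-only values as empty is an assumption.

```python
def completeness(first_name, last_name):
    """Return 1 if both names are present and non-empty, otherwise 0."""
    def present(value):
        # Assumption: None and whitespace-only strings count as missing.
        return value is not None and value.strip() != ""

    return 1 if present(first_name) and present(last_name) else 0
```

For example, `completeness("Jean", "Dupont")` yields 1, while `completeness("Jean", "")` yields 0.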
Validity
Question: Do both names conform to valid character and pattern rules?
Computation: 1 if both are valid, 0 if either is invalid.
Example:
Input: Jean → Valid
Input: J3an! → Invalid
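A validity check of this kind might look like the sketch below. The exact character and pattern rules are not stated here, so the pattern is an assumption: Unicode letters, optionally joined by a single space, hyphen, or apostrophe. The names `NAME_PATTERN`, `is_valid_name`, and `validity` are hypothetical.

```python
import re

# Assumed rule: one or more Unicode letters, optionally joined by a single
# space, hyphen, or apostrophe (e.g. "Jean-Pierre", "O'Neil").
# [^\W\d_] matches any word character that is not a digit or underscore.
NAME_PATTERN = re.compile(r"[^\W\d_]+(?:[ '\-][^\W\d_]+)*")

def is_valid_name(name):
    """Return True if the whole name matches the assumed pattern."""
    return bool(name) and NAME_PATTERN.fullmatch(name) is not None

def validity(first_name, last_name):
    """Return 1 if both names are valid, 0 if either is invalid."""
    return 1 if is_valid_name(first_name) and is_valid_name(last_name) else 0
```

With this pattern, "Jean" passes and "J3an!" fails, which matches the example above.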
Uniqueness
Not applicable to the Name field.
Consistency
Computation: 1 if both first and last names are unchanged after normalization, 0 otherwise.
Example:
Input: Jean → Normalized: Jean → Consistency: 1
Input: Jéan → Normalized: Jean (if accents are removed) → Consistency: 0
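The consistency check compares each name with its normalized form. The sketch below assumes that normalization means trimming whitespace and removing accents. The actual normalization rules may differ, and the names `normalize` and `consistency` are hypothetical.

```python
import unicodedata

def normalize(name):
    """Assumed normalization: trim whitespace and strip accents."""
    # NFKD splits accented letters into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", name.strip())
    # Drop the combining marks, keeping only the base characters.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def consistency(first_name, last_name):
    """Return 1 if both names are unchanged by normalization, otherwise 0."""
    unchanged = normalize(first_name) == first_name and normalize(last_name) == last_name
    return 1 if unchanged else 0
```

Under these assumptions, "Jean" is unchanged (score 1), while "Jéan" normalizes to "Jean" and scores 0.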
Accuracy
Question: How likely is the name to be correct (not reversed or misspelled)?
Reversed Names Detection
Purpose: Detects if the first and last names are likely swapped.
Method:
Uses Bayesian/probabilistic scoring based on the frequency of each name as a first or last name.
If the reversed score exceeds a threshold (e.g., 0.7), the names are considered reversed.
Edge Cases:
Handles ambiguous names (common as both first and last names) with probabilistic logic.
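One way to picture the probabilistic scoring is the sketch below. It is not Delpha's model. The frequency tables are toy data, the add-one smoothing and equal priors are assumptions, and all names (`FIRST_COUNTS`, `LAST_COUNTS`, `p_first`, `reversed_score`, `is_reversed`) are hypothetical. Only the 0.7 threshold comes from the description above.

```python
# Toy frequency tables (hypothetical): how often each name appears
# as a first name and as a last name in some reference data.
FIRST_COUNTS = {"jean": 900, "martin": 300, "dupont": 5}
LAST_COUNTS = {"jean": 100, "martin": 700, "dupont": 995}

def p_first(name, alpha=1.0):
    """Smoothed probability that a name is a first name rather than a last name."""
    key = name.strip().lower()
    f = FIRST_COUNTS.get(key, 0)
    l = LAST_COUNTS.get(key, 0)
    # Add-alpha smoothing keeps unknown names at 0.5 and avoids zeros.
    return (f + alpha) / (f + l + 2 * alpha)

def reversed_score(first_name, last_name):
    """Posterior probability that the two names are swapped (equal priors assumed)."""
    as_given = p_first(first_name) * (1 - p_first(last_name))
    swapped = (1 - p_first(first_name)) * p_first(last_name)
    return swapped / (as_given + swapped)

def is_reversed(first_name, last_name, threshold=0.7):
    """Flag the record when the reversed score exceeds the threshold."""
    return reversed_score(first_name, last_name) > threshold
```

With the toy tables, a record entered as first name "Dupont" and last name "Jean" is flagged as reversed, while "Jean" / "Dupont" is not. Two names absent from both tables score exactly 0.5, so they stay below the threshold.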
Misspelled Names Detection
Purpose: Identifies likely misspellings in either name.
Method:
Uses phonetic algorithms (e.g., Match Rating Codex, NYSIIS, Beider-Morse) and string comparison against a database of common names.
If either name is likely misspelled, the record is flagged.
Edge Cases:
Handles accented, Unicode, and strongly normalized forms for robust detection.
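The string-comparison part of this method can be illustrated with Python's standard library. This sketch deliberately leaves out the phonetic step (Match Rating Codex, NYSIIS, Beider-Morse) and uses `difflib` similarity instead. The name list, the 0.8 cutoff, and the function names are assumptions made for illustration.

```python
import difflib
import unicodedata

# Hypothetical reference list of common names, stored in normalized form.
KNOWN_NAMES = ["jean", "michael", "marie", "dupont", "martin"]

def normalize(name):
    """Lowercase, trim, and strip accents so variants compare equal."""
    decomposed = unicodedata.normalize("NFKD", name.strip().lower())
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def suggest_correction(name, cutoff=0.8):
    """Return the closest known name if the input looks misspelled, else None."""
    key = normalize(name)
    if key in KNOWN_NAMES:
        return None  # exact match: not a misspelling
    matches = difflib.get_close_matches(key, KNOWN_NAMES, n=1, cutoff=cutoff)
    return matches[0] if matches else None

def is_flagged(first_name, last_name):
    """Flag the record if either name is likely misspelled."""
    return (suggest_correction(first_name) is not None
            or suggest_correction(last_name) is not None)
```

Here "Micheal" is close enough to "michael" to be flagged, while "Jean" and its accented variant "Jéan" both normalize to a known name and are not flagged. A name with no close match returns None, because an unknown name is not necessarily a misspelling.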
Timeliness
Timeliness relates to the date of the last assessment.