What are the Data Quality dimensions for Contact Name?

Learn how Delpha evaluates Name fields using six data quality dimensions: completeness, validity, uniqueness, consistency, accuracy, and timeliness.

Name analysis relies on the standard Contact.Name field.

Understanding Delpha’s Data Quality Dimensions for Name

Delpha evaluates each field using key data quality dimensions. Below is how each dimension applies to the Name field, to help users interpret scores and drive cleanup efforts.

Data Quality Dimensions Explained

Completeness

Question: Are both first and last names present?

Computation: 1 if both are present, 0 if either is missing or empty.
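The completeness rule above can be sketched in a few lines of Python. This is only an illustration, not Delpha's implementation: the function name `completeness` is hypothetical, and treating whitespace-only values as empty is an assumption.

```python
def completeness(first_name, last_name):
    """Return 1 if both names are present and non-empty, otherwise 0."""

    def present(value):
        # Assumption: None and whitespace-only strings count as missing.
        return value is not None and value.strip() != ""

    return 1 if present(first_name) and present(last_name) else 0
```

For instance, `completeness("Jean", "")` returns 0 because the last name is empty.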

Validity

Question: Do both names conform to valid character and pattern rules?

Computation: 1 if both are valid, 0 if either is invalid.

Example:

  • Input: Jean → Valid

  • Input: J3an! → Invalid
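A minimal sketch of a validity check is shown below. The exact character and pattern rules Delpha applies are not listed here, so the pattern is an assumption: letters (including accented ones), optionally separated by a single space, hyphen, or apostrophe. The names `NAME_PATTERN`, `is_valid_name`, and `validity` are hypothetical.

```python
import re

# Assumed rule: one or more groups of letters, separated by a single
# space, hyphen, or apostrophe. [^\W\d_] matches any Unicode letter.
NAME_PATTERN = re.compile(r"[^\W\d_]+(?:[ '\-][^\W\d_]+)*")


def is_valid_name(name):
    """Return True if the name matches the assumed pattern."""
    return isinstance(name, str) and NAME_PATTERN.fullmatch(name) is not None


def validity(first_name, last_name):
    """Return 1 if both names are valid, 0 if either is invalid."""
    return 1 if is_valid_name(first_name) and is_valid_name(last_name) else 0
```

Under this assumed pattern, `is_valid_name("Jean")` is True and `is_valid_name("J3an!")` is False, which matches the examples above.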

Uniqueness

Not applicable (NA) to the Name field.

Consistency

Question: Do the original and normalized names match?

Computation: 1 if both first and last names are unchanged after normalization, 0 otherwise.

Example:

  • Input: Jean → Normalized: Jean → Consistency: 1

  • Input: Jéan → Normalized: Jean (if accents removed) → Consistency: 0
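The consistency check can be sketched as follows. The normalization rules Delpha uses are not described here, so this example assumes that normalization trims surrounding whitespace and removes accents, in line with the second example above. The names `normalize` and `consistency` are hypothetical.

```python
import unicodedata


def normalize(name):
    """Assumed normalization: trim whitespace and strip accents."""
    decomposed = unicodedata.normalize("NFKD", name.strip())
    # Drop the combining marks (accents) left after decomposition.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def consistency(first_name, last_name):
    """Return 1 if both names are unchanged by normalization, else 0."""
    unchanged = (
        normalize(first_name) == first_name
        and normalize(last_name) == last_name
    )
    return 1 if unchanged else 0
```

With this assumed normalization, a record whose first name is Jéan scores 0 because the normalized form Jean differs from the original.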

Accuracy

Question: How likely is the name to be correct (not reversed or misspelled)?

  • Reversed Names Detection

    • Purpose: Detects if the first and last names are likely swapped.

    • Method:

      • Uses Bayesian/probabilistic scoring based on the frequency of each name as a first or last name.

      • If the reversed score exceeds a threshold (e.g., 0.7), the names are considered reversed.

    • Edge Cases:

      • Handles ambiguous names (common as both first and last names) with probabilistic logic.

  • Misspelled Names Detection

    • Purpose: Identifies likely misspellings in either name.

    • Method:

      • Uses phonetic algorithms (e.g., Match Rating Codex, NYSIIS, Beider-Morse) and string comparison against a database of common names.

      • If either name is likely misspelled, the record is flagged.

    • Edge Cases:

      • Handles accented, Unicode, and strongly normalized forms for robust detection.
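The two accuracy checks can be illustrated with a simplified sketch. Everything in it is an assumption made for illustration: the tiny `NAME_COUNTS` table stands in for a real name-frequency database, the scoring uses a basic Bayesian comparison with equal priors and add-one smoothing, and the misspelling check uses plain string similarity from Python's `difflib` instead of the phonetic algorithms named above. Only the 0.7 threshold is taken from the text, where it is given as an example value.

```python
from difflib import get_close_matches

# Hypothetical reference data: name -> (count as first name, count as last name).
NAME_COUNTS = {
    "jean": (9000, 200),
    "marie": (8000, 100),
    "dupont": (10, 5000),
    "martin": (3000, 7000),
}
REVERSED_THRESHOLD = 0.7   # example threshold quoted in the text
SIMILARITY_CUTOFF = 0.85   # assumed cutoff for the string comparison


def p_first(name):
    """Smoothed probability that `name` is used as a first name."""
    first_count, last_count = NAME_COUNTS.get(name.lower(), (0, 0))
    # Add-one smoothing: unknown names get a neutral 0.5.
    return (first_count + 1) / (first_count + last_count + 2)


def reversed_score(first_name, last_name):
    """Posterior probability that the names are swapped (equal priors)."""
    as_given = p_first(first_name) * (1 - p_first(last_name))
    swapped = (1 - p_first(first_name)) * p_first(last_name)
    return swapped / (as_given + swapped)


def is_reversed(first_name, last_name):
    """Flag the record when the reversed score exceeds the threshold."""
    return reversed_score(first_name, last_name) > REVERSED_THRESHOLD


def is_likely_misspelled(name):
    """Flag a name that is unknown but very close to a known name."""
    key = name.lower()
    if key in NAME_COUNTS:
        return False
    matches = get_close_matches(
        key, list(NAME_COUNTS), n=1, cutoff=SIMILARITY_CUTOFF
    )
    return bool(matches)
```

With this sample data, `is_reversed("Dupont", "Jean")` is True and `is_reversed("Jean", "Dupont")` is False. A pair of names missing from the table gets a neutral score of 0.5 and is not flagged, which is one simple way to treat ambiguous or unknown names.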


Timeliness

Timeliness is related to the date the record was last assessed.
