Build 3.70.0

Multiple settings for each handled object

Duplicate Settings now support multiple configurations. You can define a dedicated scope for each configuration using advanced rules in the Record Filter section.

Each configuration runs its own full duplicate assessment, and the results appear separately in the Duplicate Data Steward View.

To create a new Duplicate Setting, click Add New, choose the target object, and decide whether to start from the Delpha default template for detection and merge settings. You can also duplicate an existing configuration using the Clone option.

Detection Settings

General Settings

| Setting | Status/Value | Explanation |
| --- | --- | --- |
| Activate Duplicate Detection | Active (Toggle On) | The master switch. When it is on, duplicate detection is enabled and runs based on the configured rules. |
| Run Evaluation | Evaluate (Button) | Manually triggers the duplicate detection process. When clicked, the system scans the records and flags potential duplicates according to the rules and threshold. |
| Detection Threshold | 50 | The minimum match score for a pair of records to be flagged as potential duplicates. A pair must reach at least this score (on a scale whose maximum is 100) to be surfaced for review. A lower number flags more potential duplicates. |
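To illustrate how the Detection Threshold behaves, here is a minimal sketch. The function name, the record IDs, and the scores are hypothetical, and the inclusive comparison (a score equal to the threshold is flagged) is an assumption based on the phrase "minimum match score".

```python
# Illustrative only: pair IDs and scores are made up for this sketch.
DETECTION_THRESHOLD = 50

def pairs_to_review(scored_pairs, threshold=DETECTION_THRESHOLD):
    """Return the pairs whose score meets or exceeds the threshold (assumed inclusive)."""
    return [pair for pair in scored_pairs if pair["score"] >= threshold]

scored_pairs = [
    {"pair": ("A-001", "A-002"), "score": 82},
    {"pair": ("A-003", "A-004"), "score": 50},
    {"pair": ("A-005", "A-006"), "score": 35},
]

flagged = pairs_to_review(scored_pairs)
print([p["pair"] for p in flagged])
```

Lowering the threshold to 30 in this sketch would surface all three pairs, which matches the note that a lower number flags more potential duplicates.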

Record Filter

The goal of the Record Filter is to limit the scope of the duplicate scan, ensuring the system only checks records that meet specific criteria you define.

1. Filter Logic

  • Filter Logic: This area allows you to combine multiple individual conditions using boolean operators (AND, OR, NOT) to create complex filtering rules.

  • e.g., 1 AND (2 OR 3): This example illustrates how the logic works. It means:

    • Condition 1 must be true, AND

    • Either Condition 2 OR Condition 3 must be true.

    • The numbers (1, 2, 3) refer to the individual conditions defined below.
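The way a logic string such as 1 AND (2 OR 3) combines numbered conditions can be sketched with a small evaluator. This is not the product's implementation: the function name is hypothetical, and the operator precedence used here (NOT before AND before OR) is an assumption.

```python
import re

def evaluate_filter_logic(expression, results):
    """Evaluate a logic string such as "1 AND (2 OR 3)".

    `results` maps each condition number to True or False.
    Assumed precedence: NOT binds tightest, then AND, then OR.
    """
    tokens = re.findall(r"\d+|AND|OR|NOT|\(|\)", expression.upper())
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def take():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def parse_or():
        value = parse_and()
        while peek() == "OR":
            take()
            right = parse_and()  # parse first so tokens are always consumed
            value = value or right
        return value

    def parse_and():
        value = parse_not()
        while peek() == "AND":
            take()
            right = parse_not()
            value = value and right
        return value

    def parse_not():
        if peek() == "NOT":
            take()
            return not parse_not()
        return parse_atom()

    def parse_atom():
        token = take()
        if token == "(":
            value = parse_or()
            if take() != ")":
                raise ValueError("missing closing parenthesis")
            return value
        return results[int(token)]

    return parse_or()

# Condition 1 is true, condition 2 is false, condition 3 is true.
print(evaluate_filter_logic("1 AND (2 OR 3)", {1: True, 2: False, 3: True}))
```

With these inputs the record passes the filter, because Condition 1 holds and at least one of Conditions 2 and 3 holds.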

2. Adding Conditions

  • Search Input Field: Use this to find and select a field name from your database (e.g., Status, Creation Date, Region).

  • Add Condition Button: Once you have selected a field, click this button to add a new condition row.

  • Condition Rows: Each row typically lets you:

    1. Select a Field (e.g., Record Status).

    2. Select an Operator (e.g., Equals, Is Not Null, Is Greater Than).

    3. Enter a Value (e.g., Active).

Example Use Case:

You might set up a filter to only check records where:

  1. Status Equals Customer

  2. AND Last Modified Date Is Greater Than 90 days ago

This filter ensures the duplicate detection process only runs on customer records that were modified within the last 90 days, saving processing time.
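The example filter above can be sketched as a simple predicate. The record layout, field names, and sample dates are hypothetical. The reference date is passed in explicitly so the result does not depend on when the code runs.

```python
from datetime import date, timedelta

def in_scope(record, today):
    """Apply the example filter: condition 1 AND condition 2."""
    cutoff = today - timedelta(days=90)
    is_customer = record["status"] == "Customer"          # condition 1
    recently_modified = record["last_modified"] > cutoff  # condition 2
    return is_customer and recently_modified

today = date(2024, 6, 30)  # cutoff is therefore 2024-04-01
records = [
    {"id": 1, "status": "Customer", "last_modified": date(2024, 6, 1)},
    {"id": 2, "status": "Customer", "last_modified": date(2024, 1, 15)},
    {"id": 3, "status": "Prospect", "last_modified": date(2024, 6, 10)},
]

scanned_ids = [r["id"] for r in records if in_scope(r, today)]
print(scanned_ids)
```

Only the first record is scanned here: the second is too old, and the third is not a customer.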

Filtering Rules

Filtering Rules are applied to pairs of potential duplicates before the pairs are reviewed and merged.

The rules are divided into different Rule Types that trigger specific actions:

| Rule Type | General Purpose | Action Taken |
| --- | --- | --- |
| EXACT MATCH | Automatically flags a pair as a certain duplicate when the records match exactly on high-certainty identifiers. | If the specified conditions are met, the system treats the pair as perfect duplicates: the score is set to the maximum (100), the Status is set to Auto Yes, and a specified comment is added. This can prepare the pair for automatic or priority merging. |
| DISCARD | Prevents a pair of records from being considered duplicates if they match a specific rule. | If the values in the specified field(s) match, the system discards the pair and sets the Status to Auto No. The pair is not proposed to the Delpha User and can only be processed (accepted or rejected) by the Data Steward. |
| KEEP | Automatically flags a pair by adding a comment if the records match a specific rule. | When a pair of records meets the criteria of a KEEP rule, the system retains the pair in the set of potential duplicates and adds a specific comment. |
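The effect of each rule type on a pair can be summarized in a short sketch. The dictionary layout, the function name, and the "To Review" starting status are hypothetical. Only the resulting score, Status, and comment behavior come from the descriptions above.

```python
def apply_filtering_rule(pair, rule_type, comment=""):
    """Return a copy of `pair` updated according to the matched rule type."""
    updated = dict(pair)
    if rule_type == "EXACT MATCH":
        updated["score"] = 100          # maximum score
        updated["status"] = "Auto Yes"
        updated["comment"] = comment
    elif rule_type == "DISCARD":
        updated["status"] = "Auto No"   # left for the Data Steward only
    elif rule_type == "KEEP":
        updated["comment"] = comment    # pair stays among potential duplicates
    else:
        raise ValueError(f"unknown rule type: {rule_type}")
    return updated

pair = {"score": 64, "status": "To Review", "comment": ""}  # hypothetical initial state

print(apply_filtering_rule(pair, "EXACT MATCH", "Same tax ID"))
print(apply_filtering_rule(pair, "DISCARD"))
print(apply_filtering_rule(pair, "KEEP", "Same parent account"))
```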

Matching Fields

The Matching Fields configuration is the core mechanism the application uses to calculate the numerical Duplicate Score for any pair of records. This section defines which fields matter, and how much each one matters, when determining whether two records are duplicates.

These fields are involved in the score calculation.
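The actual scoring formula is not described here. As a purely illustrative sketch of how per-field importance could translate into a score, the example below uses a weighted share of exactly matching fields. The weights, the field names, and the exact-equality comparison are all assumptions.

```python
def duplicate_score(record_a, record_b, weights):
    """Return a 0-100 score: the weighted share of fields with identical values.

    Illustration only. Real matching would likely use fuzzy comparison
    rather than strict equality.
    """
    total = sum(weights.values())
    matched = sum(
        weight
        for field, weight in weights.items()
        if record_a.get(field) is not None
        and record_a.get(field) == record_b.get(field)
    )
    return round(100 * matched / total)

weights = {"name": 50, "email": 30, "city": 20}  # hypothetical field weights
a = {"name": "Acme Corp", "email": "info@acme.example", "city": "Paris"}
b = {"name": "Acme Corp", "email": "sales@acme.example", "city": "Paris"}

score = duplicate_score(a, b, weights)
print(score)
```

In this sketch the pair matches on name and city but not on email, so it earns 70 of the 100 available points.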

Advanced Settings

The Advanced Settings for duplicate detection provide granular control over how the system initially screens for duplicates, cleans data for scoring, and manages duplicate creation in real time.

1. Screening Fields

The purpose of Screening Fields is to quickly filter the large pool of records down to a smaller, more relevant subset before the full, weighted matching score calculation (from the "Matching Fields" section) is performed. This significantly improves performance. The system first performs a fast initial match that relies on Name. If the name match is inconclusive, it uses the Screening Fields (e.g., ZIP Code, First Letter of Company Name) to narrow down which records should proceed to the full scoring calculation.
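The idea of narrowing candidates before full scoring can be sketched by grouping records on a screening key, so that only records sharing the key are compared. This shows the grouping step only, not the initial name match, and the field names and function name are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def candidate_pairs(records):
    """Pair up only the records that share ZIP code and first letter of name."""
    buckets = defaultdict(list)
    for record in records:
        key = (record["zip"], record["name"][:1].upper())
        buckets[key].append(record["id"])
    pairs = []
    for ids in buckets.values():
        pairs.extend(combinations(ids, 2))
    return pairs

records = [
    {"id": 1, "name": "Acme Corp", "zip": "75001"},
    {"id": 2, "name": "ACME Corporation", "zip": "75001"},
    {"id": 3, "name": "Bolt Ltd", "zip": "75001"},
    {"id": 4, "name": "Apex SAS", "zip": "69002"},
]

pairs = candidate_pairs(records)
print(pairs)
```

Four records would give six pairs if every record were compared with every other. With the screening key, only one pair goes on to full scoring.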

2. Discard Placeholder Values

The purpose of this setting is to prevent "dummy" or default values in your records from artificially inflating or skewing the duplicate match score.

You can enter common placeholders that should be ignored into the input field or directly select them from the proposed values.
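A minimal sketch of why placeholders matter: if two records both contain "N/A" in a field, a naive comparison would count that as a match. The placeholder list and the function names below are hypothetical.

```python
PLACEHOLDERS = {"n/a", "unknown", "test", "none", "-"}  # hypothetical list

def clean_value(value, placeholders=PLACEHOLDERS):
    """Return None for empty or placeholder values so they are ignored."""
    if value is None:
        return None
    text = str(value).strip()
    if not text or text.lower() in placeholders:
        return None
    return text

def fields_match(a, b):
    """Two values match only if both are real (non-placeholder) and equal."""
    left, right = clean_value(a), clean_value(b)
    if left is None or right is None:
        return False
    return left.lower() == right.lower()

print(fields_match("N/A", "n/a"))     # placeholders on both sides
print(fields_match("Acme", "acme "))  # real values, same after cleanup
```

The first comparison does not count as a match even though the raw values are equal, while the second does.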

3. Duplicates at Creation

This setting defines the system's immediate response when a user attempts to create a new record that matches an existing record. It is a crucial real-time defense against data decay.

Choose an option from the dropdown menu to determine what action the system takes:

| Option | System Action | Outcome |
| --- | --- | --- |
| Allow (Async Detection) | (Default/Least Restrictive) The record is created immediately. | The duplicate check runs after creation, and the record is flagged for later review. |
| Block (Prevent Creation) | (Most Restrictive) The system performs a real-time check. If an exact match is found, it stops the creation process and forces the user to resolve the conflict (e.g., update the existing record). | Creation is prevented, ensuring no new duplicates enter the system. |
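The difference between the two options can be sketched as follows. This is an illustration under stated assumptions, not the product's logic: the function names, the "allow"/"block" mode strings, the review queue, and the choice of name and email as the exact-match fields are all hypothetical.

```python
def find_exact_match(new_record, existing_records, key_fields=("name", "email")):
    """Return the first existing record equal on all key fields, or None."""
    for record in existing_records:
        if all(record.get(f) == new_record.get(f) for f in key_fields):
            return record
    return None

def create_record(new_record, existing_records, mode, review_queue):
    """Return (created, conflict) for the chosen mode."""
    if mode == "block":
        conflict = find_exact_match(new_record, existing_records)
        if conflict is not None:
            return False, conflict           # creation is stopped
        existing_records.append(new_record)
        return True, None
    if mode == "allow":
        existing_records.append(new_record)  # created immediately
        review_queue.append(new_record)      # duplicate check happens later
        return True, None
    raise ValueError(f"unknown mode: {mode}")

existing = [{"name": "Acme", "email": "info@acme.example"}]
queue = []
duplicate = {"name": "Acme", "email": "info@acme.example"}

blocked = create_record(dict(duplicate), existing, "block", queue)
allowed = create_record(dict(duplicate), existing, "allow", queue)
print(blocked[0], allowed[0], len(existing), len(queue))
```

In block mode the duplicate is rejected and the conflicting record is returned so the user can resolve it. In allow mode the record is saved and queued for a later check.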
