Home > SSIS > SSIS Remove Duplicate Rows Using Fuzzy Grouping (SSIS Transformations)

SSIS Remove Duplicate Rows Using Fuzzy Grouping (SSIS Transformations)

Source data may have the duplicate rows which needs to be removed as part of data cleansing task.

Fuzzy Grouping is one of the transformation in Data Flow Transformations which can be used to group the similar rows in the source stream line. This transformation performs data cleaning tasks by identifying rows of data that are likely to be duplicate.

Lets generate some of duplicate records as shown in below screen:-

Source Query

Create one DataFlow task inside which Drag OLE DB Source component and write the query shown above which will become as source data which has duplicate records.

OLEDBSource

Drag the Fuzzy Group component and open the editor.

Click here to read more https://knowmsbi.blogspot.com/2020/04/ssis-remove-duplicate-rows-using-fuzzy.html

Thanks
Prasad

Learn Different BI Tools
Subscribe my YouTube Channel : BI Tools

Categories: SSIS Tags: ,
  1. Jay
    March 9, 2013 at 2:23 am

    Hi! This is one of the best explanations of how to use th eFuzzy Grouping so far! Thanks!

  2. Ram
    March 11, 2013 at 3:37 pm

    •Pass Through : When this property selected against the column that column will not take part in fuzzy grouping algorithm.

    Other Properties Like

    •Input Column: Columns listed which are taking place in fuzzy grouping.

    Here , the Pass Through value ‘name’ column is selected and column will not take in fuzzy grouping algorithm. But , the same column ‘name’ shows in other properties section . As per the above description , this column will take place in fuzzy grouping . This is conflicting above statement .

    Please explain

  3. Maruthi
    June 26, 2017 at 12:31 pm

    Thanks For explanation

  1. October 8, 2012 at 9:35 am

Leave a comment