Reputation: 339
I have little knowledge of data transformation in SSIS and am basically learning it all by myself.
I have learned some of the transformations and am now looking into fuzzy logic.
I am stuck on Fuzzy Grouping and Fuzzy Lookup in SSIS.
I cannot figure out how to use them; a Google search turned up some results, but they are beyond my current ability.
Could anyone please suggest a step-by-step tutorial for implementing them?
It would be great if the example contained diagrams so that I can learn easily.
Also, in which cases should I use them (I mean a real-world scenario)?
Thanks in advance.
Upvotes: 1
Views: 6988
Reputation: 2221
Here is a good starting point for understanding what the Fuzzy Lookup component does (Fuzzy Grouping is similar): SSIS fuzzy lookup
I actually used this at a client where I was receiving their client data that had been fat-fingered by someone. I created a static lookup table based on company names:
Lookup Table (notice that both columns are the same at the beginning)
Name | Lookup Output Name
Microsoft | Microsoft
JP Morgan Chase | JP Morgan Chase
McDonalds | McDonalds
I would receive data in a text file that looked like this:
Typed Name
Microsft
JP Morgan
McDons
Using the Fuzzy Lookup, I would join on the Name column (don't forget that this is case sensitive, so use UPPER or LOWER to normalize the case) to get the Lookup Output Name. I set the similarity threshold to about 80% (80% or higher is recommended). I would then view my matches in the data viewer, which might look like this:
Typed Name | Lookup Name | Confidence | Similarity
Microsoft | Microsoft | 100% | 100%
JP Morgan | JP Morgan Chase | 88% | 90%
McDons | McDonalds | 60% | 50%
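To get a feel for what a similarity score means outside of SSIS, here is a minimal Python sketch. It is only an illustration: it uses `difflib.SequenceMatcher` from the standard library as a stand-in, which is not the algorithm SSIS uses, so the percentages it prints will not match the ones in the table above. The function name `best_match` and the `LOOKUP_NAMES` list are made up for this example.

```python
from difflib import SequenceMatcher

# Reference names taken from the lookup table in the example above.
LOOKUP_NAMES = ["Microsoft", "JP Morgan Chase", "McDonalds"]


def best_match(typed_name, candidates=LOOKUP_NAMES):
    """Return (candidate, similarity) for the closest candidate.

    Similarity is difflib's ratio in [0, 1]; it is NOT the score SSIS computes.
    Both sides are upper-cased first so the comparison ignores case,
    mirroring the UPPER/LOWER advice for the join column.
    """
    scored = [
        (name, SequenceMatcher(None, typed_name.upper(), name.upper()).ratio())
        for name in candidates
    ]
    return max(scored, key=lambda pair: pair[1])


for typed in ["Microsft", "JP Morgan", "McDons"]:
    name, score = best_match(typed)
    print(f"{typed} -> {name} ({score:.0%})")
```

Each typed name is paired with its closest reference name, and the score tells you how far to trust that pairing.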
Then, based on a Conditional Split, I loaded the rows with both a confidence and a similarity greater than 80% and less than 100% into the lookup table and loaded the others into an error table. An email was then sent if the count in the error table was greater than one. So the resulting lookup table would be something like this:
Lookup Table
Name | Lookup Output Name
Microsoft | Microsoft
JP Morgan Chase | JP Morgan Chase
McDonalds | McDonalds
JP Morgan | JP Morgan Chase
Error Table
Name | Proposed Name | Error message
McDons | McDonalds | Confidence was 60% and Similarity was 50%
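The routing rule of the Conditional Split can be sketched in plain Python as below. This is just an illustration of the logic, not SSIS expression syntax. The function name `route` is hypothetical, and skipping rows that score exactly 100% on both measures is my assumption, based on the exact match not showing up as a new row in either result table.

```python
def route(rows, threshold=80):
    """Split fuzzy-match rows into new lookup rows and error rows.

    Each row is (typed_name, proposed_name, confidence, similarity),
    with both scores given as percentages.
    """
    new_lookup_rows, error_rows = [], []
    for typed, proposed, confidence, similarity in rows:
        # Assumption: an exact match is already in the lookup table, so skip it.
        if confidence == 100 and similarity == 100:
            continue
        # Both scores must be above the threshold and below 100%.
        if threshold < confidence < 100 and threshold < similarity < 100:
            new_lookup_rows.append((typed, proposed))
        else:
            message = (
                f"Confidence was {confidence}% and Similarity was {similarity}%"
            )
            error_rows.append((typed, proposed, message))
    return new_lookup_rows, error_rows


# The three rows shown in the data viewer example.
matches = [
    ("Microsoft", "Microsoft", 100, 100),
    ("JP Morgan", "JP Morgan Chase", 88, 90),
    ("McDons", "McDonalds", 60, 50),
]
new_rows, errors = route(matches)
print(new_rows)
print(errors)
```

With the sample rows, only the JP Morgan row becomes a new lookup entry and only the McDons row lands in the error list, which lines up with the two result tables.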
Hope this helped.
Upvotes: 3