sql - Combine customer primary keys -
i have 4 systems gathered in 1 dwh , has come joining different customer primary keys. 1 actual customer may present in 4 systems or 1 of them. see preferable solution doing business logic lookup? links week regular joins work small part of customers. there ssis work flow can use or how approach this?
i work large health insurance company , had similar project. had several source systems came different business units contained member information. have large member population , members may in multiple systems. data in systems so in terms of quality: each system may or may not have valid ssns, medicare ids, current and/or spelled address information, valid birth dates, etc. our goal create unique identifier pair source system pks uniquely identify member across many systems.
the result more of process single piece of code or ssis task. our process keeps track of every member ever comes through our system, snapshot of primary fields identify them (name, address, ssn, medicare id, etc). on each build, add new members haven't come our system before update history table. go through compares , try find out members match. attempt full matches (firstname = firstname, lastname = lastname, ssn = ssn) possible, if matches fail, have use partial matching system. typically try match identifier, such ssn or medicare id, parts of name , address. if don't have valid ssn or medicare id, end using partial matches on fields have available us.
the process bit of trial , error. after our first release, users found sorts of instances same member had different ids (our matching logic had missed valid match), or members not same had same unique identifier (our match logic produced bad match). after each release, typically write code based on errors find , typically fixes of them. point we've matched reliably can.
in case, i'm guessing don't have government issued ids or ssns available you. i'd try straight join
on info have (name, shipping address, billing address, email, credit card numbers, etc). after that, move partial matches. use left 4 characters of fields have in cases can't straight join
. after that, depending on quality of data, may have leg work rest of logic down. i'd try @ customers couldn't find match , see if of them have same email, or similar addresses, or similar names. if so, compare these members , see causing match not work.
edit: if data continuing updated, keep in mind you'll have compare new records identified customers see if happen match customer in system.
best of luck you. our project bit of headache , still isn't done, it's in shape , continue improve each run do.
Comments
Post a Comment