Merge Near Duplicate Access Files

Question

I'm working with an old Access database (.mdb) that has been duplicated by a malfunction in OneDrive. I have users adding records (one at a time) to the database both on a laptop and a desktop. On occasion the laptop user is trying to save the database to OneDrive while they aren't online, then the desktop user opens the old version of the database and starts making changes. The end result is OneDrive creates two versions of the .mdb file: the original and a new version from the laptop. I'm trying to merge the two versions back into one assuming the users are both adding data (and not deleting it).

Would the best approach be to import the laptop database into the original and run a merge query for each table and then de-duplicate it? Or is there a better way?

ASH · Accepted Answer

Obviously, near duplicates and duplicates are two totally different things. Finding, and eliminating dupes is pretty easy. Finding near dupes is much harder! SQL has a 'Like' statement, but it's not going to find differences such as 'Microsoft' and 'Micorsoft'. Take a look at the link below for one possible solution.

http://www.accessmvp.com/TomVanStiphout/Simil.htm

If that doesn't work for you, you'll need to consider an alternative, and perhaps Python could do it for you. If you have only a few tables with 'issues' you could export the affected tables, clean them using another technology, like Python, and then import the 'cleaned' data sets. See the link below for an some ideas.

https://bergvca.github.io/2017/10/14/super-fast-string-matching.html

Also, R has some capabilities for finding near matches, and like Python, R is totally free. See the link below for some additional ideas of how to proceed.

https://github.com/ColinFay/tidystringdist

I'm not sure Access the the best tool for this kind of thing, but with an open mind, and a little effort, you can definitely do what you want to do.

Merge Near Duplicate Access Files

Answers (1)

Related Questions