Reputation: 1541
I have two CSV files, A and B; A is the master repository. I need to read both files, map the records of B to A, and save the mapped records to another file. The class that holds a record is, say, Record, and the class that holds a matched pair is, say, RecordMatch.
    class Record
    {
        string Id;
        string Name;
        string Address;
        string City;
        string State;
        string Zipcode;
    }
    class RecordMatch
    {
        string Aid;
        string AName;
        string Bid;
        string BName;
        double NameMatchPercent;
    }
The mapping scenario goes like this: for each record of B, the records of A are first filtered by state, city and then zipcode. The records of A that survive the filter are then compared with the record of B. The comparison is on the name field only, and is a best-match comparison using a fuzzy string algorithm.
The string matching algorithm gives a percentage of match, so the best result out of all the matches has to be selected and saved.
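The filter-then-best-match flow described above could be sketched roughly as follows. This is only an illustration under assumptions: `Rec` is a hypothetical cut-down stand-in for `Record`, the filter is assumed to require exact equality on state, city and zipcode, and a Levenshtein-based similarity is swapped in for the fuzzy string algorithm, which is not specified here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

// Hypothetical stand-in for Record, reduced to the fields this sketch needs.
final class Rec {
    final String id, name, city, state, zipcode;

    Rec(String id, String name, String city, String state, String zipcode) {
        this.id = id;
        this.name = name;
        this.city = city;
        this.state = state;
        this.zipcode = zipcode;
    }
}

final class BestMatchSketch {
    // Levenshtein edit distance, computed with two rolling rows.
    static int editDistance(String s, String t) {
        int[] prev = new int[t.length() + 1];
        int[] curr = new int[t.length() + 1];
        for (int j = 0; j <= t.length(); j++) {
            prev[j] = j;
        }
        for (int i = 1; i <= s.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= t.length(); j++) {
                int cost = s.charAt(i - 1) == t.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(Math.min(curr[j - 1] + 1, prev[j] + 1), prev[j - 1] + cost);
            }
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[t.length()];
    }

    // Assumed similarity measure: 100 * (1 - distance / longer length), case-insensitive.
    static double similarityPercent(String a, String b) {
        String x = a.toLowerCase();
        String y = b.toLowerCase();
        int longest = Math.max(x.length(), y.length());
        if (longest == 0) {
            return 100.0;
        }
        return 100.0 * (longest - editDistance(x, y)) / longest;
    }

    // Step 1: keep only the A records that share state, city and zipcode with b.
    static List<Rec> filter(List<Rec> masters, Rec b) {
        List<Rec> out = new ArrayList<>();
        for (Rec a : masters) {
            if (a.state.equals(b.state) && a.city.equals(b.city) && a.zipcode.equals(b.zipcode)) {
                out.add(a);
            }
        }
        return out;
    }

    // Step 2: among the filtered records, pick the one whose name scores highest.
    static Optional<Rec> bestMatch(List<Rec> masters, Rec b) {
        Rec best = null;
        double bestPercent = -1.0;
        for (Rec a : filter(masters, b)) {
            double percent = similarityPercent(a.name, b.name);
            if (percent > bestPercent) {
                bestPercent = percent;
                best = a;
            }
        }
        return Optional.ofNullable(best);
    }
}
```

An empty `Optional` means no record of A passed the filter, so nothing would be saved for that record of B.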
Now that I have tried my best to explain the scenario, I will come to the design issue. My initial design was to make a Mapper class, something like this:
    class Mapper
    {
        List<Record> ReadFromFile(File);
        List<Record> FilterData(FilterType);
        void Save(List<Record>);
        RecordMatch MatchRecord(Record A, Record B);
    }
But looking at the design, it simply seems to be a class wrapped around some methods. I don't see any OO design in it. I also felt that MatchRecord() belongs more to the Record class than to the Mapper class.
But on another look, the class seemed to implement something resembling the Repository pattern.
Another option I am considering is to keep the Mapper class and just move the matching method to the Record class as Match(), something like this:
    class Mapper
    {
        List<Record> ReadFromFile(File);
        List<Record> FilterData(FilterType);
        void Save(List<Record>);
    }
    class Record
    {
        string id;
        string name;
        string address;
        // other fields;
        public RecordMatch Match(Record record)
        {
            // Compares this record's name field with that of the passed Record.
            // Returns a RecordMatch specifying the percent of match.
        }
    }
Now I am totally confused in this simple scenario. What would ideally be a good OO design in this scenario?
Upvotes: 3
Views: 453
Reputation: 2046
I gave this a try. There's not so much you can do when it comes to OO principles or design patterns I think, except for maybe using composition for the MatchingAlgorithm (and perhaps Strategy and Template if needed). Here's what I've cooked up:
    class Mapper {
        void map(String fileA, String fileB, String fileC) {
            RecordsList a = new RecordsList(fileA);
            RecordsList b = new RecordsList(fileB);
            MatchingRecordsList c = new MatchingRecordsList();
            for (Record rb : b) {
                int highestPerc = -1;
                MatchingRecords matchingRec = null;
                // someAlgorithmYouVeDefined is a placeholder for your MatchingAlgorithm instance
                rb.setMatchingAlgorithm(someAlgorithmYouVeDefined);
                for (Record ra : a) {
                    int perc = rb.match(ra);
                    if (perc > highestPerc) {
                        // remember the best score so far; otherwise the last candidate always wins
                        highestPerc = perc;
                        matchingRec = new MatchingRecords(ra, rb, perc);
                    }
                }
                if (matchingRec != null) {
                    c.add(matchingRec);
                }
            }
            c.saveToFile(fileC);
        }
    }
    class MatchingAlgorithm {
        int match(Record b, Record a) {
            int result = 0; // percent of match, 0 until computed
            // do your magic
            return result;
        }
    }
    class Record {
        String Id;
        String Name;
        String Address;
        String City;
        String State;
        String Zipcode;
        MatchingAlgorithm alg;
        void setMatchingAlgorithm(MatchingAlgorithm alg) {
            this.alg = alg;
        }
        int match(Record r) {
            // percent of match, computed by delegating to the algorithm
            return alg.match(this, r);
        }
    }
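    // extends java.util.ArrayList so the List methods do not have to be implemented by hand
    class RecordsList extends java.util.ArrayList<Record> {
        RecordsList(String file) {
            // create the list by reading from the CSV file
        }
    }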
    class MatchingRecords {
        Record a;
        Record b;
        int matchingPerc;
        MatchingRecords(Record a, Record b, int perc) {
            this.a = a;
            this.b = b;
            this.matchingPerc = perc;
        }
    }
    class MatchingRecordsList {
        void add(MatchingRecords mr) {
            // add
        }
        void saveToFile(String file) {
            // save to file
        }
    }
(This is written in Notepad++ so there can be typos etc; also the proposed classes can surely benefit from a little more refactoring but I'll leave that to you if you choose to use this layout.)
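If you do go the Strategy route for the matching algorithm, a minimal sketch could look like the one below. All names here (`NameMatcher`, `ExactNameMatcher`, `PrefixNameMatcher`, `NameComparer`) are hypothetical and only for illustration, and the two scoring rules are deliberately trivial placeholders rather than real fuzzy matchers.

```java
import java.util.Locale;

// Hypothetical strategy interface: any name-comparison algorithm can implement it.
interface NameMatcher {
    int matchPercent(String nameA, String nameB);
}

// Trivial strategy: 100 for a case-insensitive exact match, otherwise 0.
final class ExactNameMatcher implements NameMatcher {
    @Override
    public int matchPercent(String nameA, String nameB) {
        return nameA.equalsIgnoreCase(nameB) ? 100 : 0;
    }
}

// Another placeholder strategy: shared leading characters as a percentage of the longer name.
final class PrefixNameMatcher implements NameMatcher {
    @Override
    public int matchPercent(String nameA, String nameB) {
        String x = nameA.toLowerCase(Locale.ROOT);
        String y = nameB.toLowerCase(Locale.ROOT);
        int longest = Math.max(x.length(), y.length());
        if (longest == 0) {
            return 100;
        }
        int shared = 0;
        while (shared < Math.min(x.length(), y.length()) && x.charAt(shared) == y.charAt(shared)) {
            shared++;
        }
        return 100 * shared / longest;
    }
}

// The comparer holds its strategy by composition, so algorithms can be swapped without touching it.
final class NameComparer {
    private final NameMatcher matcher;

    NameComparer(NameMatcher matcher) {
        this.matcher = matcher;
    }

    int compare(String nameA, String nameB) {
        return matcher.matchPercent(nameA, nameB);
    }
}
```

Swapping `new NameComparer(new ExactNameMatcher())` for `new NameComparer(new PrefixNameMatcher())` changes the scoring without changing any calling code, which is the point of composing the algorithm in.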
Upvotes: 1
Reputation: 1636
Amusingly enough, I am working on a project almost exactly like this right now.
Easy Answer: Ok, first off, it is not the end of the world if a method is in the wrong class for a while! If your classes are all covered with tests, where a function lives is still important, but it can be moved around fluidly as you, the king of your domain, see fit.
If you are not testing this, well, that would be my first suggestion. Many people smarter than me have remarked on how TDD and testing can help bring your classes to the best design naturally.
Longer Answer: Rather than looking for patterns to apply to a design, I like to think it through like this: what are the reasons each of your classes has to change? If you separate those reasons from each other (which is one thing TDD can help you do), then you will start to see design patterns naturally emerge from your code.
Here are some reasons to change I could think of in a few passes reading through your question:
Ok, so, if implementing any of those would make you add an "if statement" somewhere, then perhaps that is a seam for subclasses implementing a common interface.
Also, let's say you want to save the created file in a new place. That is one reason to change, and should not overlap with you needing to change your merging strategy. If those two parts are in the same class, that class now has two responsibilities, and that violates the single responsibility principle.
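To illustrate keeping those two reasons to change apart, here is a rough sketch. The names `MatchWriter`, `InMemoryWriter` and `MergeJob` are made up for this example, names stand in for full records, and the scoring function is passed in as a plain `BiFunction` so the merging strategy and the output destination can vary independently.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.BiFunction;

// Hypothetical output seam: where the results go is one reason to change.
interface MatchWriter {
    void write(List<String> lines);
}

// A writer that just collects lines in memory, handy in tests; a file writer would be another implementation.
final class InMemoryWriter implements MatchWriter {
    final List<String> written = new ArrayList<>();

    @Override
    public void write(List<String> lines) {
        written.addAll(lines);
    }
}

// The job only coordinates: how names are scored and where output is saved are both injected.
final class MergeJob {
    private final BiFunction<String, String, Integer> scorer;
    private final MatchWriter writer;

    MergeJob(BiFunction<String, String, Integer> scorer, MatchWriter writer) {
        this.scorer = scorer;
        this.writer = writer;
    }

    void run(List<String> namesA, List<String> namesB) {
        List<String> lines = new ArrayList<>();
        for (String b : namesB) {
            String best = null;
            int bestScore = -1;
            for (String a : namesA) {
                int score = scorer.apply(a, b);
                if (score > bestScore) {
                    bestScore = score;
                    best = a;
                }
            }
            if (best != null) {
                lines.add(best + "," + b + "," + bestScore);
            }
        }
        writer.write(lines);
    }
}
```

With this split, saving to a new place means writing another `MatchWriter`, and changing the merging strategy means passing a different scorer; neither change touches the other.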
So, that is a very brief example. To go further in depth with good OO design, check out the SOLID principles. You can't go wrong with learning those and seeking to apply them with prudence throughout your OO designs.
Upvotes: 4