Reputation: 121
I'm developing a program that scrapes the web for certain data and feeds it back to the database. The problem is that I don't want duplicate entries of the same data as soon as the crawlers run for a second time. If some attributes changed, but the majority of the data is still the same, I'd like to update the DB entry rather than simply adding a new one. I know how to do this in code, but I was wondering if this could be done better.
The way the update works right now:
// Calls several helper methods in turn to check whether the event already exists.
// Each helper returns the id of the matching event, or -1 if it finds none.
// On the first match the existing entry is updated; if nothing matches, a new event is inserted.
public static void check_event(Event event)
{
    int id = -1;
    id = check_exact_event(event); // Check if an event exists with the same title, location and time.
    if (id > 0)
    {
        update_event(event, id);
        Logger.log("EventID #" + id + " found using exact comparison");
        return;
    }
    id = check_similar_event_titles(event); // Check if an event exists with a different (but similar) title.
    if (id > 0)
    {
        update_event(event, id);
        Logger.log("EventID #" + id + " found using similar title comparison");
        return;
    }
    id = check_exact_image(event); // Check if an event exists with the exact same image.
    if (id > 0)
    {
        update_event(event, id);
        Logger.log("EventID #" + id + " found using image comparison");
        return;
    }
    // Otherwise insert a new event.
    create_new_event(event);
}
This works, but it's not very pleasing to the eye. What's the best way to go about this?
Upvotes: 3
Views: 89
Reputation: 8640
Personally, I can't see anything wrong with your code; it is clean and effective. If you really want to change it, you could do it in a single if statement:
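public static void check_event(Event event) {
    int id = -1;
    // Short-circuit evaluation stops at the first check that returns a positive id.
    if ((id = check_exact_event(event)) > 0
            || (id = check_similar_event_titles(event)) > 0
            || (id = check_exact_image(event)) > 0) {
        update_event(event, id);
    } else {
        // No check matched, so insert a new event.
        create_new_event(event);
    }
}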
But I can't see much gain in doing it this way, and it drops the log message that says which check found the match.
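If the repetition is what bothers you, another option is to keep the checks in an ordered collection and loop over them. Below is a minimal sketch of that idea, not a drop-in replacement: the `Event` class, the three stub checks, and the in-memory `log` list are hypothetical stand-ins so the example runs without your database or `Logger`, and the label strings are taken from your log messages.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.ToIntFunction;

class EventMatcherSketch {
    // Hypothetical stand-in for the question's Event class.
    static class Event {
        final String title;
        Event(String title) { this.title = title; }
    }

    // Records what would have been done, instead of touching a database.
    static final List<String> log = new ArrayList<>();

    // Stub checks (assumptions): each returns a matching id, or -1 for no match.
    static int checkExactEvent(Event e) {
        return e.title.equals("Concert") ? 7 : -1;
    }
    static int checkSimilarEventTitles(Event e) {
        return e.title.equalsIgnoreCase("concert") ? 7 : -1;
    }
    static int checkExactImage(Event e) {
        return -1;
    }

    // LinkedHashMap keeps insertion order, so checks run from strictest to loosest.
    static final Map<String, ToIntFunction<Event>> MATCHERS = new LinkedHashMap<>();
    static {
        MATCHERS.put("exact comparison", EventMatcherSketch::checkExactEvent);
        MATCHERS.put("similar title comparison", EventMatcherSketch::checkSimilarEventTitles);
        MATCHERS.put("image comparison", EventMatcherSketch::checkExactImage);
    }

    static void checkEvent(Event event) {
        for (Map.Entry<String, ToIntFunction<Event>> matcher : MATCHERS.entrySet()) {
            int id = matcher.getValue().applyAsInt(event);
            if (id > 0) {
                // The real code would call update_event(event, id) here.
                log.add("update #" + id + " found using " + matcher.getKey());
                return;
            }
        }
        // The real code would call create_new_event(event) here.
        log.add("create " + event.title);
    }

    public static void main(String[] args) {
        checkEvent(new Event("Concert"));
        checkEvent(new Event("concert"));
        checkEvent(new Event("Lecture"));
        log.forEach(System.out::println);
    }
}
```

Adding a fourth check then means adding one line to the map, and the log message stays tied to the check that matched. Whether that is worth the extra indirection for only three checks is a matter of taste.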
Upvotes: 3