Reputation: 110539
This is for a small scheduling app. I need an algorithm to efficiently compare two "schedules", find differences, and update only the data rows which have been changed, as well as entries in another table having this table as a foreign key. This is a big question, so I'll say right away I'm looking for either general advice or specific solutions.
EDIT: As suggested, I have significantly shortened the question.
In one table, I associate resources with a span of time when they are used.
I also have a second table (Table B) which uses the ID from Table A as a foreign key.
The entry from Table A corresponding to Table B will have a span of time which subsumes the span of time from Table B. Not all entries in Table A will have an entry in Table B.
I'm providing an interface for users to edit the resource schedule in Table A. They basically provide a new set of data for Table A that I need to treat as a diff from the version in the DB.
If they completely remove an object from Table A that is pointed to by Table B, I want to remove the entry from Table B as well.
So, given the following 3 sets:
I need an algorithm that will:
Just sorting the objects into an arrangement where I can apply the appropriate database operations is more than adequate for a solution.
Again, please answer as specifically or generally as you like, I'm looking for advice but if someone has a complete algorithm that would just make my day. :)
EDIT: In response to lassvek, I am providing some additional detail:
Table B's items are always contained entirely within Table A items, not merely overlapping.
Importantly, Table B's items are quantized so they should fall either entirely within or entirely outside. If this doesn't happen, then I have a data integrity error that I'll have to handle separately.
For example (to use a shorthand):
Table A ID Resource Start End 01 Resource A 10/6 7:00AM 10/6 11:00AM 02 Resource A 10/6 1:00PM 10/6 3:00PM Table B ID Table_A_ID Start End 01 02 10/6 1:00PM 10/6 2:00PM
So I want the following behaviours:
Upvotes: 2
Views: 525
Reputation: 391734
I have worked extensively with periods, but I'm afraid I don't understand entirely how table A and B work together, perhaps it's the word subsume that I don't understand.
Can you give some concrete examples of what you want done?
Do you mean that timespans recorded in table A contains entirely timespans in table B, like this?
|---------------- A -------------------|
|--- B ----| |--- B ---|
or overlaps with?
|---------------- A -------------------|
|--- B ----| |--- B ---|
or the opposite way, timespans in B contains/overlaps with A?
Let's say it's the first one, where timespans in B are inside/the same as the linked timespan in table A.
Does this mean that:
* A removed A-timespan removes all the linked timespans from B
* An added A-timespan, what about this?
* A shortened A-timespan removes all the linked timespans from B that now falls outside A
* A lenghtened A-timespan, will this include all matching B-timespans now inside?
Here's an example:
|-------------- A1 --------------| |-------- A2 --------------|
|---- B1 ----| |----- B2 ---| |---- B3 ----| |-- B4 --|
and then you lengthen A1 and shorten and move A2, so that:
|-------------- A1 ---------------------------------| |--- A2 --|
|---- B1 ----| |----- B2 ---| |---- B3 ----| |-- B4 --|
this means that you want to modify the data like this:
1. Lengthen (update) A1
2. Shorten and move (update) A2
3. Re-link (update) B3 from A2 to A1 instead
how about this modification, A1 is lengthened, but not enough to contain B3 entirely, and A2 is moved/shortened the same way:
|-------------- A1 -----------------------------| |--- A2 --|
|---- B1 ----| |----- B2 ---| |---- B3 ----| |-- B4 --|
Since B3 is now not entirely within either A1 or A2, remove it?
I need some concrete examples of what you want done.
Edit More questions
Ok, what about:
|------------------ A -----------------------|
|------- B1 -------| |------- B2 ------|
|---| <-- I want to remove this from A
What about this?
Either:
|------------------ A1 ----| |---- A2 -----|
|------- B1 -------| |B3| |--- B2 ---|
or:
|------------------ A1 ----| |---- A2 -----|
|------- B1 -------|
To summarize how I see it, with questions, so far:
I'll work on an implementation in C# that might work when I get home from work, I'll come back with more later tonight.
Edit Here's a stab at an algorithm.
You should create a massive set of unit tests and make sure you cover all combinations of modifications.
Upvotes: 7
Reputation: 20621
I suggest you decouple your questions into two separate ones: The first should be something like: "How do I reason about resource scheduling, when representing a schedule atom as a resource with start time and end time?" Here, ADept's suggestion to use interval algebra seems fitting. Please see The Wikipedia entry 'Interval Graph' and The SUNY algorithm repository entry on scheduling. The second question is a database question: "Given an algorithm which schedules intervals and indicate whether two intervals overlap or one is contained in another, how do I use this information to manage a database in the given schema?" I believe that once the scheduling algorithm is in place, the database question will be much easier to solve. HTH, Yuval
Upvotes: 2
Reputation: 6782
It seems to me like any algorithm for this will be involve a pass through NewA, matching ResourceID, StartTime, and EndTime, and keeping track of which elements from OldA get hit. Then you have two sets of non-matching data, UnmatchedNewA and UnmatchedOldA.
The simplest way I can think of to proceed is to basically start over with these: Write all of UnmatchedNewA to the DB, transfer elements of B from UnmatchedOldA into New A keys (just generated) where possible, deleting when not. Then wipe out all of UnmatchedOldA.
If there are a lot of changes, this is certainly not an efficient way to proceed. In cases where the size of the data is not overwhelming, though, I prefer simplicity to clever optimization.
It's impossible to know whether this final suggestion makes any sense without more background, but on the off chance that you didn't think of it this way:
Instead of passing the entire A collection back and forth, could you use event listeners or something similar to update the data model only where changes ARE needed? This way, the objects being altered would be able to determine which DB operations are required on the fly.
Upvotes: 1
Reputation: 29163
As I understand you, your users can only directly affect table A. Assuming you are programming in C#, you could use a simple ADO.Net DataSet to manage modifications to table A. The TableAdapter knows to leave untouched rows alone and to handle new, modified and deleted rows appropriately.
In addition you should define a cascading delete in order to automatically remove corresponding objects in table B.
The only case that is not handled this way is if a timespan in table A is shortened s.t. it does not subsume the corresponding record in Table B anymore. You could simply check for that case in an update stored procedure or alternatively define an update-trigger on table A.
Upvotes: 1
Reputation: 5552
You post is almost in the "too long; didnt read" category - shortening it will probably give you more feedback.
Anyway, on topic: you can try lookin into a thing called "Interval Algebra"
Upvotes: 1