Reputation: 81
I have an array of structs where they are sorted by ID and there are duplicate entries of that ID within the array. Each struct in the array has a number of points assosiated with it and I want to find the total number of points for each ID. I want to remove any duplicates and store their total points value in a single struct reducing the size of my array.
typedef struct boat_data {
int ID;
int time_to_complete_race; //This can be ignored
int points;
} boat_node;
typedef boat_node boat_ptr;
The current code that I have made doesn't seem to work as intended. tot_boats
is the number of boats and tot_members
is the number of members that have been found (by this I mean the number of non-duplicate ID's present). I have two array structs where the final_boat_scores
is of the size of the number of members present and I want to store the ID
value and the points
value
for(int boat = 0; boat < (total_boats - tot_members); boat++) {
for (int next_boat = 0; next_boat < (total_boats - tot_members); next_boat++) {
if (boat_scores[boat].ID == boat_scores[next_boat].ID) {
final_boat_scores[boat].ID = boat_scores[next_boat].ID;
final_boat_scores[boat].points += boat_scores[next_boat].points;
break;
}
}
}
Upvotes: 1
Views: 1421
Reputation: 1922
More and more data renders the sorting-and-removing duplicates increasingly unfeasible, (though this may take a while.) One is describing a set with equality determined by id
. It is a very common data structure; for example, in relational databases, the id
would be your key. Instead of de-duplicating every time, the set does not allow duplicates in the first place. A common implementation is a hash set implemented as a hash map from the keys, (in this case, ID
,) to a sentinel value that indicates the presence of the key, (any char
or int
will do.) A static set has a very good C
implementation in gperf, which creates a minimal perfect hash, but I believe you want to have dynamic content, (which would translate to allowing other competitors to join the club.)
Since one key is a number, it's fairly easy to create a hash function from the projection,
int hash(const struct boat_data *const b) {
return b->ID;
}
A lot of languages have support for hash maps in their standard library, (for example, JavaScript version of your question,) but C does not. However, one will find a lot of implementations. See a Quick Way to Implement Dictionary in C. Also, uthash, Android (uses void *
key,) Git, statsd hashmap (uses strings,) GHash, HMap.
If the ID
is bounded, (and within the range of computability,) it's simple to create a (not minimal) perfect hash function.
#include <stdlib.h> /* EXIT */
#include <stdio.h> /* printf */
static unsigned points_by_id[1000];
static size_t id_size = sizeof points_by_id / sizeof *points_by_id;
int main(void) {
size_t i;
/* First race between [45 36, 10]. */
points_by_id[45] += 45;
points_by_id[36] += 20;
points_by_id[10] += 100;
/* Second race between [10, 12, 45] */
points_by_id[10] += 31;
points_by_id[12] += 40;
points_by_id[45] += 30;
/* Print out. */
printf("Total stadings:\n");
for(i = 0; i < id_size; i++) {
if(points_by_id[i])
printf("%lu\t%u\n", (unsigned long)i, points_by_id[i]);
}
return EXIT_SUCCESS;
}
Upvotes: 1
Reputation: 347
Please let me know if you can change the array input. If yes, can't you just check ID every time you need to store a new element into the array? If ID matches with already stored element, simply let recordedPoint += point (that is, add the point to be stored directly into the total point recorded on the array). This way you won't create duplicate entries.
EDIT: since you can't change the input array, you can loop through the boat_score
array and the final_boat_score
array and check if the ID of the current boat have been recorded to the final_boat_score
array yet. If yes, then simply add it to the total score. I think the problem with your code is that you didn't loop through all the elements in your array, since your array size is definitely not total_boats - tot_members
. You also don't need that final_boat_scores[boat].ID = boat_scores[next_boat].ID;
line since it's redundant, your if statement only execute if that's true anyways.
your break;
statement also prematurely ends your loop, in this case you can't break out of the loop early because you don't really know how many entries you have with same ID, right?
//remember to initialize final_boat_score first with all IDs you have
for (int i = 0; i < final_boat_score_size; i++) {
//initialize the total point = 0 first
final_boat_score[i].points = 0;
//then loop through your input data
for (int j = 0; j < boat_score_size; i++) {
//if there exist an input element boat_score[j] with the same ID
//as the current final_boat_score[i] element, add its points to the total
if (final_boat_score[i].ID == boat_score[j].ID) {
final_boat_score[i].points += boat_score[j].points;
}
}
}
This won't delete the original array though so you will needs to delete it yourself if you don't need it anymore. I hope this help!
Upvotes: 3