Reputation: 15864
I have a list of items, say List<Item> listOfItems
and each item has a key, say String cluster_key
. I want to cluster all items with the same cluster_key
into a bucket, giving me the result List<Bucket> listOfBuckets
. The list of buckets starts off empty.
Any suggests on how to do this elegantly Java? Perhaps with hashing?
I can think of 2 implementations that are less than elegant:
listOfItems
, and for each item, traverse the list of buckets until we find a match. If we find a match, add the item to the bucket. Else, .
for item in listOfItems {
for bucket in listOfBuckets {
if item.getKey() equals bucket.getKey()
add item to bucket
else
create new bucket
add item to bucket
add bucket to listOfBuckets
}
}
.
sort listOfItems by their cluster_key;
get first item from listOfItems;
create a bucket, currentBucket, with key: firstItem.getKey()
add first item to bucket
for item in listOfItems, starting at the second item {
if item.getKey() equals currentBucket.getKey()
add item to currentBucket
else
create new bucket
add item to new bucket
add new bucket to listOfBuckets
set new bucket to currentBucket
}
Upvotes: 2
Views: 2183
Reputation: 32402
The fastest way to group items by the same key is to iterate through your list and add each item to the correct bucket (creating buckets when needed) as you go, which is similar to your first example.
But if you use a HashMap for your bucketList, you can add items to your bucket in constant time, which will give you an algorithm with O(n) instead of O(n^2) complexity .
Not tested but you get the idea
HashMap<String,ArrayList<Item>> bucketList = new HashMap();
for (Item i : listOfItems) {
if(!bucketList.containsKey(i.getKey()) {
bucketList.put(i.getKey(),new ArrayList());
}
buckList.get(i.getKey()).add(item);
}
Upvotes: 3