James

Reputation: 3184

How can I take a Java Set of size X and break into X/Y Sets?

I have a Java Set (specifically HashSet). Suppose it has a size of 10k. How can I break it into 5 Sets each of size 2k?

Upvotes: 7

Views: 28675

Answers (6)

Ravindranadh Inavolu

Reputation: 33

A late answer, but you can try one of the following:

  1. By converting it to a List and using Guava's Lists.partition(list, PARTITION_VALUE).

    Dependency:

     <dependency>
          <groupId>com.google.guava</groupId>
          <artifactId>guava</artifactId>
          <version>31.1-jre</version>
      </dependency>
    

    Usage:

    import com.google.common.collect.Lists;
    import java.util.*;
    .
    .
    Set<String> hashSet = new HashSet<>();
    //Fill data to hashSet
    List<String> hashSetAsList = new ArrayList<>(hashSet);
    List<List<String>> partitionedLists = Lists.partition(hashSetAsList, PARTITION_VALUE);
    
  2. By using List.subList(INDEX_START,INDEX_END)

    import java.util.*;
    .
    .
    List<String> hashSetAsList = new ArrayList<>(hashSet);
    List<String> subList = hashSetAsList.subList(INDEX_START,INDEX_END);
    //INDEX_END will be excluded
    
  3. By converting it to a TreeSet (a SortedSet) and using SortedSet.subSet(FROM_ELEMENT, TO_ELEMENT). The bounds are element values, not indexes, and the resulting set is sorted.

    Set<String> hashSet = new HashSet<>();
    SortedSet<String> sortedSet = new TreeSet<>(hashSet);
    Set<String> subSet_1 = sortedSet.subSet(FROM_ELEMENT, TO_ELEMENT);
    //FROM_ELEMENT is included, TO_ELEMENT is excluded
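
Option 2 only shows a single subList call. As a rough sketch of how it could be looped to cover the whole set without Guava, something like the following might work. The class name SubListSplit and the method splitBySize are made up for this illustration, and the chunks follow the HashSet's iteration order, which is unspecified.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SubListSplit {
    // Hypothetical helper (not part of the JDK or Guava): copies the set into
    // a list, then wraps each window of at most chunkSize elements in a HashSet.
    static <T> List<Set<T>> splitBySize(Set<T> original, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<T> asList = new ArrayList<>(original);
        List<Set<T>> result = new ArrayList<>();
        for (int start = 0; start < asList.size(); start += chunkSize) {
            // subList's end index is exclusive; clamp it for the last chunk.
            int end = Math.min(start + chunkSize, asList.size());
            result.add(new HashSet<>(asList.subList(start, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> numbers = new HashSet<>();
        for (int i = 0; i < 10_000; i++) {
            numbers.add(i);
        }
        List<Set<Integer>> parts = splitBySize(numbers, 2_000);
        System.out.println(parts.size() + " sets, first has " + parts.get(0).size() + " elements");
    }
}
```

Copying each subList into a new HashSet matters here: subList returns a view of the backing list, so the copy keeps the resulting sets independent of it.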
    

Upvotes: 0

wchargin

Reputation: 16047

This method will split the elements of the set so that the first set contains the first 2000, the second contains the next 2000, etc.

public static <T> List<Set<T>> split(Set<T> original, int count) {
    // Create a list of sets to return.
    ArrayList<Set<T>> result = new ArrayList<Set<T>>(count);

    // Create an iterator for the original set.
    Iterator<T> it = original.iterator();

    // Calculate the number of elements for each set, rounding up so that
    // no elements are left over when the size is not divisible by count.
    int each = (original.size() + count - 1) / count;

    // Create each new set.
    for (int i = 0; i < count; i++) {
        HashSet<T> s = new HashSet<T>(original.size() / count + 1);
        result.add(s);
        for (int j = 0; j < each && it.hasNext(); j++) {
            s.add(it.next());
        }
    }
    return result;
}

// As an example, in your code...

Set<Integer> originalSet = new HashSet<Integer>();
// [fill the set...]
List<Set<Integer>> splitSets = split(originalSet, 5);
Set<Integer> first = splitSets.get(0); // etc.

Upvotes: 8

user2327870

Reputation: 271

Guava can partition any Iterable. Its Iterables utility class has a static partition method for this. Note that the return value is an Iterable of Lists, not of Sets. The code below shows how to use it.

Set<Integer> myIntSet = new HashSet<Integer>();
// fill the set
Iterable<List<Integer>> lists = Iterables.partition(myIntSet, SIZE_EACH_PARTITION);

Upvotes: 27

stefaan dutry

Reputation: 1106

I've written something that does the splitting of the set.

It uses intermediate arrays and Lists.

It uses the Arrays.asList and the Arrays.copyOfRange methods.

import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;


public class SetSplitTest {
    //define and initialize set
    private static Set<Integer> largeSet;
    static {
        largeSet  = new HashSet<Integer>();

        for (int i = 0; i < 10000; i++) {
            largeSet.add(i);
        }
    }


    public static void main(String[] args) {
        System.out.println(largeSet);
        int amountOfSets = 5; //amount of subsets wanted
        Set<Integer>[] subsets = new Set[amountOfSets]; //array holding the subsets

        Integer[] largesetarray =  largeSet.toArray(new Integer[largeSet.size()]);

        for (int i = 1; i <= amountOfSets; i++) {
            int fromIndex = (i-1) * largeSet.size() / amountOfSets;
            int toIndex = i * largeSet.size() / amountOfSets; // exclusive upper bound in copyOfRange
            Set<Integer> subHashSet = new HashSet<Integer>();
            subHashSet.addAll(Arrays.asList(Arrays.copyOfRange(largesetarray, fromIndex, toIndex)));

            subsets[i - 1] = subHashSet;
        }

        for (Set<Integer> subset : subsets) {
            System.out.println(subset);
        }
    }
}

This is definitely not the most elegant solution, but it's the best I could think of at the moment if you don't want to loop over the set yourself.

Upvotes: 0

mszalbach

Reputation: 11470

If you do not want to write it yourself, have a look at Guava. The Lists class has a method partition(List, int) that splits a list into multiple lists of the specified size. See Guava Lists.

Upvotes: 0

Tom

Reputation: 1454

Iterate over the entire set, and add the first 2000 elements to a first new set, the 2nd 2000 elements to the 2nd new set, etc.
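
A minimal sketch of that loop, assuming a fixed chunk size is what is wanted. The names IterativeSplit and splitByIteration are invented for this example, and "first 2000" here means the first 2000 in the set's iteration order, which a HashSet does not guarantee.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class IterativeSplit {
    // Hypothetical helper: walks the set once and starts a new HashSet
    // whenever the current one has reached chunkSize elements.
    static <T> List<Set<T>> splitByIteration(Set<T> original, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<Set<T>> result = new ArrayList<>();
        Set<T> current = null;
        for (T element : original) {
            if (current == null || current.size() == chunkSize) {
                current = new HashSet<>();
                result.add(current);
            }
            current.add(element);
        }
        return result;
    }

    public static void main(String[] args) {
        Set<Integer> numbers = new HashSet<>();
        for (int i = 0; i < 10_000; i++) {
            numbers.add(i);
        }
        List<Set<Integer>> parts = splitByIteration(numbers, 2_000);
        System.out.println(parts.size() + " sets created");
    }
}
```

This needs no intermediate list or array. If the size is not a multiple of chunkSize, the last set simply holds the remainder.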

Upvotes: -1
