DaveB
DaveB

Reputation: 3083

Java find most common value in HashSet

Ok bit of a basic question here but I wanted to know what the best way to go about this...

I have a HashSet that I am adding objects to, the .add() method will only add an object if it is not already present. But what I want to do is add ALL objects, then at the end get the following results..

-Number of unique (distinct) objects
-The average frequency of objects

Could someone point me in the right direction?

Thanks in advance

Upvotes: 0

Views: 4014

Answers (6)

Saul
Saul

Reputation: 18061

HashSet isn't really suited for keeping track of individual counts but HashMap is nearly perfect.

import java.util.HashMap;
import java.util.Map;

public class Count<K, V> extends HashMap<K, V> {

    // Counts unique objects
    public void add(K o) {
        int count = this.containsKey(o) ? ((Integer)this.get(o)).intValue() + 1 : 1;
        super.put(o, (V) new Integer(count));
    }

    // Demonstration
    public static void main(String[] args) {

        Count<Object, Integer> c = new Count<Object, Integer>();

        String one = "one";
        String two = "two";
        String six = "six";

        c.add(one);
        c.add(two);
        c.add(two);
        c.add(six);
        c.add(six);
        c.add(six);

        System.out.println("Number of distinct objects: " + c.size());

        System.out.println("Frequency of different objects: ");

        for (Map.Entry<Object, Integer> entry : c.entrySet()) {
            System.out.println(entry.getKey() + " - " + entry.getValue());
        }
    }
}

When run, this standalone snippet outputs

Number of distinct objects - 3
Frequency of different objects:
two - 2
one - 1
six - 3

Upvotes: 2

Use a HashMap. Use the entries as key, and map them to an Integer to keep the count.

EDIT: you may want to wrap the HashMap, to make sure that every time an object is added or removed, the counter is modified appropriately.
To get you started:

class MapWrapper<Key>
{
    private Map<Key,Integer> map = new HashMap<Key, Integer>();

    void add( Key key )
    {
        Integer n = map.get( key );
        if ( n == null )
        {
            map.put( key, 1 );
        }
        else
        {
            map.put( key, new Integer( n + 1 ));
        }
    }

    int occurrences( Key k )
    {
        Integer n = map.get( k );
        if ( n == null )
        {
            return 0;
        }
        else
        {
            return n;
        }
    }
}

Upvotes: 7

卢声远 Shengyuan Lu
卢声远 Shengyuan Lu

Reputation: 32014

Guava HashMultiset is a convenient choice. For instance:

HashMultiset<String> multiSet = HashMultiset.create();
multiSet.add("a");
multiSet.add("a");
multiSet.add("b");

Assert.assertEquals(2, multiSet.count("a"));//count "a" 
Assert.assertEquals(3, multiSet.size());//set size
Assert.assertEquals(2, multiSet.elementSet().size());//unique (distinct) size 

Upvotes: 1

se.solovyev
se.solovyev

Reputation: 1901

For such case i use my own implementation of Map interface:

/*
* Providers easily work with maps of lists
* */
public interface ManyValuedMap<K, V> extends Cloneable, Map<K, List<V>>, Serializable{

    public List<V> put(K key, V... values);
    public void clear(K key);
    public ManyValuedMap<K, V> clone();
    public void sort(Comparator<? super V> c);
    public List<V> getAllValues();
    public Collection<List<V>> values(Comparator<? super K> c);
    public void lock();
    public Map<K, List<V>> toMap();

}

And implementation:

/**
 * in ManyValuedMap can be stored lists of elements identificated by some key
 * */
public class ManyValuedHashMap<K, V> implements ManyValuedMap<K, V>, Serializable {

    //linked hash map guarantees right key order
    private Map<K, List<V>> map = new LinkedHashMap<K, List<V>>();
    private boolean isNeedToCheckUniqueness;
    private boolean lock = false;

    /**
     * @param needToCheckUniqueness if true then every time when element added uniqueness will be checked
     * */
    public ManyValuedHashMap(boolean needToCheckUniqueness) {
        isNeedToCheckUniqueness = needToCheckUniqueness;
    }

    public ManyValuedHashMap() {
        this(false);
    }

    public ManyValuedHashMap<K, V> put2 (K key, List<V> newValues ) {
        put(key, newValues);
        return this;
    }

    public List<V> put ( K key, List<V> newValues ) {
        if ( newValues == null ) {
            return put(key, (V)null);
        } else if ( newValues.isEmpty() ) {
            return put(key, (V)null);
        } else {
            //noinspection unchecked
            return put(key, (V[])newValues.toArray() );
        }
    }

    public List<V> put(K key, V... newValues) {
        checkLock();
        List<V> curValues = null;
        if (newValues != null && key != null) {
            curValues = this.map.get(key);

            if (curValues == null) {
                //new values  - add
                curValues = new ArrayList<V>();
                curValues.addAll(Arrays.asList(newValues));
                this.map.put(key, curValues);
            } else {
                // for this key values were added
                if (isNeedToCheckUniqueness) {
                    //if is need to check uniqueness - check
                    Integer index;
                    for (V newValue : newValues) {
                        index = null;
                        for (int i = 0; i < curValues.size(); i++) {
                            if (curValues.get(i).equals(newValue)) {
                                index = i;
                                break;
                            }
                        }
                        if (index == null) {
                            curValues.add(newValue);
                        } /*else {
                            //no need to add
                            //this value is already stored in map
                        }*/
                    }
                } else {
                    //otherwise add
                    curValues.addAll(Arrays.asList(newValues));
                }
            }
        } else if (key != null) {
            curValues = this.map.get(key);

            if (curValues == null) {
                curValues = new ArrayList<V>();
                this.map.put(key, curValues);
            }
        }

        return curValues;
    }

    public boolean containsValue(Object value) {
        boolean result = false;
        for (List<V> values : this.map.values()) {
            for (V v : values) {
                if (v.equals(value)) {
                    result = true;
                    break;
                }
            }
            if (result) {
                break;
            }
        }
        return result;
    }

    public List<V> get(Object key) {
        return this.map.get(key);
    }

    public boolean containsKey(Object key) {
        return this.map.containsKey(key);
    }

    public boolean isEmpty() {
        return this.map.isEmpty();
    }

    public int size() {
        int size = 0;
        for (List<V> vs : map.values()) {
            size += vs.size();
        }
        return size;
    }

    public List<V> remove(Object key) {
        checkLock();
        return this.map.remove(key);
    }

    @Override
    public void putAll(Map<? extends K, ? extends List<V>> m) {
        checkLock();
        this.map.putAll(m);
    }

    public void clear() {
        checkLock();
        this.map.clear();
    }

    @Override
    public void clear(K key) {
        checkLock();
        List<V> curValues = this.map.get(key);
        if ( curValues != null ) {
            curValues.clear();
        }
    }

    public Set<K> keySet() {
        return this.map.keySet();
    }

    public Collection<List<V>> values() {
        return this.map.values();
    }

    public Set<Map.Entry<K, List<V>>> entrySet() {
        return this.map.entrySet();
    }

    public Map<K, List<V>> toMap() {
        return new HashMap<K, List<V>>(map);
    }

    @Override
    public ManyValuedHashMap<K, V> clone() {
        ManyValuedHashMap<K, V> clone = null;
        try {
            //noinspection unchecked
            clone = (ManyValuedHashMap<K, V>)super.clone();
            //IMPORTANT: NOT DEEP CLONE
            //noinspection unchecked
            clone.map = new LinkedHashMap<K, List<V>>();
            clone.map.putAll(this.map);
        } catch (CloneNotSupportedException e) {
            Logger.getLogger(this.getClass()).error(e.getMessage(), e);
        }
        return clone;
    }

    @Override
    public void sort(Comparator<? super V> c) {
        for (List<V> list : map.values()) {
            Collections.sort(list, c);
        }
    }

    @Override
    public List<V> getAllValues() {
        final List<V> result = new ArrayList<V>();
        for (List<V> list : map.values()) {
            result.addAll(list);
        }
        return result;
    }

    public Collection<List<V>> values(Comparator<? super K> c) {
        List<Map.Entry<K, List<V>>> entries = new ArrayList<Map.Entry<K, List<V>>>(entrySet());

        Collections.sort(entries, new EntryComparator(c));

        Collection<List<V>> result = new ArrayList<List<V>>();

        for (Map.Entry<K, List<V>> entry : entries) {
            result.add(entry.getValue());
        }

        return result;
    }

    private class EntryComparator implements Comparator<Map.Entry<K, List<V>>>{

        private Comparator<? super K> keyComparator = null;

        private EntryComparator(Comparator<? super K> keyComparator) {
            this.keyComparator = keyComparator;
        }

        @Override
        public int compare(Map.Entry<K, List<V>> o1, Map.Entry<K, List<V>> o2) {
            return keyComparator.compare(o1.getKey(), o2.getKey());
        }
    }

    @Override
    public void lock() {
        this.lock = true;
    }

    private void checkLock () {
        if ( this.lock ) {
            throw new UnsupportedOperationException();
        }
    }
}

The behavior is next:

  1. List item
  2. If you try to add value with key not presented in map then new List element will be created and stored in map with specified key (value will be added to the list of course)
  3. If key is already in map then: if isNeedToCheckUniqueness == false new value just will be added to the end of list otherwise ( isNeedToCheckUniqueness == true) value will be added to the list only in case if list doesn't already contain it.

You can easily count number of elements by key (frequencies) by getting size of list. You can get either first or last element of list in order to get first or last added value with specified key.

Upvotes: 0

b.buchhold
b.buchhold

Reputation: 3906

You can either just use a (Hash)Map instead an keep the count for each distinct object as value in the map, or you can keep using a set but somewhere count all calls to add.

The total number of objects inserted is either what you counted or the sum over all values in the map (iterate over the EntrySet). The number of distinct objects is always size() of your map / set and the avg. frequency obviously is the quotient.

Upvotes: 0

Jon Skeet
Jon Skeet

Reputation: 1502016

The number of distinct objects will just be the size of the hash set afterwards.

Depending on what you mean by "average frequency" you may be okay with source.size() / set.size()... (possibly casting one of the operands to double to force floating point arithmetic if you want). If you can elaborate on what you need with some examples, we may be able to help more.

Upvotes: 1

Related Questions