Reputation: 3083
Ok bit of a basic question here but I wanted to know what the best way to go about this...
I have a HashSet that I am adding objects to, the .add() method will only add an object if it is not already present. But what I want to do is add ALL objects, then at the end get the following results..
-Number of unique (distinct) objects
-The average frequency of objects
Could someone point me in the right direction?
Thanks in advance
Upvotes: 0
Views: 4014
Reputation: 18061
HashSet isn't really suited for keeping track of individual counts but HashMap is nearly perfect.
import java.util.HashMap;
import java.util.Map;
public class Count<K, V> extends HashMap<K, V> {
// Counts unique objects
public void add(K o) {
int count = this.containsKey(o) ? ((Integer)this.get(o)).intValue() + 1 : 1;
super.put(o, (V) new Integer(count));
}
// Demonstration
public static void main(String[] args) {
Count<Object, Integer> c = new Count<Object, Integer>();
String one = "one";
String two = "two";
String six = "six";
c.add(one);
c.add(two);
c.add(two);
c.add(six);
c.add(six);
c.add(six);
System.out.println("Number of distinct objects: " + c.size());
System.out.println("Frequency of different objects: ");
for (Map.Entry<Object, Integer> entry : c.entrySet()) {
System.out.println(entry.getKey() + " - " + entry.getValue());
}
}
}
When run, this standalone snippet outputs
Number of distinct objects - 3
Frequency of different objects:
two - 2
one - 1
six - 3
Upvotes: 2
Reputation: 8255
Use a HashMap. Use the entries as key, and map them to an Integer to keep the count.
EDIT: you may want to wrap the HashMap, to make sure that every time an object is added or removed, the counter is modified appropriately.
To get you started:
class MapWrapper<Key>
{
private Map<Key,Integer> map = new HashMap<Key, Integer>();
void add( Key key )
{
Integer n = map.get( key );
if ( n == null )
{
map.put( key, 1 );
}
else
{
map.put( key, new Integer( n + 1 ));
}
}
int occurrences( Key k )
{
Integer n = map.get( k );
if ( n == null )
{
return 0;
}
else
{
return n;
}
}
}
Upvotes: 7
Reputation: 32014
Guava HashMultiset is a convenient choice. For instance:
HashMultiset<String> multiSet = HashMultiset.create();
multiSet.add("a");
multiSet.add("a");
multiSet.add("b");
Assert.assertEquals(2, multiSet.count("a"));//count "a"
Assert.assertEquals(3, multiSet.size());//set size
Assert.assertEquals(2, multiSet.elementSet().size());//unique (distinct) size
Upvotes: 1
Reputation: 1901
For such case i use my own implementation of Map interface:
/*
* Providers easily work with maps of lists
* */
public interface ManyValuedMap<K, V> extends Cloneable, Map<K, List<V>>, Serializable{
public List<V> put(K key, V... values);
public void clear(K key);
public ManyValuedMap<K, V> clone();
public void sort(Comparator<? super V> c);
public List<V> getAllValues();
public Collection<List<V>> values(Comparator<? super K> c);
public void lock();
public Map<K, List<V>> toMap();
}
And implementation:
/**
* in ManyValuedMap can be stored lists of elements identificated by some key
* */
public class ManyValuedHashMap<K, V> implements ManyValuedMap<K, V>, Serializable {
//linked hash map guarantees right key order
private Map<K, List<V>> map = new LinkedHashMap<K, List<V>>();
private boolean isNeedToCheckUniqueness;
private boolean lock = false;
/**
* @param needToCheckUniqueness if true then every time when element added uniqueness will be checked
* */
public ManyValuedHashMap(boolean needToCheckUniqueness) {
isNeedToCheckUniqueness = needToCheckUniqueness;
}
public ManyValuedHashMap() {
this(false);
}
public ManyValuedHashMap<K, V> put2 (K key, List<V> newValues ) {
put(key, newValues);
return this;
}
public List<V> put ( K key, List<V> newValues ) {
if ( newValues == null ) {
return put(key, (V)null);
} else if ( newValues.isEmpty() ) {
return put(key, (V)null);
} else {
//noinspection unchecked
return put(key, (V[])newValues.toArray() );
}
}
public List<V> put(K key, V... newValues) {
checkLock();
List<V> curValues = null;
if (newValues != null && key != null) {
curValues = this.map.get(key);
if (curValues == null) {
//new values - add
curValues = new ArrayList<V>();
curValues.addAll(Arrays.asList(newValues));
this.map.put(key, curValues);
} else {
// for this key values were added
if (isNeedToCheckUniqueness) {
//if is need to check uniqueness - check
Integer index;
for (V newValue : newValues) {
index = null;
for (int i = 0; i < curValues.size(); i++) {
if (curValues.get(i).equals(newValue)) {
index = i;
break;
}
}
if (index == null) {
curValues.add(newValue);
} /*else {
//no need to add
//this value is already stored in map
}*/
}
} else {
//otherwise add
curValues.addAll(Arrays.asList(newValues));
}
}
} else if (key != null) {
curValues = this.map.get(key);
if (curValues == null) {
curValues = new ArrayList<V>();
this.map.put(key, curValues);
}
}
return curValues;
}
public boolean containsValue(Object value) {
boolean result = false;
for (List<V> values : this.map.values()) {
for (V v : values) {
if (v.equals(value)) {
result = true;
break;
}
}
if (result) {
break;
}
}
return result;
}
public List<V> get(Object key) {
return this.map.get(key);
}
public boolean containsKey(Object key) {
return this.map.containsKey(key);
}
public boolean isEmpty() {
return this.map.isEmpty();
}
public int size() {
int size = 0;
for (List<V> vs : map.values()) {
size += vs.size();
}
return size;
}
public List<V> remove(Object key) {
checkLock();
return this.map.remove(key);
}
@Override
public void putAll(Map<? extends K, ? extends List<V>> m) {
checkLock();
this.map.putAll(m);
}
public void clear() {
checkLock();
this.map.clear();
}
@Override
public void clear(K key) {
checkLock();
List<V> curValues = this.map.get(key);
if ( curValues != null ) {
curValues.clear();
}
}
public Set<K> keySet() {
return this.map.keySet();
}
public Collection<List<V>> values() {
return this.map.values();
}
public Set<Map.Entry<K, List<V>>> entrySet() {
return this.map.entrySet();
}
public Map<K, List<V>> toMap() {
return new HashMap<K, List<V>>(map);
}
@Override
public ManyValuedHashMap<K, V> clone() {
ManyValuedHashMap<K, V> clone = null;
try {
//noinspection unchecked
clone = (ManyValuedHashMap<K, V>)super.clone();
//IMPORTANT: NOT DEEP CLONE
//noinspection unchecked
clone.map = new LinkedHashMap<K, List<V>>();
clone.map.putAll(this.map);
} catch (CloneNotSupportedException e) {
Logger.getLogger(this.getClass()).error(e.getMessage(), e);
}
return clone;
}
@Override
public void sort(Comparator<? super V> c) {
for (List<V> list : map.values()) {
Collections.sort(list, c);
}
}
@Override
public List<V> getAllValues() {
final List<V> result = new ArrayList<V>();
for (List<V> list : map.values()) {
result.addAll(list);
}
return result;
}
public Collection<List<V>> values(Comparator<? super K> c) {
List<Map.Entry<K, List<V>>> entries = new ArrayList<Map.Entry<K, List<V>>>(entrySet());
Collections.sort(entries, new EntryComparator(c));
Collection<List<V>> result = new ArrayList<List<V>>();
for (Map.Entry<K, List<V>> entry : entries) {
result.add(entry.getValue());
}
return result;
}
private class EntryComparator implements Comparator<Map.Entry<K, List<V>>>{
private Comparator<? super K> keyComparator = null;
private EntryComparator(Comparator<? super K> keyComparator) {
this.keyComparator = keyComparator;
}
@Override
public int compare(Map.Entry<K, List<V>> o1, Map.Entry<K, List<V>> o2) {
return keyComparator.compare(o1.getKey(), o2.getKey());
}
}
@Override
public void lock() {
this.lock = true;
}
private void checkLock () {
if ( this.lock ) {
throw new UnsupportedOperationException();
}
}
}
The behavior is next:
You can easily count number of elements by key (frequencies) by getting size of list. You can get either first or last element of list in order to get first or last added value with specified key.
Upvotes: 0
Reputation: 3906
You can either just use a (Hash)Map instead an keep the count for each distinct object as value in the map, or you can keep using a set but somewhere count all calls to add.
The total number of objects inserted is either what you counted or the sum over all values in the map (iterate over the EntrySet). The number of distinct objects is always size() of your map / set and the avg. frequency obviously is the quotient.
Upvotes: 0
Reputation: 1502016
The number of distinct objects will just be the size of the hash set afterwards.
Depending on what you mean by "average frequency" you may be okay with source.size() / set.size()
... (possibly casting one of the operands to double
to force floating point arithmetic if you want). If you can elaborate on what you need with some examples, we may be able to help more.
Upvotes: 1