Kiran

Reputation: 5526

Creating list of unique objects

I get a set of elements by parsing an HTML document. The elements may contain duplicates. What is the best way to list only the unique elements?

I come from a C++ background and can see a way of doing it with a set and a custom equality operation, but I am not sure how to do that in Java. I would appreciate any code that helps me do it the right and efficient way.

ArrayList<Element> values = new ArrayList<Element>();

// Parse the html and get the document
Document doc = Jsoup.parse(htmlDataInStringFormat);

// Go through each selector and find all matching elements
for ( String selector: selectors ) {

    //Find elements matching this selector
    Elements elements = doc.select(selector);

    //If there are no matching elements, proceed to next selector
    if ( 0 == elements.size() ) continue;

    for (Element elem: elements ){
        values.add(elem);
    }
}

if ( values.size() > 0 ) {
    ????? // Need to remove duplicates here
}

Upvotes: 0

Views: 120

Answers (5)

Kiran

Reputation: 5526

The posted answers work if it is possible to modify the Element class, but I cannot do that. I do not need a sorted set as such, but a TreeSet with a custom Comparator lets me define equality without touching Element, so here is the solution I found:

TreeSet<Element> nt = new TreeSet<Element>(new Comparator<Element>(){
        public int compare(Element a, Element b){
            if ( a == b )
                return 0;
            if ( a.val > b.val )
                return 1;
            if ( a.val < b.val )
                return -1;
            // Equal values: return 0 so the set treats b as a duplicate of a
            return 0;
        }
    });

for (Element elem: elements ){
    nt.add(elem);
}
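
To illustrate the comparator approach, here is a minimal, self-contained sketch. It assumes a hypothetical stand-in class `Item` with an `int val` field (not the real Element class), and relies on the fact that a TreeSet treats two objects as duplicates whenever the comparator returns 0 for them.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

class ComparatorDedupSketch {

    // Hypothetical stand-in for a third-party class that cannot be modified.
    static final class Item {
        final int val;
        final String label;

        Item(int val, String label) {
            this.val = val;
            this.label = label;
        }
    }

    // Returns the items that are unique by `val`, in ascending `val` order.
    static List<Item> uniqueByVal(List<Item> items) {
        // The comparator returns 0 for equal `val`, so the set rejects the later item.
        TreeSet<Item> unique = new TreeSet<>(Comparator.comparingInt((Item i) -> i.val));
        unique.addAll(items);
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        List<Item> items = Arrays.asList(
                new Item(3, "c"), new Item(1, "a"), new Item(3, "c-again"));
        for (Item i : uniqueByVal(items)) {
            System.out.println(i.val + " " + i.label);
        }
        // prints:
        // 1 a
        // 3 c
    }
}
```

Note that the result comes back sorted by `val`, and the first item seen for each value is the one that is kept.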

Upvotes: 0

Caleryn

Reputation: 1084

java.util.HashSet will give you an unordered set. There are also other implementations of java.util.Set in the API that give you ordered sets or concurrent behaviour if needed.

Depending on what the class Element is, you may additionally need to implement the equals and hashCode methods on it, as per the comments by @musical_coder.

For example:

Set<Element> set = new HashSet<Element>(elements);

In order to provide an overridden equals method for Element, I would create a thin wrapper around the Element class, called MyElement or something more sensibly named, e.g.

    public static class MyElement extends Element {

        private final Element element;

        public MyElement(Element element){
            this.element = element;
        }

        // Override equals and hashCode
        // Delegate all other methods
    }

and pass that into the set. OK, so now I'm hoping the class isn't final. Effectively, wrap all your elements in this class. Ah, ElementWrapper: that is a better name.
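
A variation on the wrapper idea, sketched with plain composition instead of `extends`, so it would also work if the wrapped class were final. `Node` and `NodeWrapper` are hypothetical names standing in for Element and ElementWrapper, and comparing `tag` and `text` is only an assumed definition of equality.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

class WrapperDedupSketch {

    // Hypothetical stand-in for a library class without value-based equals/hashCode.
    static final class Node {
        final String tag;
        final String text;

        Node(String tag, String text) {
            this.tag = tag;
            this.text = text;
        }
    }

    // Holds a Node (composition, no inheritance) and defines equality for it.
    static final class NodeWrapper {
        final Node node;

        NodeWrapper(Node node) {
            this.node = node;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof NodeWrapper)) return false;
            Node other = ((NodeWrapper) o).node;
            return Objects.equals(node.tag, other.tag)
                    && Objects.equals(node.text, other.text);
        }

        @Override
        public int hashCode() {
            // Must use the same fields as equals.
            return Objects.hash(node.tag, node.text);
        }
    }

    // Wraps every node, de-duplicates via a set, then unwraps in first-seen order.
    static List<Node> unique(List<Node> nodes) {
        Set<NodeWrapper> seen = new LinkedHashSet<>();
        for (Node n : nodes) {
            seen.add(new NodeWrapper(n));
        }
        List<Node> result = new ArrayList<>();
        for (NodeWrapper w : seen) {
            result.add(w.node);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Node> nodes = Arrays.asList(
                new Node("a", "home"), new Node("p", "hi"), new Node("a", "home"));
        System.out.println(unique(nodes).size()); // prints 2
    }
}
```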

Upvotes: 3

deba

Reputation: 176

Additionally, override the equals and hashCode methods of Element:

class Element {
    ...

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof Element)) {
            return false;
        }
        Element other = (Element) o;
        // Compare the fields of this and other, like
        if (other.a != this.a) { return false; }
        ...
    }
    ...
    @Override
    public int hashCode() {
        // Compute a value that is the same for objects that are equal
    }
}
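
For a class you do control, a complete version of the pattern above might look like the following sketch. `Entry` and its fields `a` and `b` are hypothetical; the point is only that equals and hashCode use the same fields, so a HashSet can detect duplicates.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EqualsHashCodeSketch {

    // Hypothetical class we own; fields `a` and `b` together define equality.
    static final class Entry {
        private final int a;
        private final String b;

        Entry(int a, String b) {
            this.a = a;
            this.b = b;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o) return true;
            if (!(o instanceof Entry)) return false;
            Entry other = (Entry) o;
            return a == other.a && Objects.equals(b, other.b);
        }

        @Override
        public int hashCode() {
            // Equal objects produce equal hash codes because the same fields are used.
            return Objects.hash(a, b);
        }
    }

    // Counts distinct entries by loading them into a HashSet.
    static int countUnique(Entry... entries) {
        Set<Entry> unique = new HashSet<>(Arrays.asList(entries));
        return unique.size();
    }

    public static void main(String[] args) {
        int n = countUnique(new Entry(1, "x"), new Entry(1, "x"), new Entry(2, "x"));
        System.out.println(n); // prints 2
    }
}
```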

Upvotes: 0

M Sach

Reputation: 34424

Use HashSet if you just want to avoid duplicates. Use TreeSet if you want ordering along with avoiding duplicates.
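
A small sketch of the difference, using plain strings so that equality is already well defined. LinkedHashSet is not mentioned above; it is included here as a third option for the case where first-seen order should be kept.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

class SetChoiceSketch {

    // TreeSet: duplicates removed, result in natural (sorted) order.
    static List<String> sortedUnique(List<String> in) {
        return new ArrayList<>(new TreeSet<>(in));
    }

    // LinkedHashSet: duplicates removed, first-seen order preserved.
    static List<String> firstSeenUnique(List<String> in) {
        return new ArrayList<>(new LinkedHashSet<>(in));
    }

    // HashSet: duplicates removed, iteration order unspecified.
    static Set<String> anyOrderUnique(List<String> in) {
        return new HashSet<>(in);
    }

    public static void main(String[] args) {
        List<String> tags = Arrays.asList("div", "a", "p", "a", "div");
        System.out.println(sortedUnique(tags));          // prints [a, div, p]
        System.out.println(firstSeenUnique(tags));       // prints [div, a, p]
        System.out.println(anyOrderUnique(tags).size()); // prints 3
    }
}
```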

Upvotes: 1

Vikdor

Reputation: 24134

Add the elements to a java.util.HashSet and it will contain only unique elements.

Upvotes: 2
