Faster method to extract distinct strings from an ArrayList

I have an ArrayList of Dico objects and I am trying to extract the distinct strings (terms) from it.

This is the Dico class.

public class Dico implements Comparable<Dico> {
    private final String m_term;
    private double m_weight;
    private final int m_Id_doc;

    public Dico(int Id_Doc, String Term, double tf_ief) {
        this.m_Id_doc = Id_Doc;
        this.m_term = Term;
        this.m_weight = tf_ief;
    }

    public String getTerm() {
        return this.m_term;
    }

    public double getWeight() {
        return this.m_weight;
    }

    public void setWeight(double weight) {
        this.m_weight = weight;
    }

    public int getDocId() {
        return this.m_Id_doc;
    }

    // compareTo was not shown in the post; ordering by term is assumed here.
    @Override
    public int compareTo(Dico other) {
        return this.m_term.compareTo(other.m_term);
    }
}

I use this function to extract 1000 distinct values from the middle of this list: I start from the middle and take only distinct values in both directions, left and right.

public static List<String> get_sinificativ_term(List<Dico> dico) {
    List<String> term = new ArrayList<>();
    int pos_median = dico.size() / 2;
    int count = 0;
    int i = 0;
    int j = 0;
    String temp_d = dico.get(pos_median).getTerm();
    String temp_g = temp_d;
    term.add(temp_d);

    while (count < 999) { // number of terms added after the median term
        if (!temp_d.equals(dico.get(pos_median + i).getTerm())) {
            temp_d = dico.get(pos_median + i).getTerm(); // save the current term
            term.add(temp_d); // add the term to the list
            i++; // go to the next value on the right
            count++;
        } else {
            i++; // go to the next value on the right
        }

        if (!temp_g.equals(dico.get(pos_median + j).getTerm())) {
            temp_g = dico.get(pos_median + j).getTerm();
            term.add(temp_g); // add the term to the list
            j--; // go to the next value on the left
            count++;
        } else {
            j--; // go to the next value on the left
        }
    }
    return term;
}

I would like to make this function faster. If possible, can I do this with Java SE 8 Streams?

Upvotes: 2

Answers (2)

Misha

Reputation: 28163

Streams will not make it faster but can make it much simpler and clearer.

Here's the simplest version. It will take all list indexes, sort them by distance to the middle of the list, get the corresponding term, filter out duplicates and limit to 1000 elements. It will certainly be slower than your iterative code, but much easier to follow because the code neatly mirrors its English description:

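import java.util.List;
import java.util.stream.IntStream;
import static java.util.Comparator.comparing;
import static java.util.stream.Collectors.toList;

public static List<String> get_sinificativ_term(List<Dico> dicolist) {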
    int size = dicolist.size();

    return IntStream.range(0, size)
            .boxed()
            .sorted(comparing(i -> Math.abs(size / 2 - i)))
            .map(dicolist::get)
            .map(Dico::getTerm)
            .distinct()
            .limit(1000)
            .collect(toList());
}
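
To see what order the sorted pipeline visits the list in, here is a small self-contained sketch. It runs on plain strings rather than Dico objects, and the class name, method names and sample data are made up for illustration. It uses `Comparator.comparingInt` instead of `comparing`, which should give the same ordering while avoiding boxing the distance.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CenterOutSortDemo {

    // Indexes 0..size-1 ordered by their distance from size / 2.
    // sorted() is stable on an ordered stream, so on ties the lower index comes first.
    static List<Integer> indexesByDistance(int size) {
        return IntStream.range(0, size)
                .boxed()
                .sorted(Comparator.comparingInt((Integer i) -> Math.abs(size / 2 - i)))
                .collect(Collectors.toList());
    }

    // Same idea as the pipeline above, applied to plain strings (illustrative only).
    static List<String> distinctFromCenter(List<String> terms, int limit) {
        return indexesByDistance(terms.size()).stream()
                .map(terms::get)
                .distinct()
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> terms = Arrays.asList("a", "b", "b", "c", "d");
        System.out.println(indexesByDistance(terms.size())); // [2, 1, 3, 0, 4]
        System.out.println(distinctFromCenter(terms, 3));    // [b, c, a]
    }
}
```

Note that `sorted` has to see every index before anything flows downstream, so `limit` cannot cut the work short in this version.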

If your list is really huge and you want to avoid sorting it, you can trade away some simplicity for performance. This version does a bit of math to go right-left-right-left from center:

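import java.util.List;
import java.util.stream.IntStream;
import static java.util.stream.Collectors.toList;

public static List<String> get_sinificativ_term(List<Dico> dicolist) {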
    int size = dicolist.size();

    return IntStream.range(0, size)
            .map(i -> i % 2 == 0 ? (size + i) / 2 : (size - i - 1) / 2)
            .mapToObj(i -> dicolist.get(i).getTerm())
            .distinct()
            .limit(1000)
            .collect(toList());
}
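
If the index arithmetic is hard to trust at a glance, printing the visiting order for a few small sizes helps. This is only a sketch; the class and method names are invented for the demo, and the mapping itself is copied from the method above.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

class CenterOutIndexDemo {

    // Visiting order produced by the mapping: even steps walk right from size / 2,
    // odd steps walk left from size / 2 - 1.
    static List<Integer> visitOrder(int size) {
        return IntStream.range(0, size)
                .map(i -> i % 2 == 0 ? (size + i) / 2 : (size - i - 1) / 2)
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(visitOrder(4)); // [2, 1, 3, 0]
        System.out.println(visitOrder(5)); // [2, 1, 3, 0, 4]
        System.out.println(visitOrder(6)); // [3, 2, 4, 1, 5, 0]
    }
}
```

Every index shows up exactly once. Since nothing in this pipeline needs the whole input up front, a sequential run can stop as soon as `limit(1000)` is satisfied.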

Upvotes: 1

user2336315

Reputation: 16067

Can't you do something like this?

public static List <String> get_sinificativ_term(List<Dico> dico) {
    List<String> list = dico.stream()
                            .map(Dico::getTerm)
                            .distinct()
                            .limit(1000)
                            .collect(Collectors.toList());
    if(list.size() != 1000) {
         throw new IllegalStateException("Need at least 1000 distinct values");
    }
    return list;
}

You need to check the size because the list may contain fewer than 1000 distinct values. If efficiency is a concern, you can try running the pipeline in parallel and measure whether it is actually faster.
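
A rough way to try that comparison is sketched below. It works on plain strings, and the class name, method name and generated data are all hypothetical. A single `System.nanoTime()` measurement like this is only a crude indication; a benchmark harness such as JMH would be more trustworthy. Also keep in mind that on an ordered source, `distinct()` and `limit()` must respect encounter order even in parallel, which can make the parallel run slower rather than faster.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;

class ParallelDistinctDemo {

    // First `limit` distinct terms in encounter order, sequentially or in parallel.
    static List<String> firstDistinct(List<String> terms, int limit, boolean parallel) {
        return (parallel ? terms.parallelStream() : terms.stream())
                .distinct()
                .limit(limit)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical data: 1,000,000 entries, 5,000 distinct terms, grouped like a sorted list.
        List<String> terms = new ArrayList<>();
        for (int i = 0; i < 1_000_000; i++) {
            terms.add("term" + (i / 200));
        }

        for (boolean parallel : new boolean[] {false, true}) {
            long start = System.nanoTime();
            List<String> result = firstDistinct(terms, 1000, parallel);
            long micros = (System.nanoTime() - start) / 1_000;
            System.out.println((parallel ? "parallel" : "sequential")
                    + ": " + result.size() + " terms in " + micros + " us");
        }
    }
}
```

Both runs should return the same list, because the source is ordered; only the timing differs.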

Upvotes: 0
