user15923931

Reputation:

Java: the fastest way to filter a List with 1M objects

I have a List of ProductDTO and a Product.

This list can contain 100 objects, or as many as 1M objects.

I am reading this list from a CSV file.

This is how I am filtering it now:

productDtos.parallelStream()
    .filter(i -> i.getName().equals(product.getName()))
    .filter(i -> Objects.equals(i.getCode(), product.getCode()))
    .map(Product::new)
    // getting object here

So, what is the best way to process it? I thought I should use multithreading: one thread would start from the beginning of the list, and another would start from the end.

Any ideas on how to improve the speed of filtering a list this large? Thank you.

Upvotes: 0

Views: 726

Answers (2)

Volodya Lombrozo

Reputation: 3466

First of all, I see that you have already loaded all the productDtos into memory. That can lead to very high memory consumption. I suggest reading the CSV file row by row and filtering the rows one at a time. In that case, your code might look like this:

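import java.io.BufferedReader;
import java.io.File;
import java.io.FileReader;
import java.io.IOException;
import java.util.List;
import java.util.stream.Collectors;

public class Csv {
    public static void main(String[] args) {
        File file = new File("your.csv");
        // try-with-resources closes the reader even if processing fails
        try (final BufferedReader br = new BufferedReader(new FileReader(file))) {
            // lines() returns a lazily populated stream of the file's rows
            final List<String> filtered = br.lines().parallel()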
                    .map(Csv::toYourDTO)
                    .filter(Csv::yourFilter)
                    .collect(Collectors.toList());
            System.out.println(filtered);
        } catch (IOException e) {
            //todo something with the error
        }
    }

    private static boolean yourFilter(String s) {
        return true; //todo
    }

    private static String toYourDTO(String s) {
        return "";//todo
    }
}
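As a rough illustration of how the two stubs above might be filled in, here is a minimal sketch. It assumes a simple "name,code" row layout with no quoted fields, treats an empty code column as null, and uses a hypothetical minimal ProductDTO, since the real class and the real CSV layout are not shown in the question. A StringReader stands in for the file so the example runs on its own.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

class CsvFilterSketch {

    // Hypothetical minimal DTO; the real ProductDTO is not shown in the question.
    static final class ProductDTO {
        private final String name;
        private final String code;
        ProductDTO(String name, String code) { this.name = name; this.code = code; }
        String getName() { return name; }
        String getCode() { return code; }
    }

    // Assumed row layout: "name,code", no quoting; an empty code becomes null.
    static ProductDTO parse(String line) {
        String[] parts = line.split(",", -1);
        String code = parts.length > 1 && !parts[1].isEmpty() ? parts[1] : null;
        return new ProductDTO(parts[0], code);
    }

    // Reads rows one by one and keeps only those matching both name and code.
    static List<ProductDTO> filter(BufferedReader reader, String name, String code) {
        return reader.lines()
                .map(CsvFilterSketch::parse)
                .filter(d -> d.getName().equals(name) && Objects.equals(d.getCode(), code))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) throws IOException {
        String csv = "apple,A1\npear,\napple,A2\napple,A1";
        try (BufferedReader reader = new BufferedReader(new StringReader(csv))) {
            System.out.println(filter(reader, "apple", "A1").size()); // prints 2
        }
    }
}
```

Only the matching rows are kept in memory here; whether adding parallel() helps depends on how expensive the parsing is, so it is worth measuring both variants.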

Upvotes: 1

user3088799

Reputation: 165

I usually build a map and call get on it, which avoids filtering in a loop.

For instance, if you have N codes for one product, you can do:

Map<String, Map<String, List<ProductDTO>>> productDtoByNameAndCode = productDtos.stream().collect(Collectors.groupingBy(ProductDTO::getName, Collectors.groupingBy(ProductDTO::getCode)));

Then, for each product, you just have to do:

List<ProductDTO> correspondingProductDTOs = productDtoByNameAndCode.get(product.getName()).get(product.getCode());

That way, you don't have to filter the whole list again for each product.
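A variation on the same idea, as a minimal sketch: the index can use a single composite key instead of nested maps. Two details are worth noting. The chained get(...).get(...) throws a NullPointerException when the name is not in the map, and Collectors.groupingBy rejects a null key, which matters if getCode() can return null (the question compares codes with Objects.equals). The ProductDTO class, the helper names indexByNameAndCode and find, and the sample data below are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ProductIndexSketch {

    // Hypothetical minimal DTO; the real ProductDTO is not shown in the question.
    static final class ProductDTO {
        private final String name;
        private final String code;
        ProductDTO(String name, String code) { this.name = name; this.code = code; }
        String getName() { return name; }
        String getCode() { return code; }
    }

    // Build the index once with a single pass over the list.
    // Arrays.asList allows a null code, and the key list itself is never null.
    static Map<List<String>, List<ProductDTO>> indexByNameAndCode(List<ProductDTO> dtos) {
        return dtos.stream()
                .collect(Collectors.groupingBy(d -> Arrays.asList(d.getName(), d.getCode())));
    }

    // One hash lookup per product; returns an empty list when nothing matches.
    static List<ProductDTO> find(Map<List<String>, List<ProductDTO>> index,
                                 String name, String code) {
        return index.getOrDefault(Arrays.asList(name, code), Collections.emptyList());
    }

    public static void main(String[] args) {
        List<ProductDTO> dtos = Arrays.asList(
                new ProductDTO("apple", "A1"),
                new ProductDTO("apple", "A2"),
                new ProductDTO("pear", null),
                new ProductDTO("apple", "A1"));
        Map<List<String>, List<ProductDTO>> index = indexByNameAndCode(dtos);
        System.out.println(find(index, "apple", "A1").size()); // prints 2
        System.out.println(find(index, "pear", null).size());  // prints 1
        System.out.println(find(index, "kiwi", "K1").size());  // prints 0
    }
}
```

Building the index costs one pass over the list, so it pays off when many products are looked up against the same list; for a single lookup, a plain filter does the same amount of work.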

Upvotes: 0
