Reputation: 759
I have a stream of data as shown below and I wish to collect the data based on a condition.
Stream of data:
452857;0;L100;csO;20220411;20220411;EUR;000101435;+; ;F;1;EUR;000100000;+;
452857;0;L120;csO;20220411;20220411;EUR;000101435;+; ;F;1;EUR;000100000;+;
452857;0;L121;csO;20220411;20220411;EUR;000101435;+; ;F;1;EUR;000100000;+;
452857;0;L126;csO;20220411;20220411;EUR;000101435;+; ;F;1;EUR;000100000;+;
452857;0;L100;csO;20220411;20220411;EUR;000101435;+; ;F;1;EUR;000100000;+;
452857;0;L122;csO;20220411;20220411;EUR;000101435;+; ;F;1;EUR;000100000;+;
I want to group the lines by the value at index 2 (L100, L120, L121, ...) and store each group in its own list (one for L120, one for L121, one for L122, etc.) using Java 8 streams. Any suggestions? Note: the splittedLine array below holds my stream of data.
I have tried the following, but I think there is a shorter way:
List<String> L100_ENTITY_NAMES = Arrays.asList("L100", "L120", "L121", "L122", "L126");
List<List<String>> list = L100_ENTITY_NAMES.stream()
    .map(entity -> Arrays.stream(splittedLine)
        .filter(line -> {
            String[] values = line.split(String.valueOf(DELIMITER));
            if (values.length > 0) {
                return entity.equals(values[2]);
            } else {
                return false;
            }
        })
        .collect(Collectors.toList()))
    .collect(Collectors.toList());
Upvotes: 1
Views: 1236
Reputation: 23057
I would convert the semicolon-delimited lines to objects as soon as possible, instead of keeping them around as a serialized bunch of data.
First, I would create a record that models the data:
public record LBasedEntity(long id, int zero, String lcode, …) { }
Then, create a method to parse a line. An external parsing library would work just as well, since this looks like CSV with a semicolon as the delimiter.
private static LBasedEntity parse(String line) {
    String[] parts = line.split(";");
    if (parts.length < 3) {
        return null;
    }
    long id = Long.parseLong(parts[0]);
    int zero = Integer.parseInt(parts[1]);
    String lcode = parts[2];
    …
    return new LBasedEntity(id, zero, lcode, …);
}
Then the mapping is trivial:
Map<String, List<LBasedEntity>> result = Arrays.stream(lines)
    .map(line -> parse(line))
    .filter(Objects::nonNull)
    .filter(lBasedEntity -> L100_ENTITY_NAMES.contains(lBasedEntity.lcode()))
    .collect(Collectors.groupingBy(LBasedEntity::lcode));
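For anyone who wants to run this, here is a minimal self-contained sketch of the same pipeline. It assumes a simplified entity that models only the first three fields, uses a plain class instead of a record so that it also compiles on Java 8, and keeps the allowed codes in a Set. The names GroupParsedLines, Entity, ENTITY_NAMES and group are made up for illustration.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

class GroupParsedLines {

    // Hypothetical, simplified entity: only the first three fields are modelled.
    static final class Entity {
        final long id;
        final int zero;
        final String lcode;

        Entity(long id, int zero, String lcode) {
            this.id = id;
            this.zero = zero;
            this.lcode = lcode;
        }
    }

    // Assumed list of allowed codes, held in a Set for fast contains() checks.
    static final Set<String> ENTITY_NAMES =
            new HashSet<>(Arrays.asList("L100", "L120", "L121", "L122", "L126"));

    // Returns null for lines with fewer than three fields.
    static Entity parse(String line) {
        String[] parts = line.split(";");
        if (parts.length < 3) {
            return null;
        }
        return new Entity(Long.parseLong(parts[0]), Integer.parseInt(parts[1]), parts[2]);
    }

    // Parses every line, drops unparseable or unknown ones, groups the rest by code.
    static Map<String, List<Entity>> group(String[] lines) {
        return Arrays.stream(lines)
                .map(GroupParsedLines::parse)
                .filter(Objects::nonNull)
                .filter(entity -> ENTITY_NAMES.contains(entity.lcode))
                .collect(Collectors.groupingBy(entity -> entity.lcode));
    }

    public static void main(String[] args) {
        String[] lines = {
            "452857;0;L100;csO;20220411",
            "452857;0;L120;csO;20220411",
            "452857;0;L100;csO;20220411",
            "452857;0;L999;csO;20220411",
            "broken"
        };
        group(lines).forEach((code, entities) ->
                System.out.println(code + " -> " + entities.size() + " line(s)"));
    }
}
```

The print order is not guaranteed, because groupingBy makes no promise about the type of map it returns or its iteration order.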
map(line -> parse(line)) parses the line into an LBasedEntity object (or whatever you call it);
filter(Objects::nonNull) filters out all null values produced by the parse method;
the second filter selects all entities of which the lcode property is contained in the L100_ENTITY_NAMES list (I would turn this into a Set, to speed things up);
the resulting Map holds key-value pairs of L100_ENTITY_NAME → List<LBasedEntity>.
Upvotes: 1
Reputation: 338
You're asking for a shorter way to achieve the same result, but your code is actually fine. I guess the only part that makes it look lengthy is the if/else check in the stream:
if (values.length > 0) {
    return entity.equals(values[2]);
} else {
    return false;
}
I would suggest introducing two tiny private methods to improve readability, like this:
List<List<String>> list = L100_ENTITY_NAMES.stream()
    .map(entity -> getLinesByEntity(splittedLine, entity))
    .collect(Collectors.toList());

private List<String> getLinesByEntity(String[] splittedLine, String entity) {
    return Arrays.stream(splittedLine)
        .filter(line -> isLineMatched(entity, line))
        .collect(Collectors.toList());
}

private boolean isLineMatched(String entity, String line) {
    String[] values = line.split(String.valueOf(DELIMITER));
    // values[2] is only safe to read when the line has at least three fields
    return values.length > 2 && entity.equals(values[2]);
}
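To try the two helpers on a few sample lines, a small harness along these lines should do. It is only a sketch: it assumes DELIMITER is the char ';' (the question does not show its declaration), makes the helpers static so they can be called from main, and guards the index with a length check of at least three fields. The class name LinesByEntity is made up.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class LinesByEntity {

    // Assumption: the question's DELIMITER is the char ';'.
    static final char DELIMITER = ';';

    // True when the line has at least three fields and the third one equals the entity.
    static boolean isLineMatched(String entity, String line) {
        String[] values = line.split(String.valueOf(DELIMITER));
        return values.length > 2 && entity.equals(values[2]);
    }

    // Collects all lines whose third field equals the given entity.
    static List<String> getLinesByEntity(String[] lines, String entity) {
        return Arrays.stream(lines)
                .filter(line -> isLineMatched(entity, line))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        String[] lines = {
            "452857;0;L100;csO",
            "452857;0;L120;csO",
            "452857;0;L100;csO",
            "too;short"
        };
        List<String> names = Arrays.asList("L100", "L120");
        // One inner list per entity name, in the order of the names list.
        List<List<String>> grouped = names.stream()
                .map(entity -> getLinesByEntity(lines, entity))
                .collect(Collectors.toList());
        System.out.println(grouped);
    }
}
```

Note that a line such as "too;short" is simply skipped here; with a guard of values.length > 0 it would throw an ArrayIndexOutOfBoundsException instead.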
Upvotes: 0
Reputation: 129
You're effectively asking for what languages like Scala provide on collections: groupBy. In Scala you could write:
splitLines.groupBy(_(2)) // Map[String, List[Array[String]]], if splitLines is a List[Array[String]]
Of course, you want this in Java, and in my opinion, not using streams here makes sense, because Java collections have no direct fold or groupBy method (the closest stream equivalent is Collectors.groupingBy).
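// Requires java.util.ArrayList, java.util.Arrays and java.util.HashMap.
HashMap<String, ArrayList<String>> map = new HashMap<>();
for (String[] line : splitLines) {
    // line[2] is read below, so at least three fields are required
    if (line.length < 3) continue;
    ArrayList<String> xs = map.getOrDefault(line[2], new ArrayList<>());
    // appends every field of this line to the list stored under the key
    xs.addAll(Arrays.asList(line));
    map.put(line[2], xs);
}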
As you can see, it's very easy to understand, and actually shorter than the stream-based solution.
I'm leveraging two key methods on a HashMap.
The first is getOrDefault; basically, if the value associated with our key doesn't exist, we can provide a default. In our case, an empty ArrayList.
The second is put, which actually acts like a putOrReplace because it lets us override the previous value associated with the key.
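Since Java 8, Map.computeIfAbsent can stand in for the getOrDefault/put pair: it creates the list the first time a key is seen and returns the existing one afterwards. Here is a minimal sketch of that variant. Unlike the loop above, it assumes the input is an array of raw, unsplit lines and collects whole lines per key; the names GroupWithLoop and groupByThirdField are made up.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class GroupWithLoop {

    // Groups whole lines by the field at index 2; shorter lines are skipped.
    static Map<String, List<String>> groupByThirdField(String[] lines) {
        Map<String, List<String>> map = new HashMap<>();
        for (String line : lines) {
            String[] fields = line.split(";");
            if (fields.length < 3) {
                continue;
            }
            // Creates the list on first use of the key, then appends the line to it.
            map.computeIfAbsent(fields[2], key -> new ArrayList<>()).add(line);
        }
        return map;
    }

    public static void main(String[] args) {
        String[] lines = {
            "452857;0;L100;csO",
            "452857;0;L120;csO",
            "452857;0;L100;csO",
            "short"
        };
        Map<String, List<String>> grouped = groupByThirdField(lines);
        System.out.println(grouped.get("L100").size()); // prints 2
        System.out.println(grouped.get("L120").size()); // prints 1
    }
}
```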
I hope that was helpful. :)
Upvotes: 0
Reputation: 88757
I'd rather change the order and also collect the data into a Map<String, List<String>> where the key would be the entity name.
Assuming splittedLine is the array of lines, I'd probably do something like this:
Set<String> L100_ENTITY_NAMES = Set.of("L100", ...);
String delimiter = String.valueOf(DELIMITER);
Map<String, List<String>> result =
    Arrays.stream(splittedLine)
        .map(line -> {
            String[] values = line.split(delimiter);
            if (values.length < 3) {
                return null;
            }
            // key = entity name (index 2), value = the original line
            return new AbstractMap.SimpleEntry<>(values[2], line);
        })
        .filter(Objects::nonNull)
        .filter(entry -> L100_ENTITY_NAMES.contains(entry.getKey()))
        .collect(Collectors.groupingBy(Map.Entry::getKey,
            Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
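Set.of requires Java 9, so on Java 8 the set has to be built differently, for example from Arrays.asList. Here is a self-contained sketch of the same idea that should compile on Java 8. It assumes DELIMITER is the char ';' and moves the split into a small helper to keep the pipeline flat; GroupRawLines, ENTITY_NAMES, toEntry and group are illustrative names only.

```java
import java.util.AbstractMap;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.stream.Collectors;

class GroupRawLines {

    // Assumption: the question's DELIMITER is the char ';'.
    static final char DELIMITER = ';';

    // Java 8 friendly replacement for Set.of(...).
    static final Set<String> ENTITY_NAMES =
            new HashSet<>(Arrays.asList("L100", "L120", "L121", "L122", "L126"));

    // Pairs the entity name (index 2) with the original line; null if the line is too short.
    static Map.Entry<String, String> toEntry(String line) {
        String[] values = line.split(String.valueOf(DELIMITER));
        if (values.length < 3) {
            return null;
        }
        return new AbstractMap.SimpleEntry<>(values[2], line);
    }

    // Groups the raw lines by entity name, keeping only the known names.
    static Map<String, List<String>> group(String[] lines) {
        return Arrays.stream(lines)
                .map(GroupRawLines::toEntry)
                .filter(Objects::nonNull)
                .filter(entry -> ENTITY_NAMES.contains(entry.getKey()))
                .collect(Collectors.groupingBy(Map.Entry::getKey,
                        Collectors.mapping(Map.Entry::getValue, Collectors.toList())));
    }

    public static void main(String[] args) {
        String[] lines = {
            "452857;0;L100;csO",
            "452857;0;L121;csO",
            "452857;0;L100;csO",
            "452857;0;L999;csO"
        };
        Map<String, List<String>> result = group(lines);
        System.out.println(result.get("L100")); // the two L100 lines, in input order
        System.out.println(result.get("L121")); // the single L121 line
    }
}
```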
Note that this isn't necessarily shorter but has a couple of other advantages:
Upvotes: 1