Reputation: 423
I have a requirement where I would like to use the Java Stream API to process a stream of events from a system and apply a data cleanup step that removes repeated events. By that I mean removing the same event repeated multiple times in sequence, not creating a list of distinct events. Most of the Java Stream API examples available online target creating a distinct output from a given input.
For example, for the input stream
[a, b, c, a, a, a, a, d, d, d, c, c, e, e, e, e, e, e, f, f, f]
the output List or Stream should be
[a, b, c, a, d, c, e, f]
My current implementation (not using the Stream API) looks like

import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.LinkedList;
import java.util.List;

public class Test {
    public static void main(String[] args) {
        String fileName = "src/main/resources/test.log";
        try {
            List<String> list = Files.readAllLines(Paths.get(fileName));
            LinkedList<String> acc = new LinkedList<>();
            for (String line : list) {
                // add the line only if it differs from the last element accumulated so far
                if (acc.isEmpty())
                    acc.add(line);
                else if (!line.equals(acc.getLast()))
                    acc.add(line);
            }
            System.out.println(list);
            System.out.println(acc);
        } catch (IOException ioe) {
            ioe.printStackTrace();
        }
    }
}
Output,
[a, b, c, a, a, a, a, d, d, d, c, c, e, e, e, e, e, e, f, f, f]
[a, b, c, a, d, c, e, f]
I've tried various examples with reduce, groupingBy, etc., without success. I can't seem to find a way to compare a stream element with the last element in my accumulator, if there is such a possibility.
Upvotes: 11
Views: 4408
Reputation: 629
Another concise syntax would be

AtomicReference<Character> previous = new AtomicReference<>(null);
List<Character> result = Stream.of('a', 'b', 'b', 'a')
        .filter(cur -> !cur.equals(previous.getAndSet(cur)))
        .collect(Collectors.toList()); // [a, b, a]
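For example (my addition, assuming the usual java.util.stream and java.util.concurrent.atomic imports, and a sequentially consumed stream since the lambda is stateful), applying the same filter to the question's sample input gives the expected result:

AtomicReference<String> prev = new AtomicReference<>(null);
List<String> out = Stream.of("a", "b", "c", "a", "a", "a", "a", "d", "d", "d",
                "c", "c", "e", "e", "e", "e", "e", "e", "f", "f", "f")
        .filter(cur -> !cur.equals(prev.getAndSet(cur)))
        .collect(Collectors.toList());
System.out.println(out); // [a, b, c, a, d, c, e, f]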
Upvotes: 3
Reputation: 1261
With Java 7, you can do this using an Iterator.

Iterator<Integer> iterator = list.iterator();
Integer previousValue = null;
while (iterator.hasNext()) {
    Integer currentValue = iterator.next();
    // drop the element if it repeats the previous one
    if (currentValue.equals(previousValue)) {
        iterator.remove();
    }
    previousValue = currentValue;
}
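For instance (my addition, with a made-up sample list), the list has to support remove(), so it should be something like an ArrayList rather than the fixed-size list returned by Arrays.asList:

List<Integer> list = new ArrayList<>(Arrays.asList(1, 2, 2, 3, 3, 3, 1));
// ... run the loop above on it ...
System.out.println(list); // prints [1, 2, 3, 1]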
Upvotes: 0
Reputation: 5606
EDIT: as commented by @Bolzano, this approach does not meet the requirement.
If t is the input stream, then

Map<String,Boolean> s = new HashMap<>();
Stream<String> u = t.filter(e -> s.put(e, Boolean.TRUE) == null);

will produce a Stream of unique elements without creating a List. Then a plain

List<String> m = u.collect(Collectors.toList());

can create a List of unique elements. I do not understand why solutions as lengthy as those @CKing and @Anton propose would be required. Am I missing something?
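To illustrate the difference (my addition, using a shortened version of the question's sample data), this filter drops every later repeat, not just consecutive ones:

List<String> t2 = Arrays.asList("a", "b", "c", "a", "a", "d", "d", "c");
Map<String, Boolean> s2 = new HashMap<>();
List<String> m2 = t2.stream()
        .filter(e -> s2.put(e, Boolean.TRUE) == null)
        .collect(Collectors.toList());
System.out.println(m2); // [a, b, c, d] -- not [a, b, c, a, d, c] as the question requires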
Upvotes: 0
Reputation: 15212
You can use IntStream to get hold of the index positions in the List and use this to your advantage as follows:
List<String> acc = IntStream
        .range(0, list.size())
        .filter(i -> ((i < list.size() - 1 && !list.get(i).equals(list.get(i + 1)))
                || i == list.size() - 1))
        .mapToObj(i -> list.get(i))
        .collect(Collectors.toList());
System.out.println(acc);
Explanation

IntStream.range(0, list.size()) : Returns a sequence of primitive int-valued elements which will be used as the index positions to access the list.

.filter(i -> ((i < list.size() - 1 && !list.get(i).equals(list.get(i + 1))) || i == list.size() - 1)) : Proceed only if the element at the current index position is not equal to the element at the next index position, or if the last index position is reached.

.mapToObj(i -> list.get(i)) : Convert the stream to a Stream<String>.

.collect(Collectors.toList()) : Collect the results in a List.
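As a quick check (my addition, with list holding the question's sample data as a random-access List), the filter keeps the last element of each run of duplicates, which still yields the expected output:

List<String> list = Arrays.asList("a", "b", "c", "a", "a", "a", "a", "d", "d", "d",
        "c", "c", "e", "e", "e", "e", "e", "e", "f", "f", "f");
// with this input the snippet above prints [a, b, c, a, d, c, e, f]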
Upvotes: 9
Reputation: 542
Please try this solution:

import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class TestDuplicatePreviousEvent {
    public static void main(String[] args) {
        List<Integer> inputData = new ArrayList<>();
        List<Integer> outputData = new ArrayList<>();
        inputData.add(1);
        inputData.add(2);
        inputData.add(2);
        inputData.add(3);
        inputData.add(3);
        inputData.add(3);
        inputData.add(4);
        inputData.add(4);
        inputData.add(4);
        inputData.add(4);
        inputData.add(1);
        // index each element, then keep it only if it differs from the element at the previous index
        AtomicInteger index = new AtomicInteger();
        Map<Integer, Integer> valueByIndex = inputData.stream()
                .collect(Collectors.toMap(i -> index.incrementAndGet(), i -> i));
        outputData = valueByIndex.entrySet().stream()
                .filter(i -> !i.getValue().equals(valueByIndex.get(i.getKey() - 1)))
                .map(x -> x.getValue())
                .collect(Collectors.toList());
        System.out.println(outputData);
    }
}
Output : [1, 2, 3, 4, 1]
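One caveat worth noting (my addition, not part of the original answer): Collectors.toMap gives no guarantee about the iteration order of the resulting map, so the entrySet() stream above only happens to come back in index order. A variant that pins the order down could use the four-argument toMap overload with a LinkedHashMap:

AtomicInteger index = new AtomicInteger();
Map<Integer, Integer> valueByIndex = inputData.stream()
        .collect(Collectors.toMap(i -> index.incrementAndGet(), i -> i,
                (a, b) -> b, LinkedHashMap::new)); // preserves insertion (index) order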
Solution without map :
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.stream.Collectors;

public class TestDuplicatePreviousEvent {
    public static void main(String[] args) {
        List<Integer> inputData = new ArrayList<>();
        List<Integer> outputData = new ArrayList<>();
        inputData.add(1);
        inputData.add(2);
        inputData.add(2);
        inputData.add(3);
        inputData.add(3);
        inputData.add(3);
        inputData.add(4);
        inputData.add(4);
        inputData.add(4);
        inputData.add(4);
        inputData.add(1);
        inputData.add(1);
        inputData.add(1);
        inputData.add(4);
        inputData.add(4);
        AtomicInteger index = new AtomicInteger();
        outputData = inputData.stream()
                .filter(i -> filterInputEvents(i, index, inputData))
                .collect(Collectors.toList());
        System.out.println(outputData);
    }

    // keep an element only if it differs from the element at the previous position in inputData;
    // this relies on the stream being sequential so that index tracks the encounter order
    private static boolean filterInputEvents(Integer i, AtomicInteger index, List<Integer> inputData) {
        if (index.get() == 0) {
            // first element: always keep it
            index.incrementAndGet();
            return true;
        }
        return !(i.equals(inputData.get(index.getAndIncrement() - 1)));
    }
}
Upvotes: -1
Reputation: 11739
You might use a custom Collector to achieve your goal. Please find details below:
Stream<String> lines = Files.lines(Paths.get("distinct.txt"));
LinkedList<String> values = lines.collect(Collector.of(
        LinkedList::new,
        (list, string) -> {
            // accumulate only if the value differs from the last element collected so far
            if (list.isEmpty())
                list.add(string);
            else if (!string.equals(list.getLast()))
                list.add(string);
        },
        (left, right) -> {
            left.addAll(right);
            return left;
        }
));
values.forEach(System.out::println);
However, it might have some issues when a parallel stream is used.
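For instance (a sketch I've added, not part of the original answer), the combiner above simply concatenates the two partial lists, so a duplicate can survive at the chunk boundary when the stream runs in parallel. One way to handle that boundary could be:

(left, right) -> {
    // if the right chunk starts with the value the left chunk ends with, drop that duplicate
    if (!left.isEmpty() && !right.isEmpty() && left.getLast().equals(right.getFirst()))
        right.removeFirst();
    left.addAll(right);
    return left;
}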
Upvotes: 4