Reputation: 273802
How do I do on-the-fly search & replace in a Java Stream (input or output)?
I don't want to load the stream into memory or to a file.
I just see the bytes passing by and I need to do some replacements. The sequences being replaced are short (up to 20 bytes).
Upvotes: 3
Views: 10090
Reputation: 399
I got some good ideas from the link provided and ended up writing a small class to handle replacement of $VAR$ variables in a stream. For posterity:
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.Map;

public class ReplacingOutputStream extends OutputStream {
    private static final int DOLLAR_SIGN = '$';
    private static final int BACKSLASH = '\\';

    private final OutputStream delegate;
    private final Map<String, Object> replacementValues;
    private int previous = Integer.MIN_VALUE;
    private boolean replacing = false;
    // Raw bytes of the variable name collected since the opening '$'.
    private final ByteArrayOutputStream replacement = new ByteArrayOutputStream();

    public ReplacingOutputStream(OutputStream delegate, Map<String, Object> replacementValues) {
        this.delegate = delegate;
        this.replacementValues = replacementValues;
    }

    @Override
    public void write(int b) throws IOException {
        b &= 0xFF; // only the low eight bits of the argument are the byte
        if (b == DOLLAR_SIGN && previous != BACKSLASH) {
            // An unescaped '$' either opens or closes a $VAR$ reference.
            if (replacing) {
                doReplacement();
            }
            replacing = !replacing;
        } else if (replacing) {
            replacement.write(b);
        } else {
            // Note: the backslash of an escaped "\$" is still written through.
            delegate.write(b);
        }
        previous = b;
    }

    @Override
    public void flush() throws IOException {
        delegate.flush();
    }

    @Override
    public void close() throws IOException {
        delegate.close();
    }

    private void doReplacement() throws IOException {
        // Assumption: names and values are UTF-8. Writing code points one by one
        // through write(int) would truncate anything outside a single byte.
        String oldValue = new String(replacement.toByteArray(), StandardCharsets.UTF_8);
        replacement.reset();
        Object newValue = replacementValues.get(oldValue);
        if (newValue == null) {
            throw new RuntimeException("Could not find replacement variable for value '" + oldValue + "'.");
        }
        delegate.write(newValue.toString().getBytes(StandardCharsets.UTF_8));
    }
}
Upvotes: 0
Reputation: 3744
You can use the class provided here if static replacement rules are enough for you.
Upvotes: 4
Reputation: 60236
You could implement a deterministic finite automaton that looks at each byte exactly once (i.e. no lookbehind is required). You would stream the input through a buffer that holds at most as many bytes as your pattern is long, emitting the replacement on a match and flushing the overflowing (non-matched) bytes as you advance through the pattern. Once the pattern has been preprocessed, the runtime is linear in the length of the input.
Wikipedia has some information on pattern matching and how that works in theory.
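To make that a bit more concrete, here is a minimal sketch of how such an automaton could look for a single fixed byte pattern. It uses the Knuth-Morris-Pratt failure table as the automaton, which is one possible choice and not necessarily what the answer had in mind. The class name KmpReplacingOutputStream and its constructor parameters are made up for illustration, and matches are assumed to be non-overlapping.

```java
import java.io.FilterOutputStream;
import java.io.IOException;
import java.io.OutputStream;

// Hypothetical name; replaces every occurrence of one fixed byte pattern.
class KmpReplacingOutputStream extends FilterOutputStream {
    private final byte[] pattern;
    private final byte[] replacement;
    private final int[] failure; // KMP failure table for the pattern
    private int matched = 0;     // pattern bytes currently held back

    KmpReplacingOutputStream(OutputStream out, byte[] pattern, byte[] replacement) {
        super(out);
        if (pattern.length == 0) {
            throw new IllegalArgumentException("pattern must not be empty");
        }
        this.pattern = pattern.clone();
        this.replacement = replacement.clone();
        this.failure = buildFailureTable(this.pattern);
    }

    // failure[i] = length of the longest proper prefix of pattern[0..i]
    // that is also a suffix of it.
    private static int[] buildFailureTable(byte[] p) {
        int[] f = new int[p.length];
        int k = 0;
        for (int i = 1; i < p.length; i++) {
            while (k > 0 && p[i] != p[k]) {
                k = f[k - 1];
            }
            if (p[i] == p[k]) {
                k++;
            }
            f[i] = k;
        }
        return f;
    }

    @Override
    public void write(int b) throws IOException {
        byte value = (byte) b;
        // On a mismatch, fall back to the longest shorter prefix that still
        // fits. The bytes that drop out of the held-back window cannot be
        // part of a match any more, so they are passed through unchanged.
        while (matched > 0 && pattern[matched] != value) {
            int next = failure[matched - 1];
            out.write(pattern, 0, matched - next);
            matched = next;
        }
        if (pattern[matched] == value) {
            matched++;
            if (matched == pattern.length) {
                out.write(replacement); // full match: emit the replacement
                matched = 0;
            }
        } else {
            out.write(b); // byte does not start a match
        }
    }

    @Override
    public void close() throws IOException {
        // A partial match at the end of the stream is ordinary data.
        if (matched > 0) {
            out.write(pattern, 0, matched);
            matched = 0;
        }
        super.close();
    }
}
```

The held-back bytes never need a separate buffer, because they are always a prefix of the pattern itself; memory use is bounded by the pattern length. Note that flush() deliberately leaves a partial match pending, since later bytes might still complete it, so the tail is only emitted on close(). For example, with the pattern "abc" and the replacement "X", writing "ababcab" and closing the stream produces "abXab".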
Upvotes: 1