Reputation: 993
I need to do a lot of different preprocessing on some text data. The preprocessing consists of several simple regex functions, all written in a class Filters, that each take a String and return the formatted String. Up until now, in each class that needed some preprocessing, I created a new function containing a bunch of calls to Filters. They look something like this:
private static String filter(String text) {
    text = Filters.removeURL(text);
    text = Filters.removeEmoticons(text);
    text = Filters.removeRepeatedWhitespace(text);
    ....
    return text;
}
Since this is very repetitive (about 90% of the calls are the same, but 2-3 differ for each class), I wonder if there is a better way of doing this. In Python you can, for example, put functions in a list and iterate over it, calling each function. As far as I know this is not directly possible in Java, so what is the best way of doing this in Java?
I was thinking of maybe defining an enum with a value for each function and then calling a main function in Filters with an array of the enum values for the functions I want to run, something like this:
enum Filter {
    REMOVE_URL, REMOVE_EMOTICONS, REMOVE_REPEATED_WHITESPACE
}
public static String filter(String text, Filter... filters) {
    for (Filter filter : filters) {
        switch (filter) {
            case REMOVE_URL:
                text = removeURL(text);
                break;
            case REMOVE_EMOTICONS:
                text = removeEmoticons(text);
                break;
            case REMOVE_REPEATED_WHITESPACE:
                text = removeRepeatedWhitespace(text);
                break;
        }
    }
    return text;
}
And then, instead of defining functions like the one shown at the top, I could simply call:
filter("some text", Filter.REMOVE_URL, Filter.REMOVE_EMOTICONS, Filter.REMOVE_REPEATED_WHITESPACE);
Are there any better ways to go about this?
Upvotes: 2
Views: 1474
Reputation: 82949
Another way would be to add a method to your enum Filter and implement that method for each of the enum constants. This also works with versions of Java before Java 8. It is closest to your current code, and has the effect that you have a fixed set of possible filters.
enum Filter {
    TRIM {
        public String apply(String s) {
            return s.trim();
        }
    },
    UPPERCASE {
        public String apply(String s) {
            return s.toUpperCase();
        }
    };
    public abstract String apply(String s);
}
public static String applyAll(String s, Filter... filters) {
    for (Filter f : filters) {
        s = f.apply(s);
    }
    return s;
}
public static void main(String[] args) {
    String s = " Hello World ";
    System.out.println(applyAll(s, Filter.TRIM, Filter.UPPERCASE));
}
However, if you are using Java 8 you can make your code much more flexible by just using a list of Function<String, String> instead. If you don't like writing Function<String, String> all the time, you could also define your own interface extending it:
interface Filter extends Function<String, String> {}
You can then define those functions in different ways: with method references, single- and multi-line lambda expressions, or anonymous classes, or you can construct them from other functions:
Filter TRIM = String::trim; // method reference
Filter UPPERCASE = s -> s.toUpperCase(); // one-line lambda
Filter DO_STUFF = (String s) -> { // multi-line lambda
    // do more complex stuff
    return s + s;
};
Filter MORE_STUFF = new Filter() { // anonymous inner class
    // in case you need internal state
    public String apply(String s) {
        // even more complex calculations
        return s.replace("foo", "bar");
    }
};
Function<String, String> TRIM_UPPER = TRIM.andThen(UPPERCASE); // chain functions
You can then pass those to the applyAll function just as you would the enum constants, and apply them one after the other in a loop.
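For illustration, here is a minimal, self-contained sketch of that loop using a plain list of Function<String, String>. The class name FunctionListDemo and the List-based signature of applyAll are assumptions made for this example, not part of the code above.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical class name, used only for this sketch.
class FunctionListDemo {

    // Applies each function in order, feeding the result of one into the next.
    static String applyAll(String s, List<Function<String, String>> filters) {
        for (Function<String, String> f : filters) {
            s = f.apply(s);
        }
        return s;
    }

    public static void main(String[] args) {
        Function<String, String> trim = String::trim;          // method reference
        Function<String, String> upper = s -> s.toUpperCase(); // lambda
        List<Function<String, String>> filters = Arrays.asList(trim, upper);
        System.out.println(applyAll(" Hello World ", filters)); // prints "HELLO WORLD"
    }
}
```

Because the list is an ordinary value, each class can keep its own list and still share the single applyAll loop.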
Upvotes: 2
Reputation: 33010
Given that you already implemented your Filters utility class, you can easily define a list of filter functions:
List<Function<String,String>> filterList = new ArrayList<>();
filterList.add(Filters::removeURL);
filterList.add(Filters::removeRepeatedWhitespace);
...
and then evaluate:
String text = ...
for (Function<String,String> f : filterList)
    text = f.apply(text);
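As a possible alternative to the explicit loop, the list can be folded into a single function with Function.andThen. This is only a sketch: the class name ComposeDemo, the helper compose, and the two stand-in filters are assumptions for this example, since the bodies of the Filters methods are not shown in the question.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// Hypothetical class name, used only for this sketch.
class ComposeDemo {

    // Folds the list into one function that runs the filters left to right.
    // An empty list yields the identity function.
    static Function<String, String> compose(List<Function<String, String>> filters) {
        return filters.stream().reduce(Function.identity(), Function::andThen);
    }

    public static void main(String[] args) {
        // Stand-in filters (assumptions, not the real Filters methods).
        Function<String, String> collapseWhitespace = s -> s.replaceAll("\\s+", " ");
        Function<String, String> trim = String::trim;

        Function<String, String> pipeline = compose(Arrays.asList(collapseWhitespace, trim));
        System.out.println(pipeline.apply("  a   b  ")); // prints "a b"
    }
}
```

The composed function can be stored in a field and reused, which may be convenient when the same chain is applied to many strings.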
A variation of this, even easier to handle:
Define
@SafeVarargs // suppresses the unchecked generic-array warning for the varargs parameter
public static String filter(String text, Function<String,String>... filters)
{
    for (Function<String,String> f : filters)
        text = f.apply(text);
    return text;
}
and then use
String text = ...
text = filter(text, Filters::removeURL, Filters::removeRepeatedWhitespace);
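Since the question mentions that about 90% of the filters are shared and only 2-3 differ per class, one way to avoid repeating the shared ones might be to keep them in a common list and append the class-specific ones. The sketch below assumes hypothetical names (SharedFiltersDemo, COMMON, withExtras) and uses stand-in filters in place of the real Filters methods.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.function.Function;

// Hypothetical class name, used only for this sketch.
class SharedFiltersDemo {

    // Stand-ins for the filters that every class shares (assumed bodies).
    static final Function<String, String> COLLAPSE_WHITESPACE = s -> s.replaceAll("\\s+", " ");
    static final Function<String, String> TRIM = String::trim;

    static final List<Function<String, String>> COMMON =
            Collections.unmodifiableList(Arrays.asList(COLLAPSE_WHITESPACE, TRIM));

    // Returns the common filters followed by the class-specific extras.
    @SafeVarargs
    static List<Function<String, String>> withExtras(Function<String, String>... extras) {
        List<Function<String, String>> all = new ArrayList<>(COMMON);
        all.addAll(Arrays.asList(extras));
        return all;
    }

    // Applies the filters in list order.
    static String filter(String text, List<Function<String, String>> filters) {
        for (Function<String, String> f : filters) {
            text = f.apply(text);
        }
        return text;
    }

    public static void main(String[] args) {
        List<Function<String, String>> mine = withExtras(s -> s.toLowerCase());
        System.out.println(filter("  Hello   World ", mine)); // prints "hello world"
    }
}
```

In this sketch the extras always run after the common filters; if the order matters differently for a given class, that class would need to build its list by hand.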
Upvotes: 3
Reputation: 10403
You could do this in Java 8 pretty easily as @tobias_k said, but even without that you could do something like this:
public class FunctionExample {
    public interface FilterFunction {
        String apply(String text);
    }
    public static class RemoveSpaces implements FilterFunction {
        public String apply(String text) {
            return text.replaceAll("\\s+", "");
        }
    }
    public static class LowerCase implements FilterFunction {
        public String apply(String text) {
            return text.toLowerCase();
        }
    }
    static String filter(String text, FilterFunction... filters) {
        for (FilterFunction fn : filters) {
            text = fn.apply(text);
        }
        return text;
    }
    static FilterFunction LOWERCASE_FILTER = new LowerCase();
    static FilterFunction REMOVE_SPACES_FILTER = new RemoveSpaces();
    public static void main(String[] args) {
        String s = "Some Text";
        System.out.println(filter(s, LOWERCASE_FILTER, REMOVE_SPACES_FILTER));
    }
}
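If a named class per filter feels heavy, the same pre-Java 8 approach should also work with anonymous classes, including one-off filters defined right at the call site. The following is a sketch under that assumption; the class name AnonymousFilterDemo and the filters shown are hypothetical.

```java
// Hypothetical class name, used only for this sketch. No lambdas are used,
// so this style does not depend on Java 8.
class AnonymousFilterDemo {

    interface FilterFunction {
        String apply(String text);
    }

    // A reusable filter defined as an anonymous class instead of a named one.
    static final FilterFunction TRIM = new FilterFunction() {
        public String apply(String text) {
            return text.trim();
        }
    };

    // Applies the filters in the order they are passed.
    static String filter(String text, FilterFunction... filters) {
        for (FilterFunction fn : filters) {
            text = fn.apply(text);
        }
        return text;
    }

    public static void main(String[] args) {
        // One-off filter created inline for a single call.
        String result = filter("  Some Text  ", TRIM, new FilterFunction() {
            public String apply(String text) {
                return text.replace(' ', '_');
            }
        });
        System.out.println(result); // prints "Some_Text"
    }
}
```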
Upvotes: 3