Warosaurus

Reputation: 530

Java 8 Streams modify collection values

I'm using the Stream API. Once the relevant lines have been filtered, I'd like to edit the data being collected. Here is the code so far:

  String wordUp = word.substring(0,1).toUpperCase() + word.substring(1);
  String wordDown = word.toLowerCase();

  ArrayList<String> text = Files.lines(path)
        .parallel() // Perform filtering in parallel
        .filter(s -> s.contains(wordUp) || s.contains(wordDown) &&  Arrays.asList(s.split(" ")).contains(word))
        .sequential()
        .collect(Collectors.toCollection(ArrayList::new));

Edit: The code below is awful and I am trying to avoid it. (It also does not entirely work. It was done at 4am, please excuse it.)

    for (int i = 0; i < text.size(); i++) {
        String set = "";
        List temp = Arrays.asList(text.get(i).split(" "));
        int wordPos = temp.indexOf(word);

        List<String> com1 = (wordPos >= limit) ? temp.subList(wordPos - limit, wordPos) : new ArrayList<String>();
        List<String> com2 = (wordPos + limit < text.get(i).length() -1) ? temp.subList(wordPos + 1, wordPos + limit) : new ArrayList<String>();
        for (String s: com1)
            set += s + " ";
        for (String s: com2)
            set += s + " ";
        text.set(i, set);
    }

The code looks for a particular word in a text file. Once a line has passed the filter, I'd like to collect only a portion of it: a number of words on either side of the keyword being searched for.

E.g.:

keyword = "the", limit = 1

It would find: "Early in the morning a cow jumped over a fence."

It should then return: "in the morning"

P.S. Any suggested speed improvements will be up-voted.

Upvotes: 0

Views: 8502

Answers (1)

Holger

Reputation: 298123

There are two different tasks you should think about. First, convert a file into a list of words:

    List<String> words = Files.lines(path)
        .flatMap(Pattern.compile(" ")::splitAsStream)
        .collect(Collectors.toList());

This uses your initial idea of splitting at space characters. That might be sufficient for simple tasks. However, you should study the documentation of BreakIterator to understand the difference between this simple approach and real, sophisticated word boundary splitting.
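To illustrate the difference, here is a minimal sketch of word splitting with `java.text.BreakIterator`. The class name `WordSplitSketch`, the method `words`, and the rule of keeping only segments that contain a letter or digit are assumptions made for this example, not part of the answer above. Unlike splitting at spaces, this approach does not leave punctuation attached to a word such as "morning,".

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class, for illustration only.
class WordSplitSketch {

    // Walks the word boundaries reported by BreakIterator and keeps only the
    // segments that contain at least one letter or digit, which drops the
    // whitespace and punctuation segments between words.
    static List<String> words(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<String> result = new ArrayList<>();
        int start = it.first();
        for (int end = it.next(); end != BreakIterator.DONE; start = end, end = it.next()) {
            String segment = text.substring(start, end);
            if (segment.codePoints().anyMatch(Character::isLetterOrDigit)) {
                result.add(segment);
            }
        }
        return result;
    }
}
```

If this suits your data, it could replace the space-based split by flat-mapping each line through `WordSplitSketch.words(line, Locale.ENGLISH).stream()`.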

Second, once you have a list of words, your task is to find the matches of your word and convert the sequence of items around each match into a single String by joining the words, using a single space character as delimiter:

    List<String> matches = IntStream.range(0, words.size())
        // find matches (note: String.matches treats word as a regular expression)
        .filter(ix -> words.get(ix).matches(word))
        // create subLists around the matches (one word on either side, i.e. limit = 1)
        .mapToObj(ix -> words.subList(Math.max(0, ix - 1), Math.min(ix + 2, words.size())))
        // reconvert lists into phrases (join with a single space)
        .map(list -> String.join(" ", list))
        // collect into a list of matches; here, you can use a different
        // terminal operation, like forEach(System.out::println), as well
        .collect(Collectors.toList());
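The same pipeline can be generalized to an arbitrary number of surrounding words. The following is a minimal sketch, not the definitive form: the class `ContextSketch`, the method `contexts`, and its `limit` parameter are names introduced for this example. It also swaps `String.matches` for `equalsIgnoreCase`, on the assumption that a literal, case-insensitive comparison is what the question's `wordUp`/`wordDown` check was after.

```java
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

// Hypothetical helper class, for illustration only.
class ContextSketch {

    // Returns, for every occurrence of word, the phrase made of up to `limit`
    // words before it, the word itself, and up to `limit` words after it.
    static List<String> contexts(List<String> words, String word, int limit) {
        return IntStream.range(0, words.size())
            // literal, case-insensitive comparison (assumption, see above)
            .filter(ix -> words.get(ix).equalsIgnoreCase(word))
            // clamp the window to the bounds of the list
            .mapToObj(ix -> words.subList(Math.max(0, ix - limit),
                                          Math.min(ix + limit + 1, words.size())))
            .map(list -> String.join(" ", list))
            .collect(Collectors.toList());
    }
}
```

With the question's sample sentence split at spaces, `ContextSketch.contexts(words, "the", 1)` yields the single phrase "in the morning".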

Upvotes: 7
