Dor Birendorf

Reputation: 107

How to split and parse a text file in Java

For a school project I need to read messages from a text file. I created a Message class:

public class Message {

    private String from;

    private String to;

    private String body;

    public Message(String from, String to, String body) {
        this.from = from;
        this.to = to;
        this.body = body;
    }
}

The text file looks like this:

From: sender
To: Address
blah blah blah
blah blah(can be more then one line)
#(represent end of message)
From: sender2
To: Address2
blah blah blah
blah blah(can be more then one line)
#

I need to create an ArrayList of messages from that text file, but I'm not sure how to split it. To clarify: the sender, the addressee, and the body are separated by newlines, and each message ends with a '#'.

Upvotes: 0

Views: 1465

Answers (2)

Oleg Cherednik

Reputation: 18245

You can modify your Message class:

class Message {

    private String from = "";
    private String to = "";
    private String body = "";

    public void setFrom(String from) {
        this.from = from;
    }

    public void setTo(String to) {
        this.to = to;
    }

    public void addBody(String body) {
        if (!this.body.isEmpty())
            this.body += '\n';
        this.body += body;
    }
}

Then read all the lines from your text file and build the Message instances line by line:

private static List<Message> getMessages(List<String> lines) {
    final String separator = "#";
    final String from = "From:";
    final String to = "To:";

    Message message = null;
    List<Message> messages = new ArrayList<>();

    for (String line : lines) {
        if (line.startsWith(separator))
            message = null;
        else {
            if (message == null)
                messages.add(message = new Message());

            if (line.startsWith(from))
                message.setFrom(line.substring(from.length()).trim());
            else if (line.startsWith(to))
                message.setTo(line.substring(to.length()).trim());
            else
                message.addBody(line);
        }
    }

    return messages;
}

P.S. To read the text file as a list of lines, you can use e.g. List<String> lines = Files.readAllLines(Paths.get("data.txt"));
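To try the approach above end to end, here is a minimal self-contained sketch. The wrapper class name MessageDemo and the getters are assumptions added for illustration (the snippets above don't include them); the parsing logic follows the loop shown above.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical wrapper class; the name and the getters are assumptions for illustration.
class MessageDemo {

    public static class Message {
        private String from = "";
        private String to = "";
        private String body = "";

        public void setFrom(String from) { this.from = from; }
        public void setTo(String to) { this.to = to; }

        // Appends one line to the body, separating lines with '\n'.
        public void addBody(String line) {
            if (!body.isEmpty())
                body += '\n';
            body += line;
        }

        public String getFrom() { return from; }
        public String getTo() { return to; }
        public String getBody() { return body; }
    }

    // Walks the lines once; a line starting with '#' closes the current message.
    public static List<Message> getMessages(List<String> lines) {
        List<Message> messages = new ArrayList<>();
        Message message = null;

        for (String line : lines) {
            if (line.startsWith("#")) {
                message = null;
            } else {
                if (message == null) {
                    message = new Message();
                    messages.add(message);
                }
                if (line.startsWith("From:"))
                    message.setFrom(line.substring("From:".length()).trim());
                else if (line.startsWith("To:"))
                    message.setTo(line.substring("To:".length()).trim());
                else
                    message.addBody(line);
            }
        }
        return messages;
    }
}
```

With a real file you would call MessageDemo.getMessages(Files.readAllLines(Paths.get("data.txt"))). One thing to keep in mind: a body line that itself starts with "From:", "To:" or "#" is treated as a header or separator, so this only works if the body never contains such lines.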

Upvotes: 1

Georg Muehlenberg

Reputation: 557

I wrote parse(), a parsing method for your Message class, and a simple test in main() that demonstrates how to split the text into separate messages. Note that this solution has a limitation: it keeps the whole text file in memory as a String. If the file were one or more GB in size, a stream-processing solution along the lines of this question would be needed instead.

import org.apache.commons.lang3.StringUtils;

import java.util.ArrayList;
import java.util.List;

public class Message {

    private String from;
    private String to;
    private String body;

    public Message(String from, String to, String body) {
        this.from = from;
        this.to = to;
        this.body = body;
    }

    public String toString() {
        return "From: " + from + "\n" +
                "To: " + to + "\n" +
                "Body: " + body;
    }

    // creates a message object from a string
    public static Message parse(String msg) {
        // From, To and at least one body line means at least two line breaks
        if (msg == null || StringUtils.countMatches(msg, "\n") < 2) {
            throw new IllegalArgumentException("Invalid string! Needing a string with at least 3 lines!");
        }
        // first, find from and to with two splits by new line
        String[] splits = msg.split("\n");
        // strip the non-informative "From: " prefix, should it be there
        String from = splits[0].replace("From: ", "");
        // strip the non-informative "To: " prefix, should it be there
        String to = splits[1].replace("To: ", "");
        // the rest is body: everything after the second line break
        // (searching for the text of 'to' could also match inside the From line)
        int secondLineBreak = msg.indexOf('\n', msg.indexOf('\n') + 1);
        String body = msg.substring(secondLineBreak + 1);
        // remove leading and trailing whitespaces
        body = StringUtils.trim(body);
        return new Message(from, to, body);
    }

    public static void main(String[] args) {
        List<Message> allMessages = new ArrayList<>();
        String text = "From: sender\n" +
                "To: Address\n" +
                "blah blah blah\n" +
                "blah blah(can be more then one line)\n" +
                "#\n" +
                "From: sender2\n" +
                "To: Address2\n" +
                "blah blah blah\n" +
                "blah blah(can be more then one line)";
        // split the text by what separates messages from each other
        String[] split = text.split("#\n");
        for (String msg : split) {
            allMessages.add(Message.parse(msg));
        }
        // print each message to System.out as a simple means of demonstrating the code
        allMessages.forEach(System.out::println);
    }
}
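As for the memory limitation mentioned at the top of this answer, one possible way around it is to read the input line by line and hand over each message as soon as its '#' line is reached. The following is only a rough sketch under that assumption, not a tested production solution; the names StreamingMessageReader, Msg and readMessages are made up for illustration, and it uses only the JDK (no Commons Lang).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical helper; class and method names are assumptions for illustration.
class StreamingMessageReader {

    // Minimal immutable message with the same three fields as in the question.
    public static final class Msg {
        public final String from;
        public final String to;
        public final String body;

        public Msg(String from, String to, String body) {
            this.from = from;
            this.to = to;
            this.body = body;
        }
    }

    // Reads the input line by line and passes each completed message to the
    // handler, so the raw text is never held in memory as a whole.
    public static void readMessages(Reader in, Consumer<Msg> handler) {
        String from = null;   // stays null if the message has no "From:" line
        String to = null;     // stays null if the message has no "To:" line
        StringBuilder body = new StringBuilder();
        boolean inMessage = false;

        try (BufferedReader reader = new BufferedReader(in)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.startsWith("#")) {
                    // '#' ends the current message
                    if (inMessage)
                        handler.accept(new Msg(from, to, body.toString()));
                    from = null;
                    to = null;
                    body.setLength(0);
                    inMessage = false;
                } else {
                    inMessage = true;
                    // only the first "From:" / "To:" line of a message is a header
                    if (from == null && line.startsWith("From:")) {
                        from = line.substring("From:".length()).trim();
                    } else if (to == null && line.startsWith("To:")) {
                        to = line.substring("To:".length()).trim();
                    } else {
                        if (body.length() > 0)
                            body.append('\n');
                        body.append(line);
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        // a last message without a closing '#' is still delivered
        if (inMessage)
            handler.accept(new Msg(from, to, body.toString()));
    }
}
```

For a file, something like StreamingMessageReader.readMessages(Files.newBufferedReader(Paths.get("data.txt")), msg -> System.out.println(msg.from)) should work. Collecting every message into a list would of course bring the memory problem back for very large files, so the handler is where each message should be processed or stored elsewhere.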

Upvotes: 3
