kaozbender

Reputation: 95

.txt file to arrays using Java

I have a .txt file containing document information (for 1400 documents). Each document has an ID, title, author, area and abstract. A sample looks like this:

.I 1
.T
experimental investigation of the aerodynamics of a
wing in a slipstream .
.A
brenckman,m.
.B
j. ae. scs. 25, 1958, 324.
.W
experimental investigation of the aerodynamics of a
wing in a slipstream .
  [...]
the specific configuration of the experiment .

I want to put each of these fields into its own array, five arrays in total, one per category. I'm having trouble getting the whole title (and likewise the whole abstract) into a single array position. Can anyone tell me what's wrong with this code? What I am trying to do is append the text lines to position x once a ".T" is read and stop when a ".A" is found; at that point the position should increase by 1 so the next document fills the next position.

try{
    collection = new File (File location);
    fr = new FileReader (collection);
    br = new BufferedReader(fr);
    String numDoc = " ";
    int pos = 0;
    while((numDoc=br.readLine())!=null){
        if(numDoc.contains(".T")){
            while((numDoc=br.readLine())!= null && !numDoc.contains(".A")){
                Title[pos] = Title[pos] + numDoc; 
                pos++;
           }

        }
    }
}
catch(Exception e){
     e.printStackTrace();
}

The goal is to have all the information within a single line of String. Any help would be greatly appreciated.

Upvotes: 9

Views: 230

Answers (3)

Compass

Reputation: 5937

A code walkthrough is always helpful. In the future, you can probably use breakpoints, but I think I know why you're getting what I assume is a NullPointerException.

while((numDoc=br.readLine())!=null){
    if(numDoc.contains(".T")){
        while((numDoc=br.readLine())!= null && !numDoc.contains(".A")){

Up to here, everything looks good. Inside this loop is where things start going bonkers.

            Title[pos] = Title[pos] + numDoc; 

With your provided input, we would set:

Title[0] as Title[0] + "experimental investigation of the aerodynamics of a"

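This only does what you want if Title[0] already holds a string, and I assume nothing has been stored there yet. What happens next depends on how Title was declared. If Title is a local variable that was never assigned, you get a compiler error; if it is a field that is still null, you get a NullPointerException at run time. If the array itself exists but its elements have never been set, each element is null, and string concatenation turns that into the text "null", so you would end up with "nullexperimental investigation of the aerodynamics of a".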

So anyways, we'll address dealing with null Title[pos].

while((numDoc=br.readLine())!=null){
    if(numDoc.contains(".T")){
        while((numDoc=br.readLine())!= null && !numDoc.contains(".A")){
            if(Title[pos] != null) {
                Title[pos] = Title[pos] + numDoc; 
            }
            else {
                Title[pos] = numDoc;
            }
            pos++;
        }
    }
}

When we do another walkthrough, we'll get the following array values

Title[0]=experimental investigation of the aerodynamics of a

Title[1]=wing in a slipstream .

If this is intended, then this is fine. If you wanted the title lines together in one element, move the pos++ out of the inner while loop.

while((numDoc=br.readLine())!=null){
    if(numDoc.contains(".T")){
        while((numDoc=br.readLine())!= null && !numDoc.contains(".A")){
            if(Title[pos] != null) {
                Title[pos] = Title[pos] + " " + numDoc; // add a space between lines
            }
            else {
                Title[pos] = numDoc;
            }
        }
        pos++;
    }
}

Then we get:

Title[0]=experimental investigation of the aerodynamics of a wing in a slipstream .

You may want to trim your inputs, but this should cover both of the potential errors that I can see.
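
For what it's worth, here is a minimal self-contained sketch of that last loop with the trimming added. It is only an illustration: the class name TitleCollector, the method collectTitles, and the use of a List<String> instead of your Title array are my own choices, and it assumes the .T and .A markers always sit alone on their line. For the real file, something like Files.readAllLines would give you the list of lines.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Hypothetical helper, not part of the original code.
class TitleCollector {

    // Joins the lines between each ".T" marker and the following ".A" marker
    // into one string per document.
    static List<String> collectTitles(List<String> lines) {
        List<String> titles = new ArrayList<>();
        Iterator<String> it = lines.iterator();
        while (it.hasNext()) {
            // Assumption: a marker is the only thing on its line.
            if (!it.next().trim().equals(".T")) {
                continue;
            }
            StringBuilder title = new StringBuilder();
            while (it.hasNext()) {
                String line = it.next().trim();
                if (line.equals(".A")) {
                    break;
                }
                if (title.length() > 0) {
                    title.append(' '); // space between joined lines
                }
                title.append(line);
            }
            // Same effect as pos++ after the inner loop: one entry per document.
            titles.add(title.toString());
        }
        return titles;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
            ".I 1", ".T",
            "experimental investigation of the aerodynamics of a",
            "wing in a slipstream .",
            ".A", "brenckman,m.");
        System.out.println(collectTitles(sample));
    }
}
```

Building each title in a StringBuilder and adding it to the list once per document also sidesteps the null check, since nothing is ever concatenated onto an unset array element.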

Upvotes: 5

christopher

Reputation: 27346

Seriously, seriously, seriously, use Objects. Objects allow you to group similar data and when you're handling all these arrays, you really will confuse yourself. More importantly though, you'll confuse the next person who's going to work on your code.

Example

public class Book {
    private String title;
    private String bookAbstract;
    
    public Book(String title, String bookAbstract) {
        this.title = title;
        this.bookAbstract = bookAbstract;
    }
}

I've guessed you're parsing books, so I've created a Book class. Conceptually, this will contain everything to do with books. I've added a title field for the title of the book and an abstract which, as you've guessed, is the book's abstract. This makes your code conceptually much easier to consume, but also much more maintainable. It also makes your goal very simple.

The goal is to have all the information within a single line of String

Parse it and you can use the toString method:

@Override
public String toString() {
    // The field is named bookAbstract because abstract is a reserved word in Java.
    return "Title=" + title + "| Abstract=" + bookAbstract;
}
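
Putting the class and the toString method together, a small runnable sketch might look like this. The field name bookAbstract is used in toString because abstract is a reserved word in Java. The sample values in main are taken from the question's excerpt, and making the fields final is my own choice rather than anything the question requires.

```java
// Sketch only: the Book class from above with toString added.
class Book {
    private final String title;
    private final String bookAbstract;

    Book(String title, String bookAbstract) {
        this.title = title;
        this.bookAbstract = bookAbstract;
    }

    @Override
    public String toString() {
        return "Title=" + title + "| Abstract=" + bookAbstract;
    }

    public static void main(String[] args) {
        Book book = new Book(
            "experimental investigation of the aerodynamics of a wing in a slipstream .",
            "the specific configuration of the experiment .");
        // println calls toString() on the object, giving one line per book.
        System.out.println(book);
    }
}
```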

Your specific issue

What you're doing is reading up to a line with .T. Once you hit that line, you know that everything up to the next line containing .A is data that you want to use. So, if you read the String docs, you'll see that there is an indexOf method:

indexOf(String str, int fromIndex)

Returns the index within this string of the first occurrence of the specified substring, starting at the specified index.

That fromIndex value is important here. You know what you're looking for (.A) and you know where you're starting from (.T). Using this information, you can jump through the string, dissecting out the useful bits and pass it into your new Book object for parsing.
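
As a rough sketch of that idea, assuming the whole record has already been read into one string (which your current line-by-line code does not do), and using the indexOf(String str, int fromIndex) overload since the markers are strings. MarkerSlicer and between are hypothetical names of my own.

```java
// Hypothetical helper illustrating indexOf with a fromIndex.
class MarkerSlicer {

    // Returns the trimmed text between the first `start` marker at or after
    // `from` and the next `end` marker, or null if either marker is missing.
    static String between(String text, String start, String end, int from) {
        int s = text.indexOf(start, from);
        if (s < 0) {
            return null;
        }
        int contentStart = s + start.length();
        // Search for the end marker only after the start marker.
        int e = text.indexOf(end, contentStart);
        if (e < 0) {
            return null;
        }
        return text.substring(contentStart, e).trim();
    }

    public static void main(String[] args) {
        String record = ".I 1\n.T\nexperimental investigation of the aerodynamics of a\n"
                + "wing in a slipstream .\n.A\nbrenckman,m.\n";
        // The newline is part of each marker so that ".T" or ".A" appearing
        // inside ordinary text is less likely to match.
        String title = between(record, ".T\n", ".A\n", 0);
        System.out.println(title.replace('\n', ' '));
    }
}
```

The result could then be passed straight into the Book constructor, repeating the call with the .W marker for the abstract.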

Upvotes: 4

Scott Hunter

Reputation: 49921

Because you increment pos each time you add a non-.A line, those lines will not go into the same element of Title. I think you want to wait to increment pos until you've read the .A-line.

Upvotes: 0
