Reputation: 11363
For my data structures class, the first project requires a text file of songs to be parsed.
An example of input is:
ARTIST="unknown"
TITLE="Rockabye Baby"
LYRICS="Rockabye baby in the treetops
When the wind blows your cradle will rock
When the bow breaks your cradle will fall
Down will come baby cradle and all
"
I'm wondering what the best way is to extract the Artist, Title and Lyrics into their respective string fields in a Song class. My first reaction was to use a Scanner: read the first character and, based on the letter, use skip() to advance past the field name, then read the text between the quotation marks.
If I do this, I lose out on buffering the input. The full song text file has over 422K lines of text. Can the Scanner handle this even without buffering?
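A minimal sketch of the Scanner idea, for illustration only. Instead of skip(), it uses the double quote as the delimiter, so tokens alternate between a key such as ARTIST= and the quoted value. It assumes values never contain a double quote and that LYRICS is always the last field of a song; the class and method names (ScannerSongSketch, open, nextSong) are made up for this example. Wrapping the source in a BufferedReader addresses the buffering worry, and as far as I know Scanner also reads into an internal buffer of its own.

```java
import java.io.BufferedReader;
import java.io.Reader;
import java.io.StringReader;
import java.util.Scanner;

class ScannerSongSketch {
    // Hypothetical helper: buffer the source and split tokens on double quotes.
    static Scanner open(Reader source) {
        return new Scanner(new BufferedReader(source)).useDelimiter("\"");
    }

    // Returns {artist, title, lyrics} for the next song, or null at end of input.
    // Assumption: no value contains a double quote, and LYRICS closes each song.
    static String[] nextSong(Scanner in) {
        String artist = null;
        String title = null;
        while (in.hasNext()) {
            String key = in.next().trim();   // e.g. "ARTIST=", leading newline trimmed
            if (!in.hasNext()) {
                break;                       // only trailing text after the last quote
            }
            String value = in.next();        // the text between the quotes
            switch (key) {
                case "ARTIST=": artist = value; break;
                case "TITLE=":  title = value;  break;
                case "LYRICS=": return new String[] {artist, title, value};
                default: break;              // unknown key: ignored in this sketch
            }
        }
        return null;
    }

    public static void main(String[] args) {
        String sample = "ARTIST=\"unknown\"\nTITLE=\"Rockabye Baby\"\n"
                + "LYRICS=\"Rockabye baby in the treetops\n"
                + "Down will come baby cradle and all\n\"\n";
        try (Scanner in = open(new StringReader(sample))) {
            String[] song;
            while ((song = nextSong(in)) != null) {
                System.out.println(song[1] + " by " + song[0]);
            }
        }
    }
}
```

For the real file, the StringReader would be replaced by a FileReader (or similar) on the song file; the rest stays the same.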
Upvotes: 1
Views: 1512
Reputation: 50127
In this case, you could use a CSV reader, with '=' as the field separator and '"' (double quote) as the quote character. It's not perfect, as you get a separate row for each of ARTIST, TITLE, and LYRICS rather than one record per song.
Upvotes: 1
Reputation: 205865
If the source data can be parsed using one-token lookahead, StreamTokenizer may be a choice. Here is an example that compares StreamTokenizer and Scanner.
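A rough sketch of how StreamTokenizer might be applied to this input, not a definitive implementation. One caveat: with the default syntax table, a quoted string ends at a line terminator, so multi-line LYRICS would be cut short. The sketch therefore resets the syntax and treats every character except the double quote as part of a word. It assumes values contain no double quote and that LYRICS is the last field of each song; TokenizerSongSketch and parse are hypothetical names.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StreamTokenizer;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class TokenizerSongSketch {
    // Returns one {artist, title, lyrics} array per song.
    static List<String[]> parse(Reader source) {
        // StreamTokenizer reads one character at a time, so buffer the source.
        StreamTokenizer st = new StreamTokenizer(new BufferedReader(source));
        st.resetSyntax();        // drop default number, comment and quote handling
        st.wordChars(0, 255);    // every character is part of a word, newlines included...
        st.ordinaryChar('"');    // ...except the double quote, which becomes its own token

        List<String[]> songs = new ArrayList<>();
        String[] current = new String[3];
        String key = "";
        String value = "";
        boolean inValue = false;
        try {
            while (st.nextToken() != StreamTokenizer.TT_EOF) {
                if (st.ttype == '"') {
                    if (inValue) {           // closing quote: store the finished value
                        switch (key) {
                            case "ARTIST=": current[0] = value; break;
                            case "TITLE=":  current[1] = value; break;
                            case "LYRICS=":
                                current[2] = value;
                                songs.add(current);
                                current = new String[3];
                                break;
                            default: break;  // unknown key: ignored in this sketch
                        }
                    } else {
                        value = "";          // opening quote: value may stay empty
                    }
                    inValue = !inValue;
                } else if (st.ttype == StreamTokenizer.TT_WORD) {
                    if (inValue) {
                        value = st.sval;
                    } else {
                        key = st.sval.trim(); // strips the newline before the key
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return songs;
    }

    public static void main(String[] args) {
        String sample = "ARTIST=\"unknown\"\nTITLE=\"Rockabye Baby\"\n"
                + "LYRICS=\"Rockabye baby in the treetops\n\"\n";
        for (String[] song : parse(new StringReader(sample))) {
            System.out.println(song[1] + " by " + song[0]);
        }
    }
}
```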
Upvotes: 1
Reputation: 6631
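For something like this, you should probably just use regular expressions. A Matcher operates on any CharSequence (a String, a StringBuilder, or a CharBuffer), so the text can come from a buffered read.
The find(int start) method takes an offset, and plain find() resumes where the previous match ended, so you can walk through the records one match at a time.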
http://download.oracle.com/javase/1.4.2/docs/api/java/util/regex/Matcher.html
Regex is a whole world unto itself. If you've never used regular expressions before, start here http://download.oracle.com/javase/tutorial/essential/regex/ and be prepared. The effort is very much worth the time required.
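As a minimal sketch of the regex route, under some assumptions: the fields always appear in the order ARTIST, TITLE, LYRICS, no value contains a double quote, and the whole text is already in memory as a CharSequence. The names RegexSongSketch and parse are invented for the example. Note that a negated character class such as [^"] also matches newlines, which is what lets the LYRICS group span several lines.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class RegexSongSketch {
    // One match per song; groups 1-3 capture artist, title and lyrics.
    private static final Pattern SONG = Pattern.compile(
            "ARTIST=\"([^\"]*)\"\\s*TITLE=\"([^\"]*)\"\\s*LYRICS=\"([^\"]*)\"");

    // Returns one {artist, title, lyrics} array per song found in the text.
    static List<String[]> parse(CharSequence text) {
        List<String[]> songs = new ArrayList<>();
        Matcher m = SONG.matcher(text);
        while (m.find()) {   // each call resumes after the previous match
            songs.add(new String[] {m.group(1), m.group(2), m.group(3)});
        }
        return songs;
    }

    public static void main(String[] args) {
        String sample = "ARTIST=\"unknown\"\nTITLE=\"Rockabye Baby\"\n"
                + "LYRICS=\"Rockabye baby in the treetops\n"
                + "Down will come baby cradle and all\n\"\n";
        for (String[] song : parse(sample)) {
            System.out.println(song[1] + " by " + song[0]);
        }
    }
}
```

Reading the file into a String first keeps the sketch short; for a file of this size that is probably acceptable, but it is a trade-off rather than a requirement of the regex approach.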
Upvotes: 3