Alfredo Pipoli

Reputation: 21

Java .split() out of bounds

I have a problem with my code.

I'm trying to extract the channel names from a .txt file. I can't understand why line.split() gives me back an array that is too short, so that reading index 1 goes out of bounds.

Can someone help me?

This is the .txt file:

------------[channels.txt]---------------------

...
#CH id="" tvg-name="Example1" tvg-logo="http... 
#CH id="" tvg-name="Example2" tvg-logo="http...
#CH id="" tvg-name="Example3" tvg-logo="http...
#CH id="" tvg-name="Example4" tvg-logo="http...
...

This is my code:

try {
    FileInputStream VOD = new FileInputStream("channels.txt");
    BufferedReader buffer_r = new BufferedReader(new InputStreamReader(VOD));
    String line;
    ArrayList<String> name_channels = new ArrayList<String>();

    while ((line = buffer_r.readLine()) != null ) {
        if (line.startsWith("#")) {
            String[] first_scan = line.split(" tvg-name=\" ", 2);
            String first = first_scan[1];               // <--- out of bounds

            String[] second_scan = first.split(" \"tvg-logo= ", 2);
            String second = second_scan[0];

            name_channels.add(second);

        } else {
            //...           
        }
    }
    for (int i = 0; i < name_channels.size(); i++) {
        System.out.println("Channel: " + name_channels.get(i));
    }
} catch(Exception e) {
    System.out.println(e);
}

Upvotes: 0

Views: 102

Answers (2)

The fourth bird

Reputation: 163207

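There is a space after the last double quote in your delimiter " tvg-name=\" ", which does not match the data in your example. Because the delimiter never occurs in the line, split() returns a one-element array containing the whole line, and index 1 is out of bounds.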

When you use line.split(" tvg-name=\"", 2) instead, the first item in the returned array will be #CH id="" and the second will be Example1" tvg-logo="http...
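
To make the difference concrete, here is a minimal sketch that compares the two delimiters on one of the sample lines. The class name SplitDemo and the helper splitOnce are hypothetical names chosen for illustration; they are not part of the original code.

```java
import java.util.Arrays;

public class SplitDemo {
    // Hypothetical helper: splits the line into at most two parts,
    // exactly as line.split(delimiter, 2) does in the question.
    static String[] splitOnce(String line, String delimiter) {
        return line.split(delimiter, 2);
    }

    public static void main(String[] args) {
        String line = "#CH id=\"\" tvg-name=\"Example1\" tvg-logo=\"http...";

        // Delimiter with the trailing space: it never occurs in the line,
        // so the whole line comes back as a single element.
        String[] withSpace = splitOnce(line, " tvg-name=\" ");
        System.out.println(withSpace.length);    // 1

        // Delimiter without the trailing space: the line is cut in two.
        String[] withoutSpace = splitOnce(line, " tvg-name=\"");
        System.out.println(withoutSpace.length); // 2
        System.out.println(Arrays.toString(withoutSpace));
    }
}
```

The one-element result is why first_scan[1] throws: split() does not fail when the delimiter is missing, it just returns the input unchanged.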

If you want the value of tvg-name, you could use a regex with a capturing group that matches one or more characters other than a double quote, using the negated character class [^"]+:

tvg-name="([^"]+)"

// Needed at the top of the file, in addition to the java.io and java.util imports:
import java.util.regex.Matcher;
import java.util.regex.Pattern;

try {
    FileInputStream VOD = new FileInputStream("channels.txt");
    BufferedReader buffer_r = new BufferedReader(new InputStreamReader(VOD));
    String line;
    ArrayList<String> name_channels = new ArrayList<String>();

    // Compile the pattern once, outside the loop; group 1 is the tvg-name value.
    Pattern pattern = Pattern.compile("tvg-name=\"([^\"]+)\"");

    while ((line = buffer_r.readLine()) != null) {
        if (line.startsWith("#")) {
            Matcher matcher = pattern.matcher(line);

            // Lines without a tvg-name attribute simply produce no match.
            while (matcher.find()) {
                name_channels.add(matcher.group(1));
            }
        } else {
            // ...
        }
    }
    for (int i = 0; i < name_channels.size(); i++) {
        System.out.println("Channel: " + name_channels.get(i));
    }
} catch (Exception e) {
    System.out.println(e);
}

Upvotes: 0

Tom Hawtin - tackline

Reputation: 147124

So you have examples like this

#CH id="" tvg-name="Example1" tvg-logo="http... 

And are trying to split on these strings

" tvg-name=\" "
" \"tvg-logo= "

Neither of those strings is in the example. Each has a spurious space appended, and the space at the start of the second is in the wrong place: it belongs after the double quote, not before it.

With the strings fixed, here is a concise but complete program to demonstrate:

interface Split {
    static void main(String[] args) {
        String line = "#CH id=\"\" tvg-name=\"Example1\" tvg-logo=\"http...";

        String[] first_scan = line.split(" tvg-name=\"", 2);
        String first = first_scan[1];               // in bounds now that the delimiter matches

        String[] second_scan = first.split("\" tvg-logo=", 2);
        String second = second_scan[0];

        System.err.println(second);
    } 
}

Of course, if you have any lines that start with '#' but don't match, you'll have a similar problem.
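
One way to guard against such lines is to check the array length before indexing. The following is only a sketch under that assumption; the class name SafeSplit, the method extractName, and the non-matching sample line are hypothetical and not taken from the question.

```java
import java.util.Optional;

public class SafeSplit {
    // Hypothetical helper: returns the tvg-name value, or an empty Optional
    // if either delimiter is missing from the line.
    static Optional<String> extractName(String line) {
        String[] firstScan = line.split(" tvg-name=\"", 2);
        if (firstScan.length < 2) {
            return Optional.empty();   // no tvg-name attribute on this line
        }
        String[] secondScan = firstScan[1].split("\" tvg-logo=", 2);
        if (secondScan.length < 2) {
            return Optional.empty();   // tvg-name is not followed by tvg-logo
        }
        return Optional.of(secondScan[0]);
    }

    public static void main(String[] args) {
        String good = "#CH id=\"\" tvg-name=\"Example1\" tvg-logo=\"http...";
        String bad = "# a line that starts with '#' but has no attributes";

        System.out.println(extractName(good).orElse("<no name>"));
        System.out.println(extractName(bad).orElse("<no name>"));
    }
}
```

Returning an Optional lets the caller decide whether to skip, log, or reject a malformed line instead of crashing in the middle of the file.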

This sort of thing is probably better done with regexes and capturing groups.

Upvotes: 1
