Reputation: 115
I have the following HTML, from which I want to extract the currently playing artist and song title. My regular expression works in http://gskinner.com/RegExr/ and compiles correctly in Java, yet it doesn't match anything.
HTML snippet
<div class="audio_playing_title">Currently Playing.
<div class="audio_home_box">
<div class="audio_playing_stats">
<div class="audio_playing">
<div class="audio_dj_title">PRESENTER:
AutoDJ - The Slogan
</div>
<div class="audio_track_title">SONG TITLE:
The Artist Name - Song Name
</div>
</div>
</div>
</div>
The Java code
String data = getWebsiteData(url);
data = data.replace("\\t", "");
Pattern pat = Pattern.compile("<div class=\"audio_track_title\">SONG TITLE:\r(.+)\r</div>");
Matcher matcher = pat.matcher(data);
if (matcher.matches())
{
data = matcher.group(1);
}
else
{
System.out.println("No match");
}
return data;
Upvotes: 1
Views: 108
Reputation: 46219
Your problem is that Matcher#matches() only returns true if the whole sequence matches your regex. You need Matcher#find(), which will look for matching subsequences.
I also think you would be better off using the Pattern#DOTALL flag to let your . match line breaks too instead of trying to match them yourself, since the line break standard differs between systems:
// With DOTALL, (.+?) spans the line breaks and stops at the first </div>; call trim() on group(1).
Pattern pat = Pattern.compile("<div class=\"audio_track_title\">SONG TITLE:(.+?)</div>", Pattern.DOTALL);
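To make the difference between matches() and find() concrete, here is a minimal, self-contained sketch built around the HTML from the question. The class name SongTitleExtractor, the method extractTitle, and the sample string are made up for illustration. The reluctant quantifier (.+?) and the trim() call are assumptions added here so that the capture stops at the first closing </div> and sheds the surrounding line breaks.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class SongTitleExtractor {
    // DOTALL lets '.' cross line breaks; the reluctant '.+?' stops at the first </div>.
    private static final Pattern TRACK_TITLE = Pattern.compile(
            "<div class=\"audio_track_title\">SONG TITLE:(.+?)</div>", Pattern.DOTALL);

    // Hypothetical helper: returns the trimmed title, or empty if the div is not found.
    static Optional<String> extractTitle(String html) {
        Matcher matcher = TRACK_TITLE.matcher(html);
        // find() scans for a matching subsequence; matches() would require
        // the entire input to match the pattern.
        return matcher.find() ? Optional.of(matcher.group(1).trim()) : Optional.empty();
    }

    public static void main(String[] args) {
        // Sample input modelled on the question's snippet, with \r\n line endings.
        String html = "<div class=\"audio_playing\">\r\n"
                + "<div class=\"audio_dj_title\">PRESENTER:\r\n"
                + "AutoDJ - The Slogan\r\n"
                + "</div>\r\n"
                + "<div class=\"audio_track_title\">SONG TITLE:\r\n"
                + "The Artist Name - Song Name\r\n"
                + "</div>\r\n"
                + "</div>";

        // The whole document is not just the track-title div, so matches() fails.
        System.out.println(TRACK_TITLE.matcher(html).matches()); // false
        System.out.println(extractTitle(html).orElse("No match")); // The Artist Name - Song Name
    }
}
```

Because the pattern no longer hard-codes \r, the same code should behave identically whether the page uses \n, \r\n, or \r line endings. For anything beyond a one-off scrape, an HTML parser is usually more robust than a regex.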
Upvotes: 5