Reputation: 1550
I have a Java application that needs to parse HTML elements from an HTML page. My simple HTML test is set up as follows:
<!DOCTYPE html>
<html>
<head>
<style type='text/css'>
div {width:100%;height:100px;background-color:blue;}
</style>
</head>
<body>
<div></div>
</body>
</html>
My code searches the document for this string: "<style"
It then searches for the closing angle bracket ">", because the user may have typed any of these combinations in their HTML file:
<style type="text/css">
or
<style type = "text/css" >
or
<style type = 'text/css' >
or
<style type='text/css'>
etc.
So my method is to find the opening "style" tag and everything up to its closing angle bracket.
Then find the closing style tag:
</style>
Then grab everything between those two entities.
Here are my files with their code:
************strings.xml************
<string name="txt_style_opentag">&lt;style</string>
<string name="txt_end_carrot">&gt;</string>
<string name="txt_style_closetag">&lt;/style&gt;</string>
***********************************
************Parser.java************
public static String getStyle(Context context, String text) {
    String style = "";
    String openTag = context.getString(R.string.txt_style_opentag);
    String closeTag = context.getString(R.string.txt_style_closetag);
    String endCarrot = context.getString(R.string.txt_end_carrot);
    int openPos1 = text.indexOf(openTag);
    int openPos = text.indexOf(endCarrot, openPos1);
    int closePos = text.indexOf(closeTag, openPos1);
    if (openPos != -1 && closePos != -1)
        style = text.substring(openPos + openTag.length(), closePos).trim();
    if (style != null && style.length() > 0 && style.charAt(0) == '\n') // first \n remove
        style = style.substring(1, style.length());
    if (style != null && style.length() > 0 && style.charAt(style.length() - 1) == '\n') // last \n remove
        style = style.substring(0, style.length() - 1);
    return style;
}
********************************************************
My result is close, but not right. The result is this:
{width:100%;height:100px;background-color:blue;}
Notice that it is missing the "div" part. It should look like this:
div {width:100%;height:100px;background-color:blue;}
What am I doing wrong here? Can anyone help?
Upvotes: 1
Views: 227
Reputation: 5490
You're starting the substring at the index of the closing bracket > of your opening tag (openPos) and adding the length of the opening tag (openTag) rather than the length of endCarrot. Since "<style" is six characters long, the substring starts six characters past the > instead of one, which skips the newline and "div ". You want to do
style = text.substring(openPos + endCarrot.length(), closePos).trim();
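To see the offset arithmetic in isolation, here is a minimal sketch in plain Java, without the Android Context. The class name StyleOffsetDemo and the extract helper are made up for illustration; the string literals stand in for the values from strings.xml.

```java
public class StyleOffsetDemo {
    // Mirrors the question's logic, but the offset added to the index of '>'
    // is passed in so both variants can be compared. (Hypothetical helper.)
    static String extract(String text, int offset) {
        int tagStart = text.indexOf("<style");
        int bracket = text.indexOf(">", tagStart);   // '>' that closes the opening tag
        int close = text.indexOf("</style>", tagStart);
        return text.substring(bracket + offset, close).trim();
    }

    public static void main(String[] args) {
        String html = "<style type='text/css'>\n"
                + "div {width:100%;height:100px;background-color:blue;}\n"
                + "</style>";
        // Question's offset: "<style".length() == 6 skips '>', '\n', 'd', 'i', 'v', ' '
        System.out.println(extract(html, "<style".length()));
        // Corrected offset: ">".length() == 1 skips only the '>'
        System.out.println(extract(html, ">".length()));
    }
}
```

With the six-character offset the result begins at the "{", which matches the output in the question; with the one-character offset the "div" selector is kept.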
Upvotes: 1
Reputation: 1550
Of course... right after I ask for help, I finally figure it out. The following line should be changed
FROM:
style = text.substring(openPos + openTag.length(), closePos).trim();
TO:
style = text.substring(openPos + endCarrot.length(), closePos).trim();
Sorry for the post, and thanks for the recommendations.
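For anyone landing here later, below is a rough standalone sketch of the corrected method with a few extra guards. It is not the exact code from my app: the class name StyleExtractor and the constants are placeholders, and the Android Context lookup is replaced by plain string constants. It assumes lowercase tags and no ">" inside attribute values. Two things it does differently: it returns early when "<style" is not found (indexOf treats a negative start index as 0, so the original would otherwise match the first ">" anywhere in the text), and it drops the manual "\n" removal because trim() already strips leading and trailing newlines.

```java
public class StyleExtractor {
    private static final String OPEN_TAG = "<style";
    private static final String CLOSE_TAG = "</style>";

    // Returns the text between the opening <style ...> tag and </style>,
    // or an empty string if any of the three markers is missing.
    static String getStyle(String text) {
        int tagStart = text.indexOf(OPEN_TAG);
        if (tagStart == -1) {
            return "";
        }
        // First '>' after "<style" closes the opening tag, whatever its attributes.
        int bracket = text.indexOf('>', tagStart + OPEN_TAG.length());
        if (bracket == -1) {
            return "";
        }
        int close = text.indexOf(CLOSE_TAG, bracket + 1);
        if (close == -1) {
            return "";
        }
        // Skip exactly one character (the '>'); trim() removes surrounding whitespace.
        return text.substring(bracket + 1, close).trim();
    }

    public static void main(String[] args) {
        String html = "<head>\n<style type = \"text/css\" >\n"
                + "div {width:100%;height:100px;background-color:blue;}\n"
                + "</style>\n</head>";
        System.out.println(getStyle(html));
    }
}
```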
Upvotes: 0