Reputation: 35
I hope you can help me with this.
First, some HTML:
<div style="font-size:12px; FONT-FAMILY: VERDANA, ARIAL; font-weight:bold; position: absolute; background-color:#FFFFFF; border-color:#868686; border-style:solid; border-left-width:1px; border-top-width:1px; border-right-width:0px; border-bottom-width:0px; top: 0px; left:2px; width:55px;height:30px; padding-left : 0 px;padding-top : 5px;"><img src="../../server/img/spacer.gif" alt="" width="10" height="1">Zeit</div>
<div style="font-size:12px; FONT-FAMILY: Tahoma, VERDANA, ARIAL; font-weight:bold; position: absolute; top: 0px; background-color:#FFFFFF; border-color:#868686; border-style:solid; border-left-width:1px; border-top-width:1px; border-bottom-width:0px;border-right-width:1px;left:57px;width:140px;height:30px; padding-left:0px; padding-top:5px;"><img src="../../server/img/spacer.gif" alt="" width="10" height="1">Montag</div>
<div style="font-size:12px; FONT-FAMILY: Tahoma, VERDANA, ARIAL; font-weight:bold; position: absolute; top: 0px; background-color:#FFFFFF; border-color:#868686; border-style:solid; border-left-width:1px; border-top-width:1px; border-bottom-width:0px;border-right-width:1px;left:197px;width:140px;height:30px; padding-left:0px; padding-top:5px;"><img src="../../server/img/spacer.gif" alt="" width="10" height="1">Dienstag</div>
<div style="font-size:12px; FONT-FAMILY: Tahoma, VERDANA, ARIAL; font-weight:bold; position: absolute; top: 0px; background-color:#FFFFFF; border-color:#868686; border-style:solid; border-left-width:1px; border-top-width:1px; border-bottom-width:0px;border-right-width:1px;left:337px;width:140px;height:30px; padding-left:0px; padding-top:5px;"><img src="../../server/img/spacer.gif" alt="" width="10" height="1">Mittwoch</div>
<div style="font-size:12px; FONT-FAMILY: Tahoma, VERDANA, ARIAL; font-weight:bold; position: absolute; top: 0px; background-color:#FFFFFF; border-color:#868686; border-style:solid; border-left-width:1px; border-top-width:1px; border-bottom-width:0px;border-right-width:1px;left:477px;width:140px;height:30px; padding-left:0px; padding-top:5px;"><img src="../../server/img/spacer.gif" alt="" width="10" height="1">Donnerstag</div>
<div style="font-size:12px; FONT-FAMILY: Tahoma, VERDANA, ARIAL; font-weight:bold; position: absolute; top: 0px; background-color:#FFFFFF; border-color:#868686; border-style:solid; border-left-width:1px; border-top-width:1px; border-bottom-width:0px;border-right-width:1px;left:617px;width:140px;height:30px; padding-left:0px; padding-top:5px;"><img src="../../server/img/spacer.gif" alt="" width="10" height="1">Freitag</div>
My first problem was finding the element that contains the day "Montag" (Monday). So far I have this:
Element content = doc.getElementById("content");
Elements names = doc.select("div[style]");
for(Element elem : names){
if(elem.text().contains("Montag")){
}
}
Is this approach okay?
Next, inside the if statement, I need to read the "left: [xx]px" value from the inline style attribute.
How can I produce the following output?
Montag -> Left:57px
Thanks a lot for taking the time to answer.
Upvotes: 1
Views: 3983
Reputation: 8879
Yes, the way you use Jsoup to find the correct element is fine.
Jsoup does not parse inline CSS, so there is no simple way to extract a single style property using Jsoup alone. You can get the element's attributes by calling the Element.attributes()
method (or only the style value with elem.attr("style")), but as far as I know you will have to use a regex matcher to select the information you want.
You can set up a regex pattern with a lookbehind and a lookahead that matches only the text sitting between two known markers:
Pattern p = Pattern.compile("(?<=border-right-width:1px;)(.*)(?=;width:140px;)");
This pattern matches all characters between border-right-width:1px; and ;width:140px;, without including the two markers in the match.
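To see what the lookaround pattern above matches without needing Jsoup, here is a minimal sketch that runs it against plain strings. The class name LookaroundDemo, the helper between, and the shortened sample style strings are assumptions made for illustration; only the pattern itself comes from the answer.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaroundDemo {
    // Same pattern as in the answer: match whatever sits between the two fixed markers.
    static final Pattern BETWEEN =
        Pattern.compile("(?<=border-right-width:1px;)(.*)(?=;width:140px;)");

    // Hypothetical helper: returns the text between the markers,
    // or null when the markers are not both present in this order.
    static String between(String style) {
        Matcher m = BETWEEN.matcher(style);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        // Shortened version of the "Montag" style from the question.
        String montag = "border-bottom-width:0px;border-right-width:1px;left:57px;width:140px;height:30px;";
        System.out.println(between(montag)); // prints "left:57px"

        // Shortened version of the "Zeit" style: different widths, so the markers are absent.
        String zeit = "border-right-width:0px; border-bottom-width:0px; top: 0px; left:2px; width:55px;";
        System.out.println(between(zeit)); // prints "null"
    }
}
```

The second call shows the main limitation: the pattern only works when the style contains exactly these two declarations, in this order, on either side of left.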
Going from this, the code below should produce your desired result:
Pattern p = Pattern.compile("(?<=border-right-width:1px;)(.*)(?=;width:140px;)");
String elementInformation = "";
for (Element elem : names) {
if (elem.text().contains("Montag")) {
Matcher m = p.matcher(elem.attributes().toString());
elementInformation = elem.text() + " -> ";
while(m.find()){
elementInformation += m.group();
}
}
}
System.out.println(elementInformation);
Result:
Montag -> left:57px
You can also modify the for-each loop to extract the same information for every element except "Zeit", whose style does not contain the two markers:
for (Element elem : names) {
if (!elem.text().contains("Zeit")) {
Matcher m = p.matcher(elem.attributes().toString());
elementInformation += "\n";
elementInformation += elem.text() + " -> ";
while (m.find()) {
elementInformation += m.group();
}
}
}
and you'll get:
Montag -> left:57px
Dienstag -> left:197px
Mittwoch -> left:337px
Donnerstag -> left:477px
Freitag -> left:617px
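Because the pattern above depends on the exact declarations surrounding left, it may break if the style order changes. One possible alternative, sketched here as an assumption and not as part of the original answer, is to split the inline style into property/value pairs and look up left by name. The class InlineStyle and its parse method are hypothetical names; with Jsoup you would presumably pass in elem.attr("style").

```java
import java.util.LinkedHashMap;
import java.util.Map;

class InlineStyle {
    // Splits "a:b; c:d" into {a=b, c=d}. Property names are trimmed and lower-cased.
    // Deliberately simple: values that themselves contain ';' (e.g. inside url(...)) are not handled.
    static Map<String, String> parse(String style) {
        Map<String, String> props = new LinkedHashMap<>();
        for (String decl : style.split(";")) {
            int colon = decl.indexOf(':');
            if (colon < 0) {
                continue; // skip empty or malformed pieces
            }
            String name = decl.substring(0, colon).trim().toLowerCase();
            String value = decl.substring(colon + 1).trim();
            if (!name.isEmpty()) {
                props.put(name, value);
            }
        }
        return props;
    }

    public static void main(String[] args) {
        // Shortened version of the "Montag" style from the question.
        String style = "position: absolute; top: 0px; border-left-width:1px; left:57px;width:140px;";
        System.out.println("Montag -> left:" + parse(style).get("left")); // prints "Montag -> left:57px"
    }
}
```

Looking the property up by its full name also avoids confusing left with border-left-width, and it would work for the "Zeit" element too.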
Take a look at this Regex tutorial if you want to learn how it works.
Upvotes: 2