Reputation: 6522
The title is weird, I know. I don't know how to explain it better.
I am using jsoup to parse XML. This is how I did it until now:
Elements dan = doc.select("table.ednevnik-seznam_ur_teden tbody tr:eq(0) th:eq("+i+") div");
So I am parsing elements that have class name "ednevnik-seznam_ur_teden" ...
But now the website administrator has changed it so that the class name is different every day. It still starts with "ednevnik-seznam_ur_teden", but something is appended to it.
Example: "ednevnik-seznam_ur_teden 123"
Is it possible to match only the beginning of a class name and then parse the element? For example, if the class name begins with "ednevnik-seznam", it would be parsed.
Something that acts as a wildcard, like % in SQL.
EDIT: This is how I changed the code and it still won't work:
Elements dan = doc.select("table[class^=ednevnik-seznam_ur_teden] tbody tr:eq(0) th:eq("+i+") div");
In-depth explanation:
I'd gladly post HTML code but I don't have access to it. And if I inspect element and then copy/paste it from there, the code is too messed up. So this is the website from which I want to parse HTML: https://www.easistent.com/urniki/cc45c5d0d303f954588402a186f5cdba5edb51d6/razredi/28396
This is the code with which I was doing it up until now.
for (int i = 1; i <= 5; i++)
{
    Elements dan = doc.select("table[class^=ednevnik-seznam_ur_teden] tbody tr:eq(0) th:eq("+i+") div");
    for (int b = 1; b <= 11; b++)
    {
        Elements predmeti = doc.select("table[class^=ednevnik-seznam_ur_teden] tbody tr:eq("+b+") td:eq("+predmet+") td[class=text14 bold]");
        Elements ucilnice = doc.select("table[class^=ednevnik-seznam_ur_teden] tbody tr:eq("+b+") td:eq("+predmet+") div[class=text11]");
        ...
So basically, with this code I parse each table element individually.
This was working until something changed on the website. Now the table column elements no longer have the same class all the time. They used to always be "ednevnik-seznam_ur_teden", but now they change according to which day it is. The column that is currently highlighted has a different class name.
So now this html parser code parses everything except the highlighted column.
Upvotes: 2
Views: 1318
Reputation: 4840
You should be able to use a CSS selector to select items that start with "ednevnik-seznam_ur_teden".
In your case, your selector would be:
Elements dan = doc.select("table[class^=ednevnik-seznam_ur_teden]");
This says to select all tables that have a class name that starts with ednevnik-seznam_ur_teden.
And for reference, I find the jsoup Selector documentation very helpful with examples.
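To make the difference between the attribute selectors concrete, here is a minimal sketch. The HTML snippet, the class suffixes, and the class and method names (PrefixClassSelector, selectByClassPrefix, selectByClassSubstring) are made up for illustration and are not taken from the real page. It assumes jsoup is on the classpath. Note that [class^=...] compares against the whole class attribute string, so it only matches when the prefix is the very first thing in the attribute; [class*=...] matches the text anywhere in it.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.select.Elements;

public class PrefixClassSelector {
    // Hypothetical markup: the real page's structure and class suffixes are unknown.
    static final String SAMPLE_HTML =
          "<table class=\"ednevnik-seznam_ur_teden 123\"><tr><th><div>A</div></th></tr></table>"
        + "<table class=\"highlight ednevnik-seznam_ur_teden-today\"><tr><th><div>B</div></th></tr></table>"
        + "<table class=\"unrelated\"><tr><th><div>C</div></th></tr></table>";

    // Matches tables whose class attribute string STARTS with the given text.
    static Elements selectByClassPrefix(Document doc, String prefix) {
        return doc.select("table[class^=" + prefix + "]");
    }

    // Matches tables whose class attribute string CONTAINS the given text anywhere.
    static Elements selectByClassSubstring(Document doc, String text) {
        return doc.select("table[class*=" + text + "]");
    }

    public static void main(String[] args) {
        Document doc = Jsoup.parse(SAMPLE_HTML);
        String name = "ednevnik-seznam_ur_teden";

        // Only the first table: its class attribute begins with the name.
        System.out.println(selectByClassPrefix(doc, name).size());    // prints 1

        // First and second table: the name appears somewhere in the attribute.
        System.out.println(selectByClassSubstring(doc, name).size()); // prints 2
    }
}
```

If the changing class sits on the th or td cells rather than on the table itself, the same [class^=...] or [class*=...] form can be applied to those elements instead.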
Upvotes: 1