Reputation: 9
I've been trying to parse the cities from this page: https://en.wikipedia.org/wiki/List_of_cities_in_Switzerland
I'm new to jsoup, so I tried to fetch just the names of the cities, but I get every link in each city's row instead.
Document doc = Jsoup.connect("https://en.wikipedia.org/wiki/List_of_cities_in_Switzerland").userAgent("Mozilla").get();
String title = doc.title();
Elements test = doc.select("table.wikitable").select("tbody").select("tr");
for (Element link : test) {
    Elements temp = link.select("td").select("a");
    System.out.println(temp.text());
}
For example, I get "Aarberg Aarberg Bern", while I wanted just "Aarberg".
Upvotes: 0
Views: 31
Reputation: 124225
You are overcomplicating things by chaining so many select invocations. You can simplify your code by using a single select whose selector describes every element you want to find. Use a space to express an ancestor-descendant relationship.
The actual problem is that select("td") picks every td in the selected tr, and then you gather every a link in all of those td elements, which is why you get the links from every column.
To pick only the first td in each tr, you can use the selector td:eq(0). Then you can pick each a from each first td.
Your code can be simplified to something more like:
Elements links = doc.select("table.wikitable tr td:eq(0) a");
for (Element link : links) {
    System.out.println(link.text());
}
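If you want to try the selector without hitting Wikipedia, here is a minimal offline sketch. It assumes jsoup is on the classpath; the class name FirstColumnDemo, the helper firstColumnLinkTexts, and the sample markup are hypothetical stand-ins I made up to mimic the shape of the table (a header row of th cells, then rows of three linked td cells), not the real page source.

```java
import java.util.ArrayList;
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

public class FirstColumnDemo {

    // Hypothetical sample markup: a simplified imitation of the wikitable,
    // with one header row and two data rows of three linked cells each.
    static final String SAMPLE_HTML =
            "<table class=\"wikitable\">"
          + "<tr><th>Name</th><th>Municipality</th><th>Canton</th></tr>"
          + "<tr><td><a href=\"#\">Aarberg</a></td>"
          + "<td><a href=\"#\">Aarberg</a></td>"
          + "<td><a href=\"#\">Bern</a></td></tr>"
          + "<tr><td><a href=\"#\">Aarau</a></td>"
          + "<td><a href=\"#\">Aarau</a></td>"
          + "<td><a href=\"#\">Aargau</a></td></tr>"
          + "</table>";

    // Returns the text of every link found in the first td of each row.
    static List<String> firstColumnLinkTexts(String html) {
        Document doc = Jsoup.parse(html);
        List<String> names = new ArrayList<>();
        // td:eq(0) matches only td elements whose sibling index is 0,
        // so links in the second and third columns are skipped.
        for (Element link : doc.select("table.wikitable tr td:eq(0) a")) {
            names.add(link.text());
        }
        return names;
    }

    public static void main(String[] args) {
        for (String name : firstColumnLinkTexts(SAMPLE_HTML)) {
            System.out.println(name);
        }
    }
}
```

With this sample the header row contributes nothing, because it contains th cells rather than td cells, and each data row contributes only its first link.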
To learn more about selectors, visit http://jsoup.org/cookbook/extracting-data/selector-syntax, where you can find a description of :eq(n):
:eq(n): find elements whose sibling index is equal to n; e.g. form input:eq(1)
Upvotes: 3