kakaman997

Reputation: 9

Jsoup parses more fields than necessary (Java)

I've been trying to parse the city names from this page: https://en.wikipedia.org/wiki/List_of_cities_in_Switzerland

I'm new to jsoup. I tried to fetch only the names of the cities, but I get the text of every link in each city's row.

// Fetch and parse the page
Document doc = Jsoup.connect("https://en.wikipedia.org/wiki/List_of_cities_in_Switzerland").userAgent("Mozilla").get();
String title = doc.title();

// Select every row of every table with class "wikitable"
Elements test = doc.select("table.wikitable").select("tbody").select("tr");

for (Element link : test) {
    // Collect the links found in the row's cells and print their combined text
    Elements temp = link.select("td").select("a");
    System.out.println(temp.text());
}

For example, I get "Aarberg Aarberg Bern" when I want just "Aarberg".

Upvotes: 0

Views: 31

Answers (1)

Pshemo

Reputation: 124225

You are overcomplicating things by chaining so many select calls. You can simplify your code by using a single select whose selector describes the elements you want to find. Use a space to express an ancestor-descendant relationship.

The actual problem is that select("td") picks every td in each selected tr. The following select("a") then gathers every a link from all of those td elements, which is why every link in the row is printed.

To pick only the first td in each tr, you can use the selector td:eq(0). Then you can pick each a from that first td.

Your code can be simplified to something like:

Elements links = doc.select("table.wikitable tr td:eq(0) a");

for (Element link : links) {
    System.out.println(link.text());
}
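To check the difference without hitting Wikipedia, here is a minimal offline sketch. The class name FirstColumnDemo, the method names, and the sample markup are all made up for illustration; the markup only imitates the shape of the rows described in the question (three linked cells per row), it is not the real page. It assumes a jsoup version that provides Elements.eachText().

```java
import java.util.List;

import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;

public class FirstColumnDemo {

    // Hypothetical sample markup: one header row, then rows with three linked cells each.
    public static final String SAMPLE_HTML =
            "<table class=\"wikitable\">"
          + "<tr><th>Name</th><th>Municipality</th><th>Canton</th></tr>"
          + "<tr><td><a href=\"#\">Aarberg</a></td><td><a href=\"#\">Aarberg</a></td>"
          + "<td><a href=\"#\">Bern</a></td></tr>"
          + "<tr><td><a href=\"#\">Aarau</a></td><td><a href=\"#\">Aarau</a></td>"
          + "<td><a href=\"#\">Aargau</a></td></tr>"
          + "</table>";

    // Links from every td of every row: this mirrors what the question's code collects.
    public static List<String> allCellLinks(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("table.wikitable tr td a").eachText();
    }

    // Links from only the first cell of each row, using td:eq(0).
    public static List<String> firstCellLinks(String html) {
        Document doc = Jsoup.parse(html);
        return doc.select("table.wikitable tr td:eq(0) a").eachText();
    }

    public static void main(String[] args) {
        System.out.println(allCellLinks(SAMPLE_HTML));   // every link in every row
        System.out.println(firstCellLinks(SAMPLE_HTML)); // only the first-column links
    }
}
```

Note that :eq(0) matches a td only when it is the very first child element of its row. If a table puts a th in the first position of each data row, td:eq(0) would match nothing there and the index would need adjusting.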

To learn more about selectors, visit http://jsoup.org/cookbook/extracting-data/selector-syntax, where you can find the description of :eq(n):

:eq(n): find elements whose sibling index is equal to n; e.g. form input:eq(1)

Upvotes: 3
