Setekorrales

Reputation: 105

Jsoup Scraping HTML dynamic content

I'm new to Jsoup and I have been trying to write a small piece of code that gets the names of the items in a Steam inventory using Jsoup.

public Element getItem(String user) throws IOException{
    Document doc;

    doc = Jsoup.connect("http://steamcommunity.com/id/"+user+"/inventory").get();
    Element element = doc.getElementsByClass("hover_item_name").first();
    return element;
}

This method returns:

<h1 class="hover_item_name" id="iteminfo0_item_name"></h1>

and I want the information between the h1 tags, which is generated when you click on a specific window. Thank you in advance.

Upvotes: 3

Views: 3472

Answers (2)

Mad Matts

Reputation: 1116

You can use the .select(String cssQuery) method:

doc.select("h1") gives you all h1 elements. If you need the actual text inside these tags, call .text() on each Element. If you need an attribute such as class or id, call .attr(String attributeKey) on an Element, e.g.:

doc.getElementsByClass("hover_item_name").first().attr("id")

gives you "iteminfo0_item_name"
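As a minimal sketch of the calls above, the following parses the h1 from the question out of a string rather than from the live page. The class name HoverItemDemo, the helper methods, and the "Example Item" text are hypothetical and only there for illustration.

```java
import org.jsoup.Jsoup;
import org.jsoup.nodes.Document;
import org.jsoup.nodes.Element;

class HoverItemDemo {
    // Text of the first h1 with class "hover_item_name",
    // or "" when no such element exists or it has no text.
    static String itemName(String html) {
        Document doc = Jsoup.parse(html);
        Element el = doc.select("h1.hover_item_name").first();
        return el == null ? "" : el.text();
    }

    // Value of the id attribute of the same element, or "" when it is missing.
    static String itemId(String html) {
        Element el = Jsoup.parse(html).select("h1.hover_item_name").first();
        return el == null ? "" : el.attr("id");
    }

    public static void main(String[] args) {
        // The markup from the question: the tag exists but carries no text yet.
        String empty = "<h1 class=\"hover_item_name\" id=\"iteminfo0_item_name\"></h1>";
        // Hypothetical markup after the page's JavaScript has filled the tag in.
        String filled = "<h1 class=\"hover_item_name\" id=\"iteminfo0_item_name\">Example Item</h1>";

        System.out.println(itemId(empty));
        System.out.println(itemName(empty).isEmpty());
        System.out.println(itemName(filled));
    }
}
```

The empty result for the first string is the situation in the question: .text() can only return what is already in the HTML that Jsoup received.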

But if you need to perform clicks on a website, you can't do that with Jsoup, because Jsoup is an HTML parser and not a browser alternative. It does not execute JavaScript, so it can't handle dynamic content.

What you could do instead is first scrape the relevant data from your h1 tags and then send a new request yourself (for example with .post()), that is, reproduce the AJAX call the page makes.
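A rough sketch of sending such a request yourself is shown here. It uses the JDK's java.net.http client (Java 11+) instead of Jsoup, and it assumes you have already found the URL of the page's AJAX call, for instance in the browser's network tab. The class name DataRequestSketch and the example.com URL are placeholders, not a real Steam endpoint.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

class DataRequestSketch {
    // Builds a GET request for the endpoint the page itself calls.
    // The endpoint URL is an input: it has to be discovered by the reader.
    static HttpRequest buildRequest(String endpointUrl) {
        return HttpRequest.newBuilder(URI.create(endpointUrl))
                .header("Accept", "application/json")
                .GET()
                .build();
    }

    // Sends the request and returns the raw response body as a string.
    static String fetchBody(String endpointUrl) throws IOException, InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        HttpResponse<String> response =
                client.send(buildRequest(endpointUrl), HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        // Placeholder URL; nothing is sent here, the request is only built and printed.
        HttpRequest request = buildRequest("https://example.com/inventory-data");
        System.out.println(request.method() + " " + request.uri());
    }
}
```

If the endpoint expects a POST with form data, the same builder accepts .POST(HttpRequest.BodyPublishers.ofString(...)) in place of .GET(). Which one applies depends on what the page actually sends.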

If you would rather use a real WebDriver, have a look at Selenium.

Upvotes: 2

Pedro Lobito

Reputation: 98901

Use .text() and return a String, e.g.:

public String getItem(String user) throws IOException {
    Document doc = Jsoup.connect("http://steamcommunity.com/id/" + user + "/inventory").get();
    Element element = doc.getElementsByClass("hover_item_name").first();
    // first() returns null when nothing matches, so guard before calling text()
    return element == null ? "" : element.text();
}

Upvotes: 0
