Reputation: 235
I want to scrape data from this website: http://myservices.ect.nl/tracing/objectstatus/Pages/Overview.aspx
I have used JSoup before on more static HTML sites, but this one is harder: before the HTML table appears I have to click a button, and I don't know whether JSoup can operate the button.
After clicking the button I get an HTML table, and I only want the rows where the modality is Barge.
Thank you for the tip to use Firefox. Now I get the table along with some other page information. Can you tell me how I can get only the table information? The output I get is as follows:
Upvotes: 3
Views: 1973
Reputation: 17461
You will have to use Selenium (for example, its HtmlUnit driver) for that.
Here is a full working example. It visits the website, clicks the button, and then reads the data from the page. The example uses FirefoxDriver, so it opens a real Firefox window.
Edit: the example now prints only the table values.
import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.support.ui.Select;

public class GetData {
    public static void main(String[] args) throws InterruptedException {
        WebDriver driver = new FirefoxDriver();
        try {
            driver.get("http://myservices.ect.nl/tracing/objectstatus/Pages/Overview.aspx");
            // Fixed pauses give the page time to load; they are crude but simple.
            Thread.sleep(5000);

            // Select "Barge" in the modality drop-down.
            new Select(driver.findElement(By.id("ctl00_ctl15_g_ce17bd4b_3803_47f6_822a_2b8dd10fc67d_ctl00_dlModality"))).selectByVisibleText("Barge");

            // Click the button that loads the table.
            Thread.sleep(3000);
            driver.findElement(By.className("button80")).click();
            Thread.sleep(5000);

            // Read only the text of the table element.
            WebElement table = driver.findElement(By.className("grid-view"));
            String htmlTableText = table.getText();

            // Do whatever you want now; these are the raw table values.
            System.out.println(htmlTableText);
        } finally {
            // quit() closes every window and ends the session, so a separate close() is not needed.
            driver.quit();
        }
    }
}
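The raw text printed by the example above still has to be broken into rows before it is useful. Here is a minimal sketch of that step, using only the standard library. It assumes that getText() on the table yields one table row per line, which is typical but not guaranteed. The class name, the method names and the sample text are all made up for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

public class TableTextSketch {

    // Splits raw table text into trimmed, non-empty lines.
    // Assumption: each line of the text corresponds to one table row.
    static List<String> rows(String tableText) {
        List<String> rows = new ArrayList<>();
        for (String line : tableText.split("\\R")) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                rows.add(trimmed);
            }
        }
        return rows;
    }

    // Keeps only the rows that contain the keyword, ignoring case.
    static List<String> rowsContaining(String tableText, String keyword) {
        String needle = keyword.toLowerCase(Locale.ROOT);
        List<String> matches = new ArrayList<>();
        for (String row : rows(tableText)) {
            if (row.toLowerCase(Locale.ROOT).contains(needle)) {
                matches.add(row);
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        // Made-up sample text, NOT real output from the site.
        String sample = "Container Modality Status\n"
                + "AAAA0000001 Barge Released\n"
                + "\n"
                + "BBBB0000002 Truck Pending\n";
        for (String row : rowsContaining(sample, "barge")) {
            System.out.println(row);
        }
    }
}
```

Since the Selenium example already selects Barge in the drop-down, the keyword filter may be redundant there. It is mainly useful if you read the unfiltered table.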
Upvotes: 3
Reputation: 14061
A "click" like this one (or any interaction of that sort) is a request to the server and a response to the browser. So a possible solution is to use JSoup not on the initial page but on the result page. For instance, send a POST to the page that returns the table, passing the parameter that selects the modality Barge. You can use a tool like Firebug (for Firefox) or Chrome Developer Tools to inspect the conversation (request/response) so that you can emulate it with your own code.
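To make the idea concrete, here is a rough sketch of building such a POST with the JDK's own java.net.http client (Java 11+). The field names modalityField and searchButton are hypothetical placeholders; the real names and values have to be copied from the request you see in the browser's developer tools. Because the page is an .aspx page, the form probably also expects hidden fields such as __VIEWSTATE taken from the initial GET response, but that is an assumption about this particular site.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

public class PostbackSketch {

    // Encodes form fields as an application/x-www-form-urlencoded body.
    static String formEncode(Map<String, String> fields) {
        StringJoiner body = new StringJoiner("&");
        for (Map.Entry<String, String> field : fields.entrySet()) {
            body.add(URLEncoder.encode(field.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(field.getValue(), StandardCharsets.UTF_8));
        }
        return body.toString();
    }

    // Builds (but does not send) a form POST to the given URL.
    static HttpRequest buildPost(String url, Map<String, String> fields) {
        return HttpRequest.newBuilder(URI.create(url))
                .header("Content-Type", "application/x-www-form-urlencoded")
                .POST(HttpRequest.BodyPublishers.ofString(formEncode(fields)))
                .build();
    }

    // Sends the request and returns the response body (the result page's HTML).
    static String fetch(HttpRequest request) throws Exception {
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) {
        Map<String, String> fields = new LinkedHashMap<>();
        // Placeholder: copy the real hidden-field value from the initial page.
        fields.put("__VIEWSTATE", "value-from-initial-page");
        // Hypothetical field names: replace with the ones shown in the developer tools.
        fields.put("modalityField", "Barge");
        fields.put("searchButton", "Search");

        HttpRequest request = buildPost(
                "http://myservices.ect.nl/tracing/objectstatus/Pages/Overview.aspx", fields);
        // Nothing is sent here; call fetch(request) once the field names are real.
        System.out.println(request.method() + " " + request.uri());
        System.out.println(formEncode(fields));
    }
}
```

The HTML string returned by fetch can then be handed to Jsoup.parse so the table is extracted the same way as on a static page.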
Upvotes: 2