I am writing a script that shows the user a webpage based on their requirements (which I take as input) and opens it in Firefox. For instance, a crude version of this is:
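#!/bin/bash
# Ask for the search term; -r keeps backslashes in the input literal
read -r -p "What do you want to search? " search_term
link="http://www.mywebsite_whatever.com/search?q=${search_term}"
# Quote the URL so a term containing spaces is passed as one argument
firefox "$link"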
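The crude version pastes the raw input into the URL, so a term containing spaces or `&` would produce a malformed query. A minimal sketch of percent-encoding the term first, assuming bash and ASCII input; `urlencode` and `build_search_url` are hypothetical helper names, not part of the original script:

```shell
#!/bin/bash
# Hypothetical helper: percent-encode a string for use as a query parameter.
# Written with ASCII input in mind; non-ASCII input is not handled reliably.
urlencode() {
    local LC_ALL=C            # make the character ranges below match ASCII only
    local input=$1 out='' ch i
    for (( i = 0; i < ${#input}; i++ )); do
        ch=${input:i:1}
        case $ch in
            [a-zA-Z0-9.~_-]) out+=$ch ;;                     # unreserved: keep as-is
            *) printf -v ch '%%%02X' "'$ch"; out+=$ch ;;     # anything else: %XX
        esac
    done
    printf '%s\n' "$out"
}

# Hypothetical helper: build the search URL from the question for a given term.
build_search_url() {
    printf 'http://www.mywebsite_whatever.com/search?q=%s\n' "$(urlencode "$1")"
}
```

With these in place, the last line of the script could become `firefox "$(build_search_url "$search_term")"`.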
The major problem is that I cannot wget/urllib this website, since I don't have permission.
What I want to do now is:
Have the user look over only certain keywords on the webpage. For that, I want to either:
Open Firefox with the find box (Ctrl + F) open and the keyword already in it (without changing the source code of Firefox), or
Somehow have Firefox open the website, save it as HTML, and quit (I can't wget). Then I can grep out the keywords as desired.
[Please don't start off on how this is unethical and all. I am doing this merely as an exercise.]
I am working on Linux.
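For the second option, the grep step could look roughly like the sketch below. It assumes the page has already been saved to a local HTML file by some other means; `count_keywords` is a hypothetical name, and the tag stripping is deliberately crude (it only removes tags that open and close on the same line):

```shell
#!/bin/bash
# Hypothetical helper: count_keywords FILE KEYWORD...
# Prints, for each keyword, how many lines of visible text contain it.
count_keywords() {
    local file=$1 kw count
    shift
    for kw in "$@"; do
        # Strip single-line tags, then count case-insensitive fixed-string matches.
        # grep -c exits non-zero when the count is 0, hence the "|| true".
        count=$(sed 's/<[^>]*>//g' "$file" | grep -c -i -F -- "$kw" || true)
        printf '%s: %s\n' "$kw" "$count"
    done
}
```

A call such as `count_keywords page.html cheese bread` would print one `keyword: count` line per keyword, where `page.html` stands for whatever file the page was saved to.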
Upvotes: 1
Views: 3689
Reputation: 30647
Use Wget with the --user-agent switch so that the website thinks you're using Firefox, for example:
wget --user-agent="Mozilla/5.0 (X11; Ubuntu; Linux i686; rv:12.0) Gecko/20100101 Firefox/12.0"
Of course, for a permanent script you should instead use --user-agent="MyScript/1.0 (http://mywebsite/)" or similar, so that if it goes haywire they know who to contact.
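Wrapped into a script, this might look like the sketch below. The `fetch_page` function name and the `USER_AGENT` variable are assumptions made for illustration; `--quiet`, `--user-agent`, and `--output-document` are standard Wget options. Whether the site accepts the request still depends on how it decides who has permission.

```shell
#!/bin/bash
# Identify the script honestly, as suggested above (value taken from this answer).
USER_AGENT="MyScript/1.0 (http://mywebsite/)"

# Hypothetical helper: fetch_page URL OUTPUT_FILE
# Downloads URL to OUTPUT_FILE while sending the custom User-Agent header.
fetch_page() {
    local url=$1 out=$2
    wget --quiet --user-agent="$USER_AGENT" --output-document="$out" "$url"
}
```

For example, `fetch_page "$link" page.html` would save the page to `page.html`, which can then be searched with grep.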
Upvotes: 1
Reputation: 8587
To control a browser from your program, try Selenium. It supports Java, Python, etc.
See example source code from: http://seleniumhq.org/docs/03_webdriver.html
package org.openqa.selenium.example;

import org.openqa.selenium.By;
import org.openqa.selenium.WebDriver;
import org.openqa.selenium.WebElement;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.support.ui.ExpectedCondition;
import org.openqa.selenium.support.ui.WebDriverWait;

public class Selenium2Example {
    public static void main(String[] args) {
        // Create a new instance of the Firefox driver
        // Notice that the remainder of the code relies on the interface,
        // not the implementation.
        WebDriver driver = new FirefoxDriver();

        // And now use this to visit Google
        driver.get("http://www.google.com");
        // Alternatively the same thing can be done like this
        // driver.navigate().to("http://www.google.com");

        // Find the text input element by its name
        WebElement element = driver.findElement(By.name("q"));

        // Enter something to search for
        element.sendKeys("Cheese!");

        // Now submit the form. WebDriver will find the form for us from the element
        element.submit();

        // Check the title of the page
        System.out.println("Page title is: " + driver.getTitle());

        // Google's search is rendered dynamically with JavaScript.
        // Wait for the page to load, timeout after 10 seconds
        (new WebDriverWait(driver, 10)).until(new ExpectedCondition<Boolean>() {
            public Boolean apply(WebDriver d) {
                return d.getTitle().toLowerCase().startsWith("cheese!");
            }
        });

        // Should see: "cheese! - Google Search"
        System.out.println("Page title is: " + driver.getTitle());

        // Close the browser
        driver.quit();
    }
}
Upvotes: 4