Reputation: 93
keywordexist = false;
try {
    res = Jsoup
            .connect(bingSearchUrl.replaceAll("keyword", "intitle:\"" + keyword + "\""))
            .userAgent("Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.15 (KHTML, like Gecko) Chrome/24.0.1295.0 Safari/537.15")
            .referrer("http://www.bing.com")
            .method(Connection.Method.GET)
            .execute();
    doc = res.parse();
    System.out.println(bingSearchUrl.replaceAll("keyword", "intitle:\"" + keyword + "\""));
    elements = doc.select("li[class^=b_algo]");
    System.out.println(doc.html());
    System.out.println(elements.html());
    // String divContents =
    // doc.select(".id-app-orig-desc").first().text();
    // elements.remove("div");
    if (elements.html().contains("<strong>" + keyword + "</strong>")) {
        keywordexist = true;
        System.out.println("keyword exists");
    }
} catch (IOException e) {
    // TODO Auto-generated catch block
    e.printStackTrace();
}
I'm trying to use jsoup to check a list of keywords against Bing Search, but whenever I run my program, jsoup always ends up on Bing's captcha page. Is there any way to avoid this? I thought adding a user agent and a referrer would fix it, but it doesn't seem to have any effect.
Upvotes: 3
Views: 1445
Reputation: 43013
I used code similar to yours and got all the results. However, here are two points I noticed:
I think you should slow down between searches. For example, add a random pause of 3000 to 5000 ms between two requests.
Don't forget to escape the query parameters:
String bingSearchUrl = "http://www.bing.com/search?q=keyword";
String keyword = "stackoverflow jsoup";
String uaString = "Mozilla/5.0 (Windows NT 6.2; WOW64) AppleWebKit/537.15 (KHTML, like Gecko) Chrome/24.0.1295.0 Safari/537.15";
// replace() substitutes the placeholder literally; replaceAll() would treat it as a regex
String url = bingSearchUrl.replace("keyword", URLEncoder.encode("intitle:\"" + keyword + "\"", "UTF-8"));
Document doc = Jsoup.connect(url).userAgent(uaString).get();
System.out.println(doc.select("li h2"));
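To put both points together, here is a minimal sketch of how the random pause and the escaping might be combined when looping over several keywords. The class name `BingQueryHelper` and the methods `buildSearchUrl` and `randomPauseMillis` are hypothetical names made up for this illustration; they are not part of jsoup. The sketch uses only the JDK, so the actual jsoup request is left as a comment, and it makes no claim that this is enough to avoid the captcha page in every case.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;
import java.util.concurrent.ThreadLocalRandom;

// Hypothetical helper for illustration only; not part of jsoup.
class BingQueryHelper {

    // Same URL template as above; "keyword" is the placeholder to substitute.
    static final String SEARCH_URL_TEMPLATE = "http://www.bing.com/search?q=keyword";

    // Builds the search URL with the intitle:"..." query percent-encoded.
    static String buildSearchUrl(String keyword) throws UnsupportedEncodingException {
        String query = "intitle:\"" + keyword + "\"";
        // String.replace substitutes literally, so no regex handling is involved.
        return SEARCH_URL_TEMPLATE.replace("keyword", URLEncoder.encode(query, "UTF-8"));
    }

    // Returns a random delay in the inclusive range [minMs, maxMs].
    static long randomPauseMillis(long minMs, long maxMs) {
        return ThreadLocalRandom.current().nextLong(minMs, maxMs + 1);
    }

    public static void main(String[] args) throws Exception {
        String[] keywords = {"stackoverflow jsoup", "c++ & java"};
        for (int i = 0; i < keywords.length; i++) {
            String url = buildSearchUrl(keywords[i]);
            System.out.println(url);
            // The Jsoup.connect(url).userAgent(uaString).get() call would go here.
            if (i < keywords.length - 1) {
                // Wait 3 to 5 seconds before the next search.
                Thread.sleep(randomPauseMillis(3000, 5000));
            }
        }
    }
}
```

Encoding matters most for keywords such as the second one, where `+` and `&` would otherwise change the meaning of the query string.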
Upvotes: 2