user2669150

Reputation: 49

Get Google search results in Java

This program gets a 403 response code, but I need a 200 to get the search results back. What can I do?

      String url="http://www.google.com/search?q=";
      String charset="UTF-8";
      String key="java";
      String query = String.format("%s",URLEncoder.encode(key, charset));
      URLConnection con = new URL(url+ query).openConnection();
      BufferedReader in = new BufferedReader(new InputStreamReader(con.getInputStream()));
      String inputLine;
      while ((inputLine = in.readLine()) != null)
          System.out.println(inputLine);
      in.close();

Upvotes: 1

Views: 6629

Answers (4)

Grooveek

Reputation: 10094

The 403 response is clear enough: Google's servers are telling you that the way you're doing this is neither authorized nor tolerated.

Google prohibits automated queries, and you use them at your own risk of being blocked at any time.

If you want to go down this road, you'll have to understand why you are blocked (User-Agent, IP address, header fingerprinting, etc.). They have many ways to tell whether you're a bot.

Upvotes: 3

Hartator

Reputation: 5155

As an alternative to JSoup, you can use this package.

Code sample:

Map<String, String> parameter = new HashMap<>();
parameter.put("q", "Coffee");
parameter.put("location", "Portland");
GoogleSearchResults serp = new GoogleSearchResults(parameter);

JsonObject data = serp.getJson();
JsonArray results = (JsonArray) data.get("organic_results");
JsonObject first_result = results.get(0).getAsJsonObject();
System.out.println("first coffee: " + first_result.get("title").getAsString());

Upvotes: 0

MariuszS

Reputation: 31595

Try JSoup:

Document document = Jsoup
        .connect("http://www.google.com/search?q=" + query)
        .userAgent("Mozilla/5.0 (Windows NT 6.1; WOW64; rv:5.0) Gecko/20100101 Firefox/5.0")
        .get();

System.out.println(document.html());

To extract links, use the selector API.

Dependency:

<dependency>
  <!-- jsoup HTML parser library @ http://jsoup.org/ -->
  <groupId>org.jsoup</groupId>
  <artifactId>jsoup</artifactId>
  <version>1.7.3</version>
</dependency>

Upvotes: 4

gfelisberto

Reputation: 1723

Google is blocking the default User-Agent sent by Java. You can send a different one to trick Google. Simply add:

con.setRequestProperty("User-Agent", "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2");

after creating the con and before starting to read.
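
Combined with the code from the question, a minimal sketch might look like the following. The class name `UserAgentSearchSketch` and the helpers `buildSearchUrl` and `fetch` are made up for illustration, and the sketch assumes Java 10 or later for the `URLEncoder.encode(String, Charset)` overload. Google may still answer with a redirect, a CAPTCHA page, or another 403, so treat this as a sketch of where the header goes, not as a guaranteed fix.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URI;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

class UserAgentSearchSketch {
    // Browser-like User-Agent string, copied from the line above.
    static final String USER_AGENT =
        "Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.4; en-US; rv:1.9.2.2) Gecko/20100316 Firefox/3.6.2";

    // Hypothetical helper: appends the form-encoded query to the search URL.
    static String buildSearchUrl(String key) {
        return "http://www.google.com/search?q=" + URLEncoder.encode(key, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: sends the request and returns the response body as text.
    static String fetch(String key) throws IOException {
        HttpURLConnection con =
            (HttpURLConnection) URI.create(buildSearchUrl(key)).toURL().openConnection();
        // Must be set after creating the connection and before reading from it.
        con.setRequestProperty("User-Agent", USER_AGENT);

        int status = con.getResponseCode();
        if (status != HttpURLConnection.HTTP_OK) {
            throw new IOException("Unexpected HTTP status: " + status);
        }

        StringBuilder body = new StringBuilder();
        try (BufferedReader in = new BufferedReader(
                new InputStreamReader(con.getInputStream(), StandardCharsets.UTF_8))) {
            String line;
            while ((line = in.readLine()) != null) {
                body.append(line).append('\n');
            }
        }
        return body.toString();
    }

    public static void main(String[] args) {
        try {
            System.out.println(fetch("java"));
        } catch (IOException e) {
            // Covers network failures as well as non-200 responses such as 403.
            System.err.println("Request failed: " + e.getMessage());
        }
    }
}
```

Checking `getResponseCode()` before reading the stream makes a block easier to diagnose than the exception `getInputStream()` throws on a 403.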

Upvotes: 1
