Reputation: 1387
I recently stumbled upon the JSoup library, so I decided to experiment with it by creating a Google query program.
The idea is to type in a Google search, enter the number of results you want to display, display them, and then ask the user for one more integer: the index shown next to the link they want to open.
The problem is that the second prompt never waits for input. The program prints the prompt and closes.
NOTE: I know I can just go to Google myself and search. I'm just experimenting with this new library that scratched that part of my brain that makes me want to look further into something.
Here is the code, and the output -- Sorry if it's sloppy. Still learning...:
    import java.io.IOException;
    import java.util.Scanner;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    public class GoogleSearchJava {

        static int index;
        static String linkHref;

        public static final String GOOGLE_SEARCH_URL = "https://www.google.com/search";

        public static void main(String[] args) throws IOException {
            // GET INPUT FOR SEARCH TERM
            Scanner input = new Scanner(System.in);
            System.out.print("Search: ");
            String searchTerm = input.nextLine();
            System.out.print("Enter number of query results: ");
            int num = input.nextInt();

            String searchURL = GOOGLE_SEARCH_URL + "?q=" + searchTerm + "&num=" + num;

            // NEED TO DEFINE USER AGENT TO PREVENT 403 ERROR.
            Document document = Jsoup.connect(searchURL).userAgent("Mozilla/5.0").get();

            // OPTION TO DISPLAY HTML FILE IN BROWSER. DON'T KNOW YET.
            // System.out.println(document.html());

            // If Google's search results HTML changes from <h3 class="r"> to <h3 class="r1">,
            // the selector below needs to change accordingly.
            Elements results = document.select("h3.r > a");

            index = 0;
            String news = "News";

            for (Element result : results) {
                index++;
                linkHref = result.attr("href");
                String linkText = result.text();
                String pingResult = index + ": " + linkText + ", URL:: " + linkHref.substring(6, linkHref.indexOf("&"));

                if (pingResult.contains(news)) {
                    System.out.println("FOUND " + "\"" + linkText + "\"" + "NO HYPERTEXT FOR NEWS QUERY RESULTS AT THIS TIME. SKIPPED INDEX.");
                    System.out.println();
                } else {
                    System.out.println(pingResult);
                }
            }

            System.out.println();
            System.out.println();

            goToURL(linkHref, input);
        }

        public static int goToURL(String hRef, Scanner input) {
            try {
                System.out.print("Enter Index (i.e. 1, 2, etc) you wish to visit, 0 to exit: ");
                int newIndex = input.nextInt();

                for (int i = 0; i < index; i++) {
                    if (newIndex == index) {
                        /*
                        RUNNING LINUX COMMAND WITH RUNTIME CLASS TO CONCATENATE THE HYPERLINK SUBSTRING
                        */
                        Process process = Runtime.getRuntime().exec("xdg-open " + hRef.substring(6, hRef.indexOf("&")));
                        process.waitFor();
                        break;
                    } else if (newIndex == 0) {
                        System.out.println("Shutting program down.");
                        System.exit(0);
                    }
                }
            } catch (Exception e) {
                System.out.println("ERROR while parsing URL");
            }
            return index;
        }
    }
Here is the output. It stops before the new Scanner can take input:
Search: Oracle
Enter number of query results: 3
1: Oracle | Integrated Cloud Applications and Platform Services, URL:: =http://www.oracle.com/
2: Oracle Corporation - Wikipedia, the free encyclopedia, URL:: =https://en.wikipedia.org/wiki/Oracle_Corporation
3: Oracle (@Oracle) | Twitter, URL:: =https://twitter.com/oracle%3Flang%3Den
Enter Index (i.e. 1, 2, etc) you wish to visit, 0 to exit: Shutting program down.
Process finished with exit code 0
As you can see, it goes straight to the else statement that shuts the program down. Any help would be greatly appreciated. This is a fun project, and I look forward to completing it.
Upvotes: 0
Views: 1615
Reputation: 1387
Per the suggestion of an SO team member, I asked why Scanner was not asking for input. Technically speaking, I fixed the problem with the program stopping BEFORE getting input. Though a problem still exists where it is not actually processing the input, the previous problem was fixed and here is my solution.
I did not close the original Scanner, and added the Scanner as a parameter to my "goToURL" method. I also removed an else statement that was closing the program, as the input to allow the program to keep running is still buggy. Nonetheless, here is the "working" code that at least solves the original problem.
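The shared-Scanner idea can be sketched on its own, without the JSoup parts. This assumes the earlier version closed the first Scanner before reading again; closing a Scanner that wraps System.in also closes System.in, so a second Scanner created afterwards has nothing left to read. The class and method names below (SharedScannerSketch, readAll, readInt, readLine, Answers) are made up for illustration, and the input is canned so the sketch runs unattended:

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;
import java.util.Scanner;

public class SharedScannerSketch {

    /** Hypothetical holder for the three values the program asks for. */
    static final class Answers {
        final String searchTerm;
        final int resultCount;
        final int chosenIndex;

        Answers(String searchTerm, int resultCount, int chosenIndex) {
            this.searchTerm = searchTerm;
            this.resultCount = resultCount;
            this.chosenIndex = chosenIndex;
        }
    }

    // Reads one full line, e.g. the search term.
    static String readLine(Scanner in, String prompt) {
        System.out.print(prompt);
        return in.nextLine();
    }

    // Reads an int, then discards the rest of that line so a later
    // nextLine() call does not pick up the leftover newline.
    static int readInt(Scanner in, String prompt) {
        System.out.print(prompt);
        int value = in.nextInt();
        if (in.hasNextLine()) {
            in.nextLine();
        }
        return value;
    }

    // One Scanner is created for the whole run and handed to every helper.
    // It is deliberately not closed between prompts.
    static Answers readAll(InputStream source) {
        Scanner in = new Scanner(source, "UTF-8");
        String term = readLine(in, "Search: ");
        int count = readInt(in, "Enter number of query results: ");
        int choice = readInt(in, "Enter index to visit, 0 to exit: ");
        return new Answers(term, count, choice);
    }

    public static void main(String[] args) {
        // Canned input stands in for the keyboard; a real program would pass System.in.
        InputStream canned = new ByteArrayInputStream(
                "Oracle\n3\n2\n".getBytes(StandardCharsets.UTF_8));
        Answers a = readAll(canned);
        System.out.println();
        System.out.println(a.searchTerm + " / " + a.resultCount + " / " + a.chosenIndex);
    }
}
```

Passing the Scanner (or the stream it wraps) as a parameter keeps every prompt on the same input source, which is the same change I made to goToURL.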
Additionally, I placed the String elements (pingResult) into an ArrayList to improve the loop structure in the goToURL method. I felt this was a decent way to go about using a simple data structure for accessing elements:
    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.Scanner;

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    public class GoogleSearchJava {

        static int index;
        static String linkHref;

        public static final String GOOGLE_SEARCH_URL = "https://www.google.com/search";

        public static void main(String[] args) throws IOException {
            // GET INPUT FOR SEARCH TERM
            Scanner input = new Scanner(System.in);
            System.out.print("Search: ");
            String searchTerm = input.nextLine();
            System.out.print("Enter number of query results: ");
            int num = input.nextInt();

            String searchURL = GOOGLE_SEARCH_URL + "?q=" + searchTerm + "&num=" + num;

            // NEED TO DEFINE USER AGENT TO PREVENT 403 ERROR.
            Document document = Jsoup.connect(searchURL).userAgent("Mozilla/5.0").get();

            // OPTION TO DISPLAY HTML FILE IN BROWSER. DON'T KNOW YET.
            // System.out.println(document.html());

            // If Google's search results HTML changes from <h3 class="r"> to <h3 class="r1">,
            // the selector below needs to change accordingly.
            Elements results = document.select("h3.r > a");

            index = 0;
            String news = "News";

            /*
            THIS WILL ADD THE pingResult STRINGS TO AN ARRAYLIST
            */
            ArrayList<String> displayResults = new ArrayList<>();

            for (Element result : results) {
                index++;
                linkHref = result.attr("href");
                String linkText = result.text();
                String pingResult = index + ": " + linkText + ", URL:: " + linkHref.substring(6, linkHref.indexOf("&")) + "\n";

                if (pingResult.contains(news)) {
                    System.out.println("FOUND " + "\"" + linkText + "\"" + "NO HYPERTEXT FOR NEWS QUERY RESULTS AT THIS TIME. SKIPPED INDEX.");
                    System.out.println();
                } else {
                    displayResults.add(pingResult);
                }
            }

            for (String urlString : displayResults) {
                System.out.println(urlString);
            }

            System.out.println();
            System.out.println();

            goToURL(linkHref, input, displayResults);
        }

        public static int goToURL(String hRef, Scanner input, ArrayList<String> resultList) {
            try {
                System.out.print("Enter Index (i.e. 1, 2, etc) you wish to visit, 0 to exit: ");
                index = input.nextInt();

                for (String string : resultList) {
                    if (string.startsWith(Integer.toString(index))) {
                        Process process = Runtime.getRuntime().exec("xdg-open " + hRef.substring(6, hRef.indexOf("&")));
                        process.waitFor();
                    }
                }
            } catch (Exception e) {
                System.out.println("ERROR while parsing URL");
            }
            return index;
        }
    }
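For the part that is still buggy, here is one possible direction, offered as a sketch and not as the finished fix. In the code above, goToURL always opens hRef, which is whatever link the loop saw last, and startsWith("1") would also match a line beginning "10: ". Keeping the cleaned URLs in a list and looking one up by its 1-based position avoids both. The sketch assumes the hrefs look like "/url?q=<target>&...", which would explain the stray "=" in the printed URLs (substring(6) starts one character early); the "&sa=U" suffix in the sample data and the names ResultPickerSketch, extractTarget, and pick are made up for illustration. It also assumes Java 10 or later for URLDecoder.decode(String, Charset):

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;

public class ResultPickerSketch {

    // Assumed shape of the redirect links in the result page.
    private static final String PREFIX = "/url?q=";

    // Drops the "/url?q=" prefix if present, cuts at the first '&' after it,
    // and percent-decodes what is left (e.g. %3F becomes '?').
    static String extractTarget(String href) {
        int start = href.startsWith(PREFIX) ? PREFIX.length() : 0;
        int end = href.indexOf('&', start);
        if (end < 0) {
            end = href.length();
        }
        return URLDecoder.decode(href.substring(start, end), StandardCharsets.UTF_8);
    }

    // Maps the 1-based index typed by the user to a stored URL.
    // Returns an empty Optional for 0 (exit) or any out-of-range value.
    static Optional<String> pick(List<String> urls, int choice) {
        if (choice < 1 || choice > urls.size()) {
            return Optional.empty();
        }
        return Optional.of(urls.get(choice - 1));
    }

    public static void main(String[] args) {
        // Hypothetical hrefs standing in for result.attr("href").
        List<String> urls = new ArrayList<>();
        urls.add(extractTarget("/url?q=http://www.oracle.com/&sa=U"));
        urls.add(extractTarget("/url?q=https://twitter.com/oracle%3Flang%3Den&sa=U"));

        // Prints "https://twitter.com/oracle?lang=en"
        System.out.println(pick(urls, 2).orElse("nothing selected"));
    }
}
```

With something like this, goToURL would receive the list of URLs instead of a single hRef and would only call xdg-open when pick returns a value.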
Upvotes: 1