Daniel Vaughn

Reputation: 23

How to read a URL with ~300 MB of JSON text

I'm trying to read the text from https://mtgjson.com/api/v5/AllPrintings.json. I have tried with this code:

URL url = new URL("https://mtgjson.com/api/v5/AllPrintings.json");
HttpsURLConnection conn = (HttpsURLConnection) url.openConnection();

BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream())); // error here

String inputLine;
StringBuffer content = new StringBuffer();
while ((inputLine = in.readLine()) != null) {
    content.append(inputLine);
}
System.out.println(content);

I keep getting an IOException from conn.getInputStream() on the line that creates the BufferedReader. The text at the URL does not contain any newline characters. How can I read this data?

(Edit)
I'm using Java 1.8 with Apache NetBeans 16. I'm sticking with 1.8 so I can also use Eclipse Neon3.

Error:

java.io.IOException: Server returned HTTP response code: 403 for URL: https://mtgjson.com/api/v5/AllPrintings.json
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1894)
    at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1492)
    at sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:263)
    at tests.MtgJson.main(MtgJson.java:44)

I've also tried running curl through ProcessBuilder. That gives better results, but curl stalls after about a minute. If I terminate the program inside NetBeans, curl continues, but it doesn't always finish writing the file. I shouldn't have to stop my program for curl to continue. Is there something I'm missing to make curl work?

String command = "curl --keepalive-time 5 https://mtgjson.com/api/v5/AllPrintings.json";
ProcessBuilder pb = new ProcessBuilder(command.split(" "));
pb.redirectOutput(new File("AllPrintings.json"));
Process process = pb.start();
// use while() or process.waitFor();
while(process.isAlive())
    Thread.sleep(1000);
process.destroy();

Answer (since I can't post one):

String command = "curl https://mtgjson.com/api/v5/AllPrintings.json";
ProcessBuilder pb = new ProcessBuilder(command.split(" "));

pb.inheritIO(); // keep the program from hanging
pb.redirectOutput(new File("AllPrintings.json"));

Process process = pb.start();
process.waitFor(); // waiting for the process to terminate.

With this change the complete file is created without hanging, and then the program exits. curl writes progress information to the console, and that output must be consumed; otherwise the pipe buffer fills up and curl blocks.
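
A variant of the same idea, as a minimal sketch rather than a definitive version: the arguments are passed as a list instead of splitting a string on spaces, only curl's stderr is inherited (that is where the progress meter goes once stdout is redirected), and the exit code is returned so the caller can tell whether the download succeeded. The class name CurlDownload, the helper names, and the choice of the --fail and --location flags are assumptions for illustration; the code also assumes curl is on the PATH.

```java
import java.io.File;
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper class; the names are illustrative only.
class CurlDownload {

    // Build the argument list explicitly so a URL containing spaces
    // stays a single argument (String.split(" ") would break it up).
    static List<String> curlCommand(String url) {
        // --fail: non-zero exit code on HTTP errors such as 403
        // --location: follow redirects
        return Arrays.asList("curl", "--fail", "--location", url);
    }

    // Runs curl, writes the response body to 'output', returns curl's exit code.
    static int run(String url, File output) throws IOException, InterruptedException {
        ProcessBuilder pb = new ProcessBuilder(curlCommand(url));
        // Send curl's progress meter (stderr) to this program's console so
        // the pipe never fills up and blocks curl.
        pb.redirectError(ProcessBuilder.Redirect.INHERIT);
        // Send the response body (stdout) straight to the file.
        pb.redirectOutput(output);
        Process process = pb.start();
        return process.waitFor(); // 0 means curl reported success
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.out.println("usage: CurlDownload <url> <output-file>");
            return;
        }
        int exit = run(args[0], new File(args[1]));
        System.out.println("curl exited with code " + exit);
    }
}
```

Run it as, for example: java CurlDownload https://mtgjson.com/api/v5/AllPrintings.json AllPrintings.json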

Upvotes: 0

Views: 201

Answers (1)

DuncG

Reputation: 15186

There is no need for a byte-to-character conversion with BufferedReader just to make a copy. Instead, copy the content directly to a file with Java NIO's Files.copy, then use the output file for any further processing:

Path file = Path.of("big.json");
// Path.of requires Java 11+; on Java 8 use Paths.get("big.json")
Files.copy(conn.getInputStream(), file);
System.out.println("Saved "+Files.size(file)+" bytes to "+file);

This should print something like:

Saved 313144388 bytes to big.json
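
For reference, here is a self-contained sketch of the same approach that compiles on Java 8. It rests on an assumption the thread does not confirm: that the 403 is caused by the server rejecting the default Java User-Agent, so the code sends an explicit one. The class name DownloadToFile, the helper names, the "Mozilla/5.0" value, and the timeout values are all placeholders. REPLACE_EXISTING is passed because Files.copy otherwise throws FileAlreadyExistsException when the target file is already there.

```java
import java.io.IOException;
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;

// Hypothetical helper class; the names are illustrative only.
class DownloadToFile {

    // Copies the raw bytes of the stream to 'target', overwriting any
    // existing file, and returns the number of bytes written.
    static long save(InputStream in, Path target) throws IOException {
        return Files.copy(in, target, StandardCopyOption.REPLACE_EXISTING);
    }

    // Opens the URL with an explicit User-Agent and streams the body to 'target'.
    static long download(String address, Path target) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(address).openConnection();
        // Assumption: the server rejects the default "Java/1.8..." agent.
        conn.setRequestProperty("User-Agent", "Mozilla/5.0"); // placeholder value
        conn.setConnectTimeout(15_000); // milliseconds, arbitrary choice
        conn.setReadTimeout(60_000);    // milliseconds, arbitrary choice
        int status = conn.getResponseCode();
        if (status != HttpURLConnection.HTTP_OK) {
            throw new IOException("Unexpected HTTP status " + status + " for " + address);
        }
        try (InputStream in = conn.getInputStream()) {
            return save(in, target);
        } finally {
            conn.disconnect();
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: DownloadToFile <url> <output-file>");
            return;
        }
        Path file = Paths.get(args[1]); // Paths.get works on Java 8
        long bytes = download(args[0], file);
        System.out.println("Saved " + bytes + " bytes to " + file);
    }
}
```

Because the bytes go straight from the socket to disk, the 300 MB body is never held in memory as a String, and the absence of newline characters does not matter.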

Upvotes: 0
