zahir hussain

Reputation: 3739

Read the URL content

I want to read the content of a URL as bytes. I only need to read the first 64 KB of the content.

public void readUrlBytes(String address) {
    StringBuilder builder = new StringBuilder();
    BufferedInputStream input = null;
    byte[] buffer = new byte[1024];
    int i = 0;
    try {
        URL url = new URL(address);
        URLConnection urlc = url.openConnection();
        input = new BufferedInputStream(urlc.getInputStream());
        int bytesRead;
        while ((bytesRead = input.read(buffer)) != -1) {
            builder.append(bytesRead);
            if (i==64) {
                break;
            }
            i++;
        }
        System.out.println(builder.toString());
    } catch (IOException l_exception) {
        //handle or throw this
    } finally {
        if (input != null) {
            try {
                input.close();
            } catch(IOException igored) {}
        }
    }

}

The code above reads the content character-wise.

I need to read it as bytes.

Upvotes: 0

Views: 1296

Answers (6)

Bozho

Reputation: 597422

If you remove the cast to char, you have a byte.

If you're going to store the whole content into memory, you can use ByteArrayOutputStream and write each byte to it. Finally call toByteArray() to obtain the array of bytes:

ByteArrayOutputStream baos = new ByteArrayOutputStream();
int byteRead;
// input is the BufferedInputStream from the question
while ((byteRead = input.read()) != -1) {
    baos.write(byteRead);
}

byte[] result = baos.toByteArray();

Update: you mentioned you want only 64 KB. To achieve that, check whether baos.size() has reached 64*1024 and break out of the loop.
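A minimal sketch of that size check, reading in chunks rather than byte by byte. The class name LimitedRead and the helper readAtMost are made up for illustration, and an in-memory stream stands in for urlc.getInputStream() so the example runs without a network connection:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

class LimitedRead {
    // Hypothetical helper: reads at most `limit` bytes from `in`,
    // returning fewer if the stream ends first.
    static byte[] readAtMost(InputStream in, int limit) throws IOException {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        while (baos.size() < limit) {
            // Never ask for more than what is still missing.
            int want = Math.min(chunk.length, limit - baos.size());
            int n = in.read(chunk, 0, want);
            if (n == -1) {
                break; // end of stream before the limit was reached
            }
            baos.write(chunk, 0, n);
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        // In-memory stand-in for urlc.getInputStream(): 100000 bytes.
        InputStream in = new ByteArrayInputStream(new byte[100000]);
        byte[] first = readAtMost(in, 64 * 1024);
        System.out.println(first.length); // prints 65536
    }
}
```

With a real connection you would pass urlc.getInputStream() instead and close it in a finally block, as in the question.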

Upvotes: 0

BalusC

Reputation: 1109875

You want to get the first 64 KB from a URL into a byte[]?

That's easy:

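public byte[] getFirst64KbFromUrl(String address) throws IOException {
    InputStream input = null;
    byte[] first64kb = new byte[64 * 1024];
    int total = 0;
    try {
        input = new URL(address).openStream();
        int n;
        // A single read() may return fewer bytes than requested,
        // so keep reading until the buffer is full or the stream ends.
        while (total < first64kb.length
                && (n = input.read(first64kb, total, first64kb.length - total)) != -1) {
            total += n;
        }
    } finally {
        if (input != null) try { input.close(); } catch(IOException ignore) {}
    }
    // Trim the array if the content was shorter than 64 KB (needs java.util.Arrays).
    return Arrays.copyOf(first64kb, total);
}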

If you actually have a problem with converting those bytes to String, here's how you could do it:

String string = new String(first64kb);

This, however, uses the platform default encoding. You would rather use the encoding specified by the server, which is available in the Content-Type response header.

URLConnection connection = new URL(address).openConnection();
// ...
String contentType = connection.getHeaderField("Content-Type");
String charset = "UTF-8"; // Let's default it to UTF-8.
if (contentType != null) { // The header may be absent.
    for (String param : contentType.replace(" ", "").split(";")) {
        if (param.startsWith("charset=")) {
            charset = param.split("=", 2)[1];
            break;
        }
    }
}
// ...
String string = new String(first64kb, charset);
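The header parsing can also be pulled out into a small helper so it can be tried without a connection. This is only a sketch: the class name ContentTypeCharset and the method charsetOf are invented for illustration, and it additionally tolerates a missing header, an upper-case parameter name, and a quoted value, which are assumptions about what a server might send:

```java
import java.util.Locale;

class ContentTypeCharset {
    // Hypothetical helper: extracts the charset parameter from a Content-Type
    // header value, or returns `fallback` when none is present.
    static String charsetOf(String contentType, String fallback) {
        if (contentType == null) {
            return fallback; // header missing
        }
        for (String param : contentType.split(";")) {
            String p = param.trim();
            if (p.toLowerCase(Locale.ROOT).startsWith("charset=")) {
                String value = p.substring("charset=".length()).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? fallback : value;
            }
        }
        return fallback;
    }

    public static void main(String[] args) {
        System.out.println(charsetOf("text/html; charset=ISO-8859-1", "UTF-8")); // ISO-8859-1
        System.out.println(charsetOf("text/html", "UTF-8"));                     // UTF-8
    }
}
```

The result can then be passed to new String(first64kb, charset) as above.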


Upvotes: 0

JTeagle

Reputation: 2196

I'm adding a separate answer as I suddenly realised another way the question could be interpreted: I think the OP wants to convert a stream of bytes representing the internal format of characters in a specific character set into the corresponding characters. For example, converting ASCII codes into ASCII characters.

This isn't a complete answer, but hopefully will put the OP on the right track if I've understood correctly. I'm using utf-8 as an example here:

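BufferedInputStream istream = new BufferedInputStream(urlc.getInputStream());
// available() only reports how many bytes can be read without blocking,
// not the total length, so collect the stream in a loop instead.
ByteArrayOutputStream collected = new ByteArrayOutputStream();
byte[] buffer = new byte[4096];
int n;
while ((n = istream.read(buffer)) != -1) {
    collected.write(buffer, 0, n);
}

ByteBuffer tempBuffer = ByteBuffer.wrap(collected.toByteArray());
Charset utf8Chars = Charset.forName("UTF-8");
CharBuffer chars = utf8Chars.decode(tempBuffer);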

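Now you have a buffer of chars as Java sees them, so they can be printed as a string. chars.toString() returns the decoded text; chars.array() exposes the backing char[], which may be longer than the decoded text, so respect chars.limit() if you use it.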

WARNING: You will need to get the entire stream into a byte buffer before trying to decode; decoding a buffer when you don't know the correct end of the character's internal byte sequence will result in corrupt characters!
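To illustrate the warning, here is a small sketch (the class name SplitCharDemo and the decode helper are invented for this example, and it assumes Java 7+ for StandardCharsets). The character é occupies two bytes in UTF-8; decoding only the first of them yields the replacement character U+FFFD rather than é:

```java
import java.nio.charset.StandardCharsets;

class SplitCharDemo {
    // Decodes `length` bytes starting at `offset` as UTF-8.
    static String decode(byte[] bytes, int offset, int length) {
        return new String(bytes, offset, length, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] bytes = "\u00e9".getBytes(StandardCharsets.UTF_8); // 0xC3 0xA9

        String whole = decode(bytes, 0, 2); // both bytes: decodes correctly
        String half = decode(bytes, 0, 1);  // cut in the middle of the sequence

        System.out.println(whole.equals("\u00e9")); // prints true
        System.out.println(half.equals("\uFFFD"));  // prints true
    }
}
```

The same thing can happen at a fixed 64 KB cut-off if the boundary falls inside a multi-byte sequence, so the last character of a truncated buffer may come out corrupt.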

Upvotes: 0

Greg Case

Reputation: 3240

Like Bozho said, you already are reading in bytes. However, it's probably more efficient to read everything into a byte array rather than doing it one byte at a time.

BufferedInputStream input = null;
byte[] buffer = new byte[4096];
try {
    URLConnection urlc = url.openConnection();
    input = new BufferedInputStream(urlc.getInputStream());
    int bytesRead;
    while ((bytesRead = input.read(buffer)) != -1) {
        // do something with the bytes; the array holds data from index 0 to bytesRead (exclusive)
    }
} catch (IOException l_exception) {
    // handle or throw this
} finally {
    if (input != null) {
        try {
            input.close();
        } catch (IOException ignored) {}
    }
}

Upvotes: 1

ZZ Coder

Reputation: 75496

This is how I did it:

input = urlc.getInputStream();
byte[] buffer = new byte[4096];
int n = -1;

ByteArrayOutputStream baos = new ByteArrayOutputStream(4096);

while ((n = input.read(buffer)) != -1) {
    if (n > 0) {
        baos.write(buffer, 0, n);
    }
}
byte[] bytes = baos.toByteArray();

Upvotes: 0

JTeagle

Reputation: 2196

You can simply read directly from the InputStream object returned:

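  InputStream istream = urlc.getInputStream();

  int byteRead;
  // read() returns the next byte as an int in the range 0..255, or -1 at end of stream
  while ((byteRead = istream.read()) != -1)
    builder.append(byteRead); // appends the decimal value of the byte

  istream.close();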

Upvotes: 0
