RoboticR

Reputation: 121

Downloading images from AWS gives me corrupt files

I'm trying to download images that are hosted on Amazon Web Services. My methods work fine on any other host, but downloading an image from this URL, for example http://s3-eu-west-1.amazonaws.com/static.melkweg.nl/uploads/images/scaled/event_header/18226, is giving me trouble. The file does download, but it is only 49 KB and cannot be opened.

I've tried different approaches such as Apache's FileUtils.copyURLToFile, BufferedInputStream, ImageIO, etc. Some throw errors; most just download a corrupt file.

Here are the methods I've tried:

public static void downloadApache(String imageurl, String target)
{
    try
    {
        File file = new File(target);
        URL url = new URL(imageurl);
        FileUtils.copyURLToFile(url, file);
    }
    catch(Exception e)
    {
        System.err.println("[3]Something went wrong.");
    }
}

public static void downloadImage(String imageurl, String name)
{
    try
    {
        URL url = new URL(imageurl);
        InputStream in = new BufferedInputStream(url.openStream());
        OutputStream out = new BufferedOutputStream(new FileOutputStream(name));

        for ( int i; (i = in.read()) != -1; ) {
            out.write(i);
        }
        in.close();
        out.close();
    }
    catch(Exception e)
    {
        e.printStackTrace();
        System.err.println("[0]Something went wrong.");
    }
}

public static void downloadImageIO(String imageurl, String target)
{
    try
    {
        URL url = new URL(imageurl);    
        BufferedImage image = ImageIO.read(url);
        ImageIO.write(image, "jpg", new File(target));
    }
    catch(Exception e)
    {
        e.printStackTrace();
        System.err.println("[1]Something went wrong.");
    }
}

public static void downloadImageCopy(String imageurl, String target)
{
    try
    {
        try (InputStream in = new URL(imageurl).openStream()) {
            Files.copy(in, Paths.get(target), StandardCopyOption.REPLACE_EXISTING);
        }
    }
    catch(Exception e)
    {
        e.printStackTrace();
        System.err.println("[2]Something went wrong.");
    }
}

And here's the main method if that is of any interest

public static void main(String[] args)
{
    String imageurl = "http://s3-eu-west-1.amazonaws.com/static.melkweg.nl/uploads/images/scaled/event_header/18226";
    String name = "downloaded_image.jpg";
    String target = "C:/Users/Robotic/Downloads/" + name;
    Download.downloadImage(imageurl, name);
    Download.downloadImageCopy(imageurl, target);
    Download.downloadImageIO(imageurl, target);
    Download.downloadApache(imageurl, target);
}

Thanks in advance.

Upvotes: 1

Views: 677

Answers (2)

Arafat Nalkhande

Reputation: 11718

As pointed out in the other answer, the file is served gzip-compressed. You can use the following method to download it and then unzip it:

public static void downloadApache(String imageurl, String target) {
    try {
        // Download the raw (gzip-compressed) response to a temporary file.
        File file = new File(target + ".gzip");
        URL url = new URL(imageurl);
        FileUtils.copyURLToFile(url, file);

        // Decompress the temporary file into the real target file.
        byte[] buffer = new byte[1024];
        try (java.util.zip.GZIPInputStream gzis =
                 new java.util.zip.GZIPInputStream(new FileInputStream(file));
             FileOutputStream out = new FileOutputStream(target)) {
            int len;
            while ((len = gzis.read(buffer)) > 0) {
                out.write(buffer, 0, len);
            }
        } catch (IOException ex) {
            ex.printStackTrace();
        }

    } catch (Exception e) {
        System.err.println("[3]Something went wrong.");
    }
}

Upvotes: 0

Abhinav Upadhyay

Reputation: 2585

The file that you are getting from S3 is gzip-compressed; you need to decompress it before trying to read it.

$ wget http://s3-eu-west-1.amazonaws.com/static.melkweg.nl/uploads/images/scaled/event_header/18226
$ file 18226                        
18226: gzip compressed data, from Unix
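Rather than saving the compressed bytes and unzipping afterwards, you can also check the response's Content-Encoding header and unwrap the stream on the fly. A minimal sketch of that idea (the class and method names here are illustrative, not from the question):

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.util.zip.GZIPInputStream;

public class GzipAwareDownload {

    // Wraps the raw stream in a GZIPInputStream when the server declares
    // the body as gzip-encoded; otherwise returns the stream untouched.
    public static InputStream decodeIfGzipped(InputStream raw, String contentEncoding)
            throws IOException {
        if ("gzip".equalsIgnoreCase(contentEncoding)) {
            return new GZIPInputStream(raw);
        }
        return raw;
    }

    public static void download(String imageurl, String target) throws IOException {
        HttpURLConnection conn = (HttpURLConnection) new URL(imageurl).openConnection();
        // conn.getContentEncoding() returns the Content-Encoding header, or null.
        try (InputStream in = decodeIfGzipped(conn.getInputStream(), conn.getContentEncoding());
             OutputStream out = new FileOutputStream(target)) {
            byte[] buffer = new byte[8192];
            int len;
            while ((len = in.read(buffer)) > 0) {
                out.write(buffer, 0, len);
            }
        }
    }
}
```

This works regardless of whether a given S3 object was uploaded with gzip metadata or not, since the stream is only decompressed when the header says so.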

Upvotes: 1
