How can I losslessly crop a jpeg in R

Question

I am new to R. I have a folder full of RGB images which are not all of the same dimensions. I need them all to have the same dimensions, which involves resizing a bunch of them. I wrote the following code to get this done:

#EBImage
library(EBImage)
path = "G:/Images/"
file.names = dir(path,full.names = TRUE, pattern =".jpeg")
reqd_dim = c(3099,2329,3)
sprintf("Number of Image Files is: %d", length(file.names))

for(i in 1:length(file.names)){
  correction_flag = FALSE
  print("Loop Number:")
  flush.console()
  print(i)
  flush.console()
  img = readImage(file.names[i])
  # Checking if the dimensions are the same
  for (j in 1:length(reqd_dim)) {
    if(dim(img)[j]!=reqd_dim[j]){
      correction_flag = TRUE
      break
    }
  }
  if(correction_flag==TRUE){
    print("Correcting dimensions of the image")
    flush.console()
    writeImage(img[1:3099, 1:2329, 1:3],file.names[i],quality = 100)
  }
}

My problem is that while the images are originally between 500 and 600 KB in size, the ones that are resized end up being between 1.8 and 2 MB. In my particular case the images come in one of two sizes: 3100x2329 or 3099x2329. So my resizing involves removing the extra column of pixels to make all images 3099x2329. I am OK with the file size going down a bit, as I expect some information to be lost, but in my case the file size increases more than three-fold.

Alternatively, I have thought of converting the images into matrices (which is supported by EBImage) and removing the extra row. But I have two issues here: one is that I don't know how to do it, and the other is that even if I found a way to do it, I'm afraid I might lose some information if I ever needed to convert the matrix back to an image. I'm open to an improvement over this approach, or to a totally different one. My only requirement is that I need to be able to resize my images in R without adding or losing any information (apart from the information in the pixels to be removed).

aoles · Accepted Answer

To perform lossless JPEG cropping you can use jpegtran, an external command line tool distributed as part of the IJG library. For example, the following command removes the last column of pixels from a 768x512 image:

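jpegtran -crop 767x512+0+0 -optimize image.jpg >cropped.jpg

The -crop switch specifies the rectangular subarea WxH+X+Y, and -optimize is an option for reducing file size without quality loss by optimizing the Huffman table. Note that the output is redirected to a different file: redirecting it to the input file would make the shell truncate that file before jpegtran gets to read it. For a complete list of switches see jpegtran -help.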
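The WxH+X+Y geometry is easy to get wrong when it is assembled by hand, so here is a minimal sketch of a helper that builds it in R. The function name crop_geometry is made up for this illustration and is not part of jpegtran or EBImage.

```r
# Hypothetical helper: build the geometry string expected by jpegtran -crop.
# width/height give the size of the region to keep, x/y its upper-left corner.
crop_geometry <- function(width, height, x = 0, y = 0) {
  stopifnot(width > 0, height > 0, x >= 0, y >= 0)
  sprintf("%dx%d+%d+%d",
          as.integer(width), as.integer(height),
          as.integer(x), as.integer(y))
}

crop_geometry(767, 512)
## [1] "767x512+0+0"
```

For the images in the question the call would be crop_geometry(3099, 2329). As far as I understand the jpegtran documentation, the crop only starts exactly where requested when the upper-left corner sits on a block boundary, which is always the case for an offset of +0+0.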

Once jpegtran is installed on your system, it can be invoked from R by system(). The following example first takes a sample image and saves it as JPEG. The image is then cropped, and the pixel values are compared to the values from the original image.

library("EBImage")

# resave a sample image as JPG
f = system.file("images", "sample.png", package="EBImage")
writeImage(readImage(f), "image.jpg", quality=90)

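# do the cropping; -outfile is used instead of ">" because system() on
# Windows does not run the command through a shell, so redirection fails there
system("jpegtran -crop 767x512+0+0 -optimize -outfile cropped.jpg image.jpg")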

# compare file size
file.size("image.jpg", "cropped.jpg")
## [1] 65880 65005

original = readImage("image.jpg")
dim(original)
## [1] 768 512

cropped  = readImage("cropped.jpg")
dim(cropped)
## [1] 767 512

# check whether original values are retained
identical(original[1:767,], cropped)
## [1] TRUE
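If you need this in more than one place, the call can be wrapped in a small function. The following is only a sketch: the names jpegtran_args and lossless_crop are my own, and it assumes that jpegtran is on the PATH and that your build supports the -outfile switch.

```r
# Hypothetical helper: assemble the jpegtran arguments for cropping to
# width x height, starting at the upper-left corner of the image.
jpegtran_args <- function(input, output, width, height) {
  c("-crop", sprintf("%dx%d+0+0", as.integer(width), as.integer(height)),
    "-optimize",
    "-outfile", shQuote(output),   # quoting protects paths with spaces
    shQuote(input))
}

# Hypothetical wrapper: run jpegtran and stop with an error if it fails.
lossless_crop <- function(input, output, width, height) {
  if (!nzchar(Sys.which("jpegtran")))
    stop("jpegtran was not found on the PATH")
  status <- system2("jpegtran", jpegtran_args(input, output, width, height))
  if (status != 0)
    stop("jpegtran failed with exit status ", status)
  invisible(output)
}
```

With this in place, the cropping step from the example would read lossless_crop("image.jpg", "cropped.jpg", 767, 512). Keeping input and output as separate files means a failed run cannot damage the original.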

Back to your specific use case: your script could be further improved by examining the image dimensions without actually loading the whole pixel array into R. For this you could, for example, use RBioFormats to read only the image metadata, which contains the image dimensions. Alternatively, you can use identify, another command line tool distributed as part of the ImageMagick suite, to retrieve the image dimensions, as illustrated below.

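path = "G:/Images/"
file.names = dir(path, full.names = TRUE, pattern = "\\.jpeg$")
reqd_dim = c(3099,2329,3)
cat(sprintf("Number of Image Files is: %d\n", length(file.names)))

for (i in seq_along(file.names)) {
  file = file.names[i]
  cat(sprintf("Checking dimensions of image number %d: ", i))
  flush.console()

  # identify prints e.g. "c(3100, 2329)", which is parsed into a numeric vector
  cmd = paste('identify -format "c(%w, %h)"', shQuote(file))
  res = eval(parse(text=system(cmd, intern=TRUE)))

  # Checking if the dimensions are the same; identify only reports
  # width and height, so compare against the first two required values
  if ( all(res==reqd_dim[1:2]) ) {
    cat("OK\n")
    flush.console()
  }
  else {
    cat("Correcting\n")
    flush.console()
    # crop into a temporary file first: writing straight onto the input
    # file could truncate it before jpegtran has read it
    tmp = tempfile(fileext = ".jpeg")
    system(sprintf("jpegtran -crop %dx%d+0+0 -optimize -outfile %s %s",
                   reqd_dim[1], reqd_dim[2], shQuote(tmp), shQuote(file)))
    file.copy(tmp, file, overwrite = TRUE)
    unlink(tmp)
  }
}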
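
If neither RBioFormats nor ImageMagick is at hand, the dimensions can also be read directly from the JPEG header using base R only. The function below is a rough sketch of that idea, not a full parser: jpeg_dim is a name I made up, and it assumes a well-formed file in which the frame header (SOF marker) is preceded only by ordinary length-prefixed segments.

```r
# Hypothetical helper: return c(width, height, channels) of a JPEG file
# by walking its header segments, without decoding any pixel data.
jpeg_dim <- function(path) {
  con <- file(path, "rb")
  on.exit(close(con))
  # 16-bit big-endian unsigned integer, as used throughout the JPEG header
  read_u16 <- function()
    readBin(con, "integer", n = 1, size = 2, signed = FALSE, endian = "big")

  # every JPEG file starts with the SOI marker FF D8
  soi <- readBin(con, "raw", n = 2)
  if (!identical(soi, as.raw(c(0xFF, 0xD8))))
    stop("not a JPEG file: ", path)

  # SOF markers are C0..CF except C4 (DHT), C8 (JPG) and CC (DAC)
  sof <- setdiff(0xC0:0xCF, c(0xC4, 0xC8, 0xCC))
  repeat {
    marker <- as.integer(readBin(con, "raw", n = 2))
    if (length(marker) < 2 || marker[1] != 0xFF)
      stop("no frame header found in: ", path)
    len <- read_u16()              # segment length, including these 2 bytes
    if (length(len) == 0)
      stop("truncated file: ", path)
    if (marker[2] %in% sof) {
      readBin(con, "raw", n = 1)   # sample precision, not needed here
      height   <- read_u16()
      width    <- read_u16()
      channels <- as.integer(readBin(con, "raw", n = 1))
      return(c(width, height, channels))
    }
    readBin(con, "raw", n = len - 2)   # skip the rest of this segment
  }
}
```

The result uses the same ordering as dim() on an EBImage color image, so for a file of the required size all(jpeg_dim(file) == reqd_dim) would be TRUE and the eval(parse(...)) step would no longer be needed.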
