herrherr

Reputation: 698

Rename millions of files on Linux

I need to rename about 2 million images. The file names look like image.jpg?arg=value and need to be renamed to image.jpg, without the arguments.

Here is what I'm currently doing:

sudo find . -name "*.jpg?*" -exec rename 's/(\?.*)//' {} \;

This gets the job done but seems to take forever. Does anyone have a suggestion on how to speed this up?

Upvotes: 1

Views: 929

Answers (3)

dat789

Reputation: 2073

I tried this on Ubuntu 14.04 but it does not work. The command executed successfully but nothing happened. I figured that the rename regex part is not right. To check this:

$ echo Screenshot_from_2015-08-17_122834.png.de4Mzv2 | sed 's/(\?.*)//'
Screenshot_from_2015-08-17_122834.png.de4Mzv2

But changing the regex to the following works:

$ echo Screenshot_from_2015-08-17_122834.png.de4Mzv2 | sed 's/.[^.]*$//'
Screenshot_from_2015-08-17_122834.png
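One caveat about the first check: the rename expression is a Perl regular expression, while sed without -E uses basic regular expressions, where ( and ) are literal characters and, in GNU sed, \? is a quantifier. The screenshot name also contains no "?" at all, so there was nothing for the original pattern to match. A small sketch of testing the question's pattern against the question's own file name, assuming a sed that supports -E (GNU and BSD sed do):

```shell
#!/usr/bin/env bash
# The name from the question, which (unlike the screenshot names) contains "?".
name='image.jpg?arg=value'

# -E selects extended regular expressions, where \? is a literal question mark.
stripped=$(printf '%s\n' "$name" | sed -E 's/\?.*//')
echo "$stripped"
```

So the original pattern is fine for names with a query string, and the second pattern is the one to use for names like the screenshots here.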

Using that in the command suggested by @realspirituals, I have the following files:

$ ls -ltr
Screenshot_from_2015-08-19_114601.png.somegthingy 
Screenshot_from_2015-08-17_122834.png.de4Mzv2 
Screenshot_from_2015-08-17_122455.png.ac84Lk1
Screenshot_from_2015-08-13_154012.png.uNl34sH 
Screenshot_from_2015-08-13_101459.png.53rv1ce 
Screenshot_from_2015-08-13_101437.png.l4Pt0pz 
Screenshot_from_2015-08-13_101230.png.p31Ic4n

$ sudo find . -name "*.png*" -type f -print0 | xargs -0 -P4 -n1 rename 's/\.[^.]*$//'
Screenshot_from_2015-08-19_114601.png 
Screenshot_from_2015-08-17_122834.png 
Screenshot_from_2015-08-17_122455.png
Screenshot_from_2015-08-13_154012.png 
Screenshot_from_2015-08-13_101459.png 
Screenshot_from_2015-08-13_101437.png 
Screenshot_from_2015-08-13_101230.png

Upvotes: 0

Srini V

Reputation: 11355

Can you try this:

sudo find . -name "*.jpg*" -print0 | xargs -0 -P4 -n1 rename 's/(\?.*)//'

From the man page of xargs

   --max-procs=max-procs
   -P max-procs
          Run  up  to max-procs processes at a time; the default is 1.  If
          max-procs is 0, xargs will run as many processes as possible  at
          a  time.   Use the -n option with -P; otherwise chances are that
          only one exec will be done.

Here I am limiting the number of child processes to 4. If you want more, use -P0, which runs as many child processes as possible, but remember that this can put a heavy load on your CPU.
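A good part of the time in the original command likely goes into starting one rename (Perl) process per file, so handing many names to each child process may help as much as -P does. Here is a minimal sketch, assuming bash and no "?" in any directory name. It swaps rename for mv plus the shell's ${f%%\?*} expansion, and the temporary directory and file names are made up for the demo:

```shell
#!/usr/bin/env bash
# Hypothetical demo directory; point find at the real image tree instead.
demo=$(mktemp -d)
touch "$demo/a.jpg?x=1" "$demo/b.jpg?y=2" "$demo/c.jpg?z=3"

# [?] matches a literal question mark in find's glob pattern.
# Each child bash receives up to 500 names (-n 500), and up to 4 children
# run at once (-P 4). ${f%%\?*} drops everything from the first "?" onward.
find "$demo" -type f -name '*.jpg[?]*' -print0 |
  xargs -0 -n 500 -P 4 bash -c 'for f in "$@"; do mv -- "$f" "${f%%\?*}"; done' bash

ls "$demo"
```

On the real tree, replace "$demo" with . and tune -n and -P. Note that two source names that map to the same target (image.jpg?a=1 and image.jpg?a=2) would overwrite each other.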

Or use GNU parallel.

Upvotes: 5

jerik

Reputation: 5767

Parallelize the renaming. Start two (or three, or four) shells and run the command in each. Be sure to separate the images between the commands somehow, so that no two commands run on the same images.
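One way to keep the commands apart is to give each one its own subdirectory. The sketch below does that from a single shell with background jobs instead of separate terminals. It assumes bash and that the images are spread over several subdirectories; rename_tree, the directory names, and the file names are hypothetical, and mv stands in for rename:

```shell
#!/usr/bin/env bash
# Hypothetical layout: images spread over several subdirectories.
root=$(mktemp -d)
mkdir "$root/one" "$root/two"
touch "$root/one/a.jpg?x=1" "$root/two/b.jpg?y=2"

# rename_tree (hypothetical helper) strips "?..." from names below one directory.
# "{} +" makes find pass many names to each bash invocation.
rename_tree() {
  find "$1" -type f -name '*.jpg[?]*' -exec bash -c \
    'for f in "$@"; do mv -- "$f" "${f%%\?*}"; done' bash {} +
}

# One background job per subdirectory, so no two jobs see the same file.
for dir in "$root"/*/; do
  rename_tree "$dir" &
done
wait  # block until every background job has finished
```

If everything sits in one flat directory, the same idea works with disjoint name patterns per job, for example one job for names starting with a to m and another for the rest.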

Upvotes: 2
