Reputation: 57
I have a list of files from the beginning of June till today that have arbitrary names with the date in them.
They are:
JUNE 13.pdf
JUNE 16 MIDTERM REVIEW LAND GRABS 1 EXCLUSION AND BACKGROUND,.pdf
LECTURE JUNE 21 RUBBER RIGHTS RESISTANCE.pdf
SUMMER JUNE 15 TITLING AND LEGIBILITY.pdf
SUMMER JUNE 22 LIVELIHOODS.pdf
summer JUN 14 PROPERTY RIGHTS LECTURE.pdf
I have been trying for quite a while today to get the hang of bash, but all the solutions I've looked at have really complicated syntax that I don't understand. I know you can extract the date from the file name and rename the file accordingly, but I really want to learn how to extract substrings using bash.
The best solution I've come up with, based on what I found here, is:
for j in *.pdf;
do
    length=`expr match "$j" 'JUN[E, ] ??'`
    ind=`expr index "$j" 'JUN' `
    mv $j ${j:ind:length}
done
but this solution just gives me many syntax errors without any indication of what I'm doing wrong. I've also tried using cut, rename, and some other "standard" tools, but I'm really struggling.
Upvotes: 0
Views: 496
Reputation: 212218
expr is useful, but probably not the easiest way to do this. (The match command only matches at the start of the string.) I am assuming that you are extracting the date so that it matches the name of an existing directory; change the test as needed if that assumption is incorrect. Try sed:
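for i in *.pdf; do
    # Keep only the "JUN 14" / "JUNE 22" part of the file name.
    d=$(echo "$i" | sed -E 's/.*(JUNE? [0-9]+).*/\1/')
    # Move the file only if a directory with that name exists.
    test -d "$d" && mv -i "$i" "$d"
done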
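Since the question is specifically about extracting substrings in bash itself, here is a minimal sketch that does the same extraction with the built-in [[ =~ ]] operator and the BASH_REMATCH array, and also shows the ${var:offset:length} expansion from the question. The function name extract_date and the "$d.pdf" target name are my own assumptions for illustration. Note that the offset in ${var:offset:length} is zero-based, whereas expr index counts from 1.

```bash
#!/usr/bin/env bash
# extract_date is a hypothetical helper name, not part of any standard tool.
extract_date() {
    local name=$1
    # Keep the regex in a variable so the space inside it needs no quoting.
    local re='(JUNE? [0-9]+)'
    # Unlike expr match, =~ finds the pattern anywhere in the string.
    if [[ $name =~ $re ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1
    fi
}

# Substring expansion: zero-based offset 7, length 7.
name='SUMMER JUNE 22 LIVELIHOODS.pdf'
printf '%s\n' "${name:7:7}"    # prints "JUNE 22"

# Dry run: print the mv command instead of executing it.
# The "$d.pdf" target is an assumption about the desired new name.
for f in *.pdf; do
    d=$(extract_date "$f") || continue
    echo mv -i -- "$f" "$d.pdf"
done
```

Remove the echo once the printed commands look right. Quoting "$f" and "$d.pdf" matters here because the file names contain spaces.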
Upvotes: 1
Reputation: 249123
I'd use Python:
#!/usr/bin/env python3
import os
import re
import sys

# Capture the month, the day number, and the file extension.
regex = re.compile(r'.*(JAN|FEB|MAR|APR|MAY|JUNE?|JULY?|AUG|SEP|OCT|NOV|DEC) (\d+).*(\.\w+)$')

for filename in sys.argv[1:]:
    matches = regex.match(filename)
    if matches:
        # The groups are joined with no separator, e.g. "JUNE22.pdf".
        newname = ''.join(matches.groups())
        print("rename -", filename, "- to -", newname)
        #os.rename(filename, newname)
Uncomment the last line once you've tested it and seen the printed results are what you want.
Upvotes: 1