trombonebraveheart

Reputation: 81

How to clean up multiple file names using bash?

I have a directory with ~250 .txt files in it. Each of these files has a name like this:

Abraham Lincoln [December 01, 1862].txt

George Washington [October 25, 1790].txt

etc...

However, these are terrible file names for reading into Python, and I want to iterate over all of them to change them to a more suitable format.

I've done similar things to change a single value that is shared across many files, but I can't wrap my head around how to iterate over these files and reformat their names while keeping the same information.

The ideal output would be something like

1862_12_01_abraham_lincoln.txt

1790_10_25_george_washington.txt

etc...

Upvotes: 1

Views: 454

Answers (3)

Anubis

Reputation: 7435

for file in *.txt; do
    # extract the name and the date from the filename with a regex match;
    # skip (rather than abort on) any file that does not fit the pattern
    [[ $file =~ (.*)\[(.*)\] ]] || { echo "invalid file $file" >&2; continue; }

    # format extracted strings and generate the new filename (relies on GNU date)
    formatted_date=$(date -d "${BASH_REMATCH[2]}" +"%Y_%m_%d")
    name="${BASH_REMATCH[1]// /_}"  # replace spaces in the name with underscores
    f="${formatted_date}_${name,,}" # convert name to lower-case and append it to date string
    new_filename="${f::-1}.txt"     # remove trailing underscore and add `.txt` extension

    # do what you need here
    echo "$new_filename"
    # mv -- "$file" "$new_filename"
done
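With ~250 files, two old names could in principle map to the same new name, and a plain mv would then overwrite the first file without warning. A minimal sketch of a guard, assuming a hypothetical helper called `safe_rename` (not part of the loop above) that refuses to overwrite an existing target:

```shell
# safe_rename SRC DST: hypothetical helper, not from the answer above.
# Renames SRC to DST only if DST does not exist yet.
safe_rename() {
    src=$1
    dst=$2
    if [ -e "$dst" ]; then
        # Refuse to clobber an existing file; report it and signal failure.
        echo "skipping '$src': '$dst' already exists" >&2
        return 1
    fi
    mv -- "$src" "$dst"
}
```

Inside the loop, `safe_rename "$file" "$new_filename"` could take the place of the commented-out mv.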

Upvotes: 1

tshiono

Reputation: 22012

Please try the straightforward (tedious) bash script:

#!/bin/bash

declare -A map=(["January"]="01" ["February"]="02" ["March"]="03" ["April"]="04" ["May"]="05" ["June"]="06" ["July"]="07" ["August"]="08" ["September"]="09" ["October"]="10" ["November"]="11" ["December"]="12")

pat='^([^[]+) \[([A-Za-z]+) ([0-9]+), ([0-9]+)]\.txt$'
for i in *.txt; do
    if [[ $i =~ $pat ]]; then
        newname="$(printf "%s_%s_%s_%s.txt" "${BASH_REMATCH[4]}" "${map["${BASH_REMATCH[2]}"]}"  "${BASH_REMATCH[3]}" "$(tr 'A-Z ' 'a-z_' <<< "${BASH_REMATCH[1]}")")"
        mv -- "$i" "$newname"
    fi
done
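Note that the associative array needs bash 4 or newer, and the day is copied through as written, so a name containing "December 1" would come out as `1862_12_1`. A sketch of an alternative for older shells, assuming a hypothetical function `format_date` (not from the script above) that maps the month with a `case` statement and zero-pads the day:

```shell
# format_date MONTH DAY YEAR: hypothetical helper, not from the script above.
# Prints YYYY_MM_DD; returns 1 if the month name is not recognised.
# Uses the global variables mm and dd, since plain sh has no `local`.
format_date() {
    case $1 in
        January)   mm=01 ;;
        February)  mm=02 ;;
        March)     mm=03 ;;
        April)     mm=04 ;;
        May)       mm=05 ;;
        June)      mm=06 ;;
        July)      mm=07 ;;
        August)    mm=08 ;;
        September) mm=09 ;;
        October)   mm=10 ;;
        November)  mm=11 ;;
        December)  mm=12 ;;
        *)         return 1 ;;
    esac
    dd=${2#0}   # drop one leading zero so printf does not read "08" or "09" as octal
    printf '%s_%s_%02d\n' "$3" "$mm" "$dd"
}
```

In the loop, `"$(format_date "${BASH_REMATCH[2]}" "${BASH_REMATCH[3]}" "${BASH_REMATCH[4]}")"` could then supply the date portion of the new name.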

Upvotes: 1

Kingsley

Reputation: 14906

I like to pull the filename apart, then put it back together.

Also, GNU date can parse the date, which is simpler than using sed or a big case statement to convert "October" to "10".

#! /usr/bin/bash

if [ "$1" == "" ] || [ "$1" == "--help" ]; then
    echo "Give a filename like \"Abraham Lincoln [December 01, 1862].txt\" as an argument"
    exit 2
fi

filename="$1"

# remove the brackets
filename=$(echo "$filename" | sed -e 's/[\[]//g;s/\]//g')

# cut out the name (assumes a two-word name)
namepart=$(echo "$filename" | awk '{ print $1" "$2 }')

# cut out the date
datepart=$(echo "$filename" | awk '{ print $3" "$4" "$5 }' | sed -e 's/\.txt//')

# format up the date (relies on GNU date)
datepart=$(date --date="$datepart" +"%Y_%m_%d")

# put it back together with underscores, in lower case, date first
final=$(echo "$datepart $namepart.txt" | tr 'A-Z' 'a-z' | sed -e 's/ /_/g')

echo mv \"$1\" \"$final\"

EDIT: converted to BASH, from Bourne shell.
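One caveat: the awk fields in the script take exactly two words as the name, so a three-word name would be split in the wrong place. A sketch of splitting at the bracket instead, using plain parameter expansion and assuming a hypothetical function `split_name_date` that sets `namepart` and `datepart` (the file name in the example is a made-up sample):

```shell
# split_name_date FILENAME: hypothetical helper, not from the script above.
# Sets the global variables namepart and datepart.
split_name_date() {
    base=${1%.txt}            # drop the .txt extension
    namepart=${base%% \[*}    # keep everything before " ["
    datepart=${base#*\[}      # keep everything after the first "["
    datepart=${datepart%\]}   # drop the closing "]"
}

split_name_date "Martin Van Buren [March 04, 1837].txt"
echo "$namepart"   # prints: Martin Van Buren
echo "$datepart"   # prints: March 04, 1837
```

The resulting `datepart` can be handed to `date --date=` exactly as in the script, and the name keeps all of its words.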

Upvotes: 0
