Malyk

Reputation: 193

Extract part of a file name in bash

I have a folder with many files whose names follow a pattern: a string prefix followed by a date and time:

BOS_CRM_SUS_20130101_10-00-10.csv (3 strings before date)
SEL_DMD_20141224_10-00-11.csv (2 strings before date)
SEL_DMD_SOUS_20141224_10-00-10.csv (3 strings before date)

I want to loop through the folder, extract only the part of each name before the date, and write it to a file.

Output
BOS_CRM_SUS_
SEL_DMD_
SEL_DMD_SOUS_

This is my script, but it is not working:

#!/bin/bash

# script variables
FOLDER=/app/list/l088app5304d1/socles/Data/LEMREC/infa_shared/Shell/Check_Header_T24/

LOG_FILE=/app/list/l088app5304d1/socles/Data/LEMREC/infa_shared/Shell/Check_Header_T24/log

echo "Starting the programme at:  $(date)" >> $LOG_FILE

# Getting part of the file name from FOLDER
for file in `ls $FOLDER/*.csv`
do
    mv "${file}" "${file/date +%Y%m%d HH:MM:SS}" 2>&1 | tee -a $LOG_FILE
done #> $LOG_FILE

Upvotes: 1

Views: 6637

Answers (5)

Hackaholic

Reputation: 19733

Using grep (the -P and -o options need GNU grep):

ls *.csv | grep -Po "^([A-Za-z]+_)+"

output:

BOS_CRM_SUS_
SEL_DMD_
SEL_DMD_SOUS_
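
As an alternative sketch that avoids piping ls into grep, the same regular expression can be applied with bash's built-in [[ =~ ]] operator. This is a different technique from the grep command above, and the function name prefix_of is hypothetical.

```shell
#!/bin/bash
# prefix_of NAME: print the leading run of "LETTERS_" groups in NAME.
# Hypothetical helper; it prints nothing if NAME has no such prefix.
prefix_of() {
    local re='^([A-Za-z]+_)+'
    if [[ $1 =~ $re ]]; then
        # BASH_REMATCH[0] holds the whole matched text.
        printf '%s\n' "${BASH_REMATCH[0]}"
    fi
}
```

It could be called in a loop such as: for f in *.csv; do prefix_of "$f"; done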

Upvotes: 1

Barmar

Reputation: 780663

When you use ${var/pattern/replace}, the pattern must be a filename glob, not a command to execute.

Instead of using the substitution operator, use the pattern removal operator:

mv "${file}" "${file%_*_*-*-*.csv}.csv"

% removes the shortest match of the pattern at the end of the variable, so this pattern matches just the date and time part of the filename. Both underscores are needed: with only one, the shortest match would be the time part alone.
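
To make the shortest-match rule concrete, here is a small sketch using one of the file names from the question. The variable names are illustrative only. The number of underscores in the pattern decides how much of the suffix is removed.

```shell
#!/bin/bash
# Sample name taken from the question.
file="BOS_CRM_SUS_20130101_10-00-10.csv"

# One "_" in the pattern: the shortest matching suffix is "_10-00-10.csv".
time_only="${file%_*-*-*.csv}"

# Two "_" in the pattern: the shortest matching suffix is
# "_20130101_10-00-10.csv".
date_and_time="${file%_*_*-*-*.csv}"

echo "$time_only"        # BOS_CRM_SUS_20130101
echo "$date_and_time"    # BOS_CRM_SUS
```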

Upvotes: 3

rici

Reputation: 241671

The substitution:

"${file/date +%Y%m%d HH:MM:SS}"

is unlikely to do anything, because it doesn't execute date +%Y%m%d HH:MM:SS. It just treats it as a pattern to search for, and it's not going to be found.

If you did execute the command, though, you would get the current date and time, which is also (apparently) not what you find in the filename.

If that pattern is precise, then you can do the following:

echo "${file%????????_??-??-??.csv}" >> "$LOG_FILE"
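
Put into a loop, that expansion might look like the sketch below. It assumes every name ends in exactly YYYYMMDD_HH-MM-SS.csv; a name that does not match is printed unchanged. The function name list_prefixes is hypothetical, and the loop uses a glob rather than the output of ls.

```shell
#!/bin/bash
# list_prefixes DIR: print the part of each *.csv name in DIR that
# precedes the YYYYMMDD_HH-MM-SS stamp. Hypothetical helper.
list_prefixes() {
    local dir=$1 file name
    for file in "$dir"/*.csv; do
        [ -e "$file" ] || continue     # the glob matched nothing
        name=${file##*/}               # drop the directory part
        printf '%s\n' "${name%????????_??-??-??.csv}"
    done
}
```

With the variables from the question, it could be used as: list_prefixes "$FOLDER" >> "$LOG_FILE"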

Upvotes: 2

fredtantini

Reputation: 16556

Assuming you won't have digits in the first part, you could use:

$ for i in *csv; do str=$(echo "$i" | sed -r 's/[0-9]+.*//'); echo "$str"; done
BOS_CRM_SUS_
SEL_DMD_
SEL_DMD_SOUS_

Or with parameter substitution:

$ for i in *csv; do echo "${i%_*_*}_"; done
BOS_CRM_SUS_
SEL_DMD_
SEL_DMD_SOUS_
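
The single % matters here. As a small sketch (variable names are illustrative), compare it with %%, which removes the longest matching suffix instead of the shortest:

```shell
#!/bin/bash
# Sample name taken from the question.
i="SEL_DMD_SOUS_20141224_10-00-10.csv"

# % strips the shortest suffix matching "_*_*": "_20141224_10-00-10.csv".
shortest="${i%_*_*}_"

# %% strips the longest such suffix, starting at the first underscore.
longest="${i%%_*_*}_"

echo "$shortest"    # SEL_DMD_SOUS_
echo "$longest"     # SEL_
```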

Upvotes: 3

Pradhan

Reputation: 16737

Use sed with extended-regex and groups to achieve this.

sed -r 's/(.*)[0-9]{8}_[0-9][0-9]-[0-9][0-9]-[0-9][0-9]\.csv$/\1/' filelist

where filelist is a file with all the names you care about. Of course, this is just a placeholder because I don't know how you are going to list all eligible files. If a glob will do, for example, you can do

ls mydir/*.csv | sed -r 's/(.*)[0-9]{8}_[0-9][0-9]-[0-9][0-9]-[0-9][0-9]\.csv$/\1/'
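
A possible variation, sketched under the assumption that the stamp is always YYYYMMDD_HH-MM-SS: wrap the sed call in a filter function and feed it names from a glob instead of ls. The function name strip_stamp is hypothetical. It uses sed -E, which both GNU and BSD sed accept, and deletes the stamp directly rather than capturing the prefix in a group.

```shell
#!/bin/bash
# strip_stamp: read file names on stdin and delete a trailing
# YYYYMMDD_HH-MM-SS.csv stamp from each. Hypothetical helper.
strip_stamp() {
    sed -E 's/[0-9]{8}_[0-9]{2}-[0-9]{2}-[0-9]{2}\.csv$//'
}
```

For example: (cd mydir && printf '%s\n' *.csv) | strip_stamp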

Upvotes: 3
