pacomet

Reputation: 5141

Awk, end of line in script

I'm just making my first attempts with awk and have a question that may be simple. I am trying to list a directory and extract some information from the listing based on a string. The bash script I'm trying is:

 ls *.hdf > temporary.list
 nom2=`awk 'BEGIN {FS = "." } ; { $1 ~ /'$year$month'/ } { print $2 }' temporary.list `
 file=$year$month.$nom2.hdf 
 file2=$year$month.hdf

where year and month change in a for loop (1981 to 1985 and 01 to 12). The temporary.list file is composed of 12 lines like:

198201.s04m1pfv51-bsst.hdf
198202.s04m1pfv51-bsst.hdf
198203.s04m1pfv51-bsst.hdf
198204.s04m1pfv51-bsst.hdf
198205.s04m1pfv51-bsst.hdf
198206.s04m1pfv51-bsst.hdf
198207.s04m1pfv51-bsst.hdf
198208.s04m1pfv51-bsst.hdf
198209.s04m1pfv51-bsst.hdf
198210.s04m1pfv51-bsst.hdf
198211.s04m1pfv51-bsst.hdf
198212.s04m1pfv51-bsst.hdf

I want to select files depending on year-month. The problem, I suppose, is that my awk statement does not seem to treat different lines as different records. The output of the script is:

nom2 = h s04m1pfv51-bsst h s04m1pfv51-bsst h s04m1pfv51-bsst h
s04m1pfv51-bsst h s04m1pfv51-bsst h s04m1pfv51-bsst h s04m1pfv51-bsst
s04m1pfv51-bsst s04m1pfv51-bsst s04m1pfv51-bsst s04m1pfv51-bsst
s04m1pfv51-bsst 

file = 198201.h s04m1pfv51-bsst h s04m1pfv51-bsst h
s04m1pfv51-bsst h s04m1pfv51-bsst h s04m1pfv51-bsst h s04m1pfv51-bsst
h s04m1pfv51-bsst s04m1pfv51-bsst s04m1pfv51-bsst s04m1pfv51-bsst
s04m1pfv51-bsst s04m1pfv51-bsst.hdf 

file2= 198201.hdf

Maybe it is some simple syntax error; any help would be appreciated.

Thanks

Upvotes: 0

Views: 787

Answers (2)

ghoti

Reputation: 46856

It's a bad habit to parse lists of files the way you're doing it in your bash script, since it's incompatible with a number of special characters that might appear in a filename. Like rules of grammar, you should break the rules only if you know them well. :) A for loop is a better construct for handling files:

#!/bin/bash

year=1982
month=9

for filename in $(printf "%04d%02d" "$year" "$month").*.hdf; do
  nom2=${filename#*.}
  nom2=${nom2%.*}
  file2=${filename%%.*}.hdf
  printf "file=%s\nnom2=%s\nfile2=%s\n\n" "$filename" "$nom2" "$file2"
done

Is that what you're looking for? Note that parameter expansion using % and # is specified by POSIX, so it works in any POSIX-compatible shell and not just bash, which makes it very portable.
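To make the expansions used in the loop above concrete, here is a minimal sketch on one filename from the question. The variable names stem and prefix are only for illustration.

```shell
filename=198209.s04m1pfv51-bsst.hdf

stem=${filename#*.}      # drop the shortest prefix matching "*."  (198209.)
stem=${stem%.*}          # drop the shortest suffix matching ".*"  (.hdf)
prefix=${filename%%.*}   # drop the longest suffix matching ".*"   (everything from the first dot)

printf 'stem=%s\nprefix=%s\n' "$stem" "$prefix"
```

A single # or % removes the shortest match, while the doubled forms ## and %% remove the longest one.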

If you REALLY want to use awk, you've still got lots of options.

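#!/bin/bash

year=1982
month=9

# Zero-pad once so the glob and the awk comparison use the same string
ym=$(printf "%04d%02d" "$year" "$month")

for filename in "$ym".*.hdf; do
  # Split the name on "." and print the middle part if the first part matches
  nom2=$(awk -v ym="$ym" -v f="$filename" 'BEGIN { split(f, a, "."); if (a[1] == ym) print a[2] }')
  file2="$ym.hdf"
  printf "file=%s\nnom2=%s\nfile2=%s\n\n" "$filename" "$nom2" "$file2"
done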

Note that using printf to format the date allows you to handle single-digit months with a leading zero, with minimal effort.
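One caveat worth checking, since the question says the months run from 01 to 12: bash's printf reads a %d argument with a leading zero as octal, so 08 and 09 are rejected as invalid numbers. A minimal sketch of one workaround, assuming month has at most one leading zero, is to strip it with ${month#0} before formatting (the loop values here are just examples):

```shell
year=1982

for month in 9 08; do
  # ${month#0} removes one leading zero, so "08" becomes "8" before %02d pads it again
  ym=$(printf '%04d%02d' "$year" "${month#0}")
  echo "$ym"
done
```

Both an unpadded 9 and an already padded 08 then end up as two-digit months.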

Upvotes: 1

c00kiemon5ter

Reputation: 17654

You need to give awk the variables you need it to know about.
To pass a variable to awk, use -v for each one.

awk -v y="$year" -v m="$month" 'BEGIN { FS = "." } $1 == y m { print $2 }' file

awk variables can then be used directly, no $ needed.
As with print, the space between them is ignored, so y m simply concatenates the two values; a literal space would have to be quoted. As written, the rule checks whether the first field ($1) exactly equals (==) 'y m', which expands to '${year}${month}'. If it does, the second field ($2) is printed.
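As a quick check of the command above, here is a small sketch that feeds it two lines in the same shape as temporary.list through a pipe instead of a file. The year and month values are picked for illustration.

```shell
year=1982
month=03

# Only the line whose first field equals "198203" should produce output
nom2=$(printf '%s\n' 198202.s04m1pfv51-bsst.hdf 198203.s04m1pfv51-bsst.hdf |
  awk -v y="$year" -v m="$month" 'BEGIN { FS = "." } $1 == y m { print $2 }')

echo "nom2=$nom2"
```

Because only one line matches, nom2 holds a single name rather than the long run of values shown in the question.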


Keep in mind that awk rules have the form

condition { action [; action ..] }

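Note that there are no curly braces around the condition.
You also don't need ; between blocks, only between actions, though they don't hurt either.
So { $1 ~ /'$year$month'/ } does nothing as written: inside braces the match is an action whose result is discarded, and the following { print $2 } then runs for every line.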
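The difference is easy to see side by side. This is only a sketch: the two input lines use placeholder names (a and b) rather than the real files, and the variables wrong and right are made up for the comparison.

```shell
input='198201.a.hdf
198202.b.hdf'

# Test wrapped in braces: it is evaluated and thrown away, so every line prints $2
wrong=$(printf '%s\n' "$input" | awk -F. '{ $1 ~ /198202/ } { print $2 }')

# Test used as a pattern: the action runs only for lines where it is true
right=$(printf '%s\n' "$input" | awk -F. '$1 ~ /198202/ { print $2 }')

printf 'wrong:\n%s\nright:\n%s\n' "$wrong" "$right"
```

The first form prints both names and the second prints only b.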


Having said all that, I would go with pure Bash for what you're doing:

while IFS='.' read -r ym f e
do 
    printf '%8s: %s\n' "year"  "${ym%??}"   \
                       "month" "${ym#????}" \
                       "file"  "$f"         \
                       "ext"   "$e"
done < file

Upvotes: 1
