Reputation: 68
I have a script that needs to read ALL fields in a column and validate them before it moves on to the second column. For example:
Name, City
Joe, Orlando
Sam, Copper Town
Mike, Atlanta
So the script should read the entire Name column (top to bottom) and validate it for nulls before it moves to the second column. It should NOT read line by line. Please give me some pointers on how to modify or correct it.
# Read all files. No file has spaces in its name.
for file in /export/home/*.csv; do
    # init variables before processing a new file
    date_regex='^(0[1-9]|1[012])[- /.](0[1-9]|[12][0-9]|3[01])[- /.](19|20)[0-9][0-9]$'
    FILESTATUS=GOOD
    FIRSTROW=true
    # process the file one line at a time, splitting each line on the
    # Internal Field Separator, a comma; reading from a redirect (not a
    # pipe) keeps FILESTATUS visible after the loop
    while IFS=, read -r field1 field2 field3 field4; do
        # Skip first line, the header row
        if [ "${FIRSTROW}" = "true" ]; then
            FIRSTROW=false
            # skip processing of this line, continue with next record
            continue
        fi
        # different validations
        if [[ -z "$field1" ]]; then
            FILESTATUS=BAD
            # Stop inner loop
            break
        fi
        # field2 must be non-empty and look like a date
        if [[ -z "$field2" || ! "$field2" =~ $date_regex ]]; then
            FILESTATUS=BAD
            # Stop inner loop
            break
        fi
        # field3 must be either S or E
        if [[ "$field3" != "S" && "$field3" != "E" ]]; then
            FILESTATUS=BAD
            # Stop inner loop
            break
        fi
        # field4 must be 9 to 11 characters long
        if [[ -z "$field4" ]] || (( ${#field4} < 9 || ${#field4} > 11 )); then
            FILESTATUS=BAD
            # Stop inner loop
            break
        fi
    done < "${file}"
    if [ "${FILESTATUS}" = "GOOD" ]; then
        mv "${file}" /export/home/goodFile
    else
        mv "${file}" /export/home/badFile
    fi
done
Upvotes: 0
Views: 75
Reputation: 246942
This awk script reads the whole file into memory; you can then do your verification, column by column, in the END block:
for file in /export/home/*.csv ; do
awk -F', ' '
# skip the header and blank lines
NR == 1 || NF == 0 {next}
# save the data
# save the data (bump the row counter once per line, not once per field)
{ nr++; for (i=1; i<=NF; i++) data[nr,i] = $i }
END {
status = "OK"
# verify column 1
for (lineno=1; lineno <= nr; lineno++) {
if (length(data[lineno,1]) == 0) {
status = "BAD"
break
}
}
printf "file: %s, verify column 1, status: %s\n", FILENAME, status
# verify other columns ...
}
' "$file"
done
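The "verify other columns" step above could be filled in along these lines. This is only a sketch under some assumptions: the file has the four columns the question's script reads, the separator is a plain comma with optional surrounding blanks (trimmed here), and the per-column rules are the ones the question's script appears to intend. The function name `validate_columns` and the output format are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: validate a CSV file column by column.
# Prints "OK" or "BAD column N row M"; returns 0 or 1 accordingly.
validate_columns() {
    awk -F',' '
    # skip the header and blank lines
    NR == 1 || NF == 0 { next }
    # store every cell, trimming surrounding blanks (assumption)
    {
        nr++
        for (i = 1; i <= 4; i++) {
            cell = $i
            gsub(/^[ \t]+|[ \t]+$/, "", cell)
            data[nr, i] = cell
        }
    }
    END {
        # outer loop over columns, inner loop over rows, so a whole
        # column is checked before the next one is touched
        for (col = 1; col <= 4; col++)
            for (row = 1; row <= nr; row++)
                if (!valid(col, data[row, col])) {
                    printf "BAD column %d row %d\n", col, row
                    exit 1
                }
        print "OK"
    }
    # rules assumed from the script in the question
    function valid(col, v) {
        if (v == "") return 0
        if (col == 2) return v ~ /^(0[1-9]|1[012])[- \/.](0[1-9]|[12][0-9]|3[01])[- \/.](19|20)[0-9][0-9]$/
        if (col == 3) return v ~ /^[SE]$/
        if (col == 4) return length(v) >= 9 && length(v) <= 11
        return 1
    }
    ' "$1"
}
```

Row numbers count data rows only (the header is not counted). Because the column loop is the outer one, an empty name in a later row is reported before a bad value in column 3 of an earlier row.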
Upvotes: 1
Reputation: 62399
Here's an attempt at an awk script that does what it seems like the original script is trying to do:
#!/usr/bin/awk -f
# fields separated by commas
BEGIN { FS = "," }
# skip first line
NR == 1 { next }
# check for empty fields
$1 == "" || $2 == "" || $3 == "" || $4 == "" { exit 1 }
# check for "valid" date (urk... doing this with a regex is horrid)
# it would be better to split it into components and validate each sub-field,
# but I'll leave that as a learning exercise for the reader
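# (the slash inside the brackets is escaped so that every awk parses the regex literal the same way)
$2 !~ /^(0[1-9]|1[012])[- \/.](0[1-9]|[12][0-9]|3[01])[- \/.](19|20)[0-9][0-9]$/ { exit 1 }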
# third field should be either S or E
$3 !~ /^[SE]$/ { exit 1 }
# check the length of the fourth field is between 9 and 11
length($4) < 9 || length($4) > 11 { exit 1 }
# if no rule above called exit, awk ends with status 0 by default
# (an explicit "exit 0" in an END block would override the earlier "exit 1")
Save that in e.g. validate.awk, and set the executable bit on it (chmod +x validate.awk), then you can simply do:
if ./validate.awk < somefile.txt
then
mv somefile.txt goodfiles/
else
mv somefile.txt badfiles/
fi
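As for the date check that the comment in the script leaves as an exercise, one possible way to split the field into components and validate each one is sketched below. It assumes the MM, DD, YYYY order and the separators from the regex above, and the 1900 to 2099 year range that the regex implies. The function name `is_valid_date` is hypothetical, and passing the value with `-v` means awk will interpret any backslash escapes in it.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if $1 is a real calendar date
# written as MM<sep>DD<sep>YYYY, where <sep> is one of "-", " ", "/", ".".
is_valid_date() {
    awk -v d="$1" '
    BEGIN {
        # split on any of the separators the original regex accepts
        n = split(d, p, /[- \/.]/)
        if (n != 3) exit 1
        # each component must be all digits and of the expected width
        if (p[1] !~ /^[0-9][0-9]$/) exit 1
        if (p[2] !~ /^[0-9][0-9]$/) exit 1
        if (p[3] !~ /^[0-9][0-9][0-9][0-9]$/) exit 1
        m = p[1] + 0; day = p[2] + 0; y = p[3] + 0
        if (m < 1 || m > 12 || y < 1900 || y > 2099) exit 1
        # days per month, with a leap-year adjustment for February
        split("31 28 31 30 31 30 31 31 30 31 30 31", dim, " ")
        if (m == 2 && ((y % 4 == 0 && y % 100 != 0) || y % 400 == 0)) dim[2] = 29
        if (day < 1 || day > dim[m]) exit 1
        exit 0
    }'
}
```

Unlike the regex, this rejects dates such as 02/30/2020 or 02/29/2019, which have the right shape but do not exist. The same BEGIN logic could be moved into a function inside validate.awk instead of a separate shell helper.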
Upvotes: 1