Reputation: 89
I am writing a bash script that takes a txt file as input, deletes every line containing a dash ("-") or any digit (anywhere in the line), and writes the result to a new file.
I tried multiple approaches without success.
I'm stuck on the correct regex for "delete all lines containing a number OR a dash"; I can't make it work.
Here's my code:
wget -q awsfile1.csv.zip # downloads file
unzip "awsfile1".zip # unzips it
cut -d, -f 2 file1.csv > file2.csv # cuts it
sort file2.csv > file2.txt # sorts the extracted column and writes it to a .txt file
printf "Removing lines containing numbers.\n" # prints output
sed 's/[0-9][0-9]*/Number/g' file2.txt > file2-b.txt # doesn't do anything, file is empty on the output
Thanks.
Upvotes: 0
Views: 294
Reputation: 58420
This might work for you (GNU sed):
sed -E 'h;s/\S+/\n&\n/2;/\n.*[-0-9].*\n/d;x' file
Copy the current line to the hold space, surround the 2nd whitespace-separated field with newlines, and delete the line if that field contains a dash or a digit; otherwise swap the original line back in.
N.B. This prints the original line. If you only want the 2nd field, use:
sed -E 's/\S+/\n&\n/2;s/.*\n(.*)\n.*/\1/;/[-0-9]/d' file
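To make the difference between the two commands concrete, here is a minimal sketch run on made-up input. The sample rows and the variable names are hypothetical, GNU sed is assumed (for `\S` and for `\n` in the replacement), and the fields are assumed to be separated by whitespace, since that is what `\S+` splits on.

```shell
# Hypothetical whitespace-separated sample rows (not from the original post).
sample=$(printf '%s\n' '1 alpha x' '2 beta-gamma y' '3 delta7 z' '4 epsilon w')

# Variant 1: drop rows whose 2nd field has a dash or digit, print the full original row.
full_lines=$(printf '%s\n' "$sample" | sed -E 'h;s/\S+/\n&\n/2;/\n.*[-0-9].*\n/d;x')

# Variant 2: same filter, but print only the 2nd field.
second_only=$(printf '%s\n' "$sample" | sed -E 's/\S+/\n&\n/2;s/.*\n(.*)\n.*/\1/;/[-0-9]/d')

printf 'full lines:\n%s\n' "$full_lines"
printf '2nd field only:\n%s\n' "$second_only"
```

The digit in the 1st field does not cause a deletion in either variant, because only the text between the two inserted newlines is tested.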
Upvotes: 1
Reputation: 67497
You can combine the cut and the filter into a single awk script and sort afterwards:
... get and unzip file
$ awk -F, '$2!~/[-0-9]/{print $2}' file | sort
This prints field 2 only if it contains no digit or hyphen.
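As a rough check, the sketch below runs the awk one-liner on made-up CSV rows and compares it with a step-by-step pipeline in the style of the question, where the filter is a plain `sed '/[-0-9]/d'` (a delete command rather than the `s/…/Number/g` substitution, which only rewrites digits and never removes lines). The sample data, the temporary file, and the variable names are assumptions for illustration.

```shell
# Hypothetical sample data standing in for file1.csv (not from the original post).
sample=$(mktemp)
printf '%s\n' '1,alpha' '2,beta-gamma' '3,delta7' '4,epsilon' '5,beta' > "$sample"

# One pass: keep field 2 only when it has no digit and no hyphen, then sort.
result_awk=$(awk -F, '$2 !~ /[-0-9]/ {print $2}' "$sample" | sort)

# Step by step: cut the column, sort it, then delete lines with a digit or dash.
result_steps=$(cut -d, -f2 "$sample" | sort | sed '/[-0-9]/d')

printf '%s\n' "$result_awk"
rm -f "$sample"
```

Both variables should end up holding the same three lines; the leading `-` inside `[-0-9]` is taken literally because it is the first character of the bracket expression.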
Upvotes: 2