Awesim

Reputation: 89

Sed multiple conditions regex matching

I am writing a bash script that takes a txt file as input, deletes every line containing a dash ("-") or any digit (anywhere in the line), and writes the result to a new file.

I tried several approaches, but none of them worked.

I'm stuck trying to figure out the correct regex for "delete all lines containing a number OR a dash".

Here's my code:

wget -q awsfile1.csv.zip                      # downloads file
unzip "awsfile1".zip                          # unzips it
cut -d, -f 2 file1.csv > file2.csv            # keeps only the 2nd comma-separated field
sort file2.csv > file2.txt                    # sorts the lines and writes them to a text file
printf "Removing lines containing numbers.\n" # prints output
sed 's/[0-9][0-9]*/Number/g'  file2.txt > file2-b.txt  # doesn't do anything, file is empty on the output

Thanks.

Upvotes: 0

Views: 294

Answers (2)

potong

Reputation: 58420

This might work for you (GNU sed):

sed -E 'h;s/\S+/\n&\n/2;/\n.*[-0-9].*\n/d;x' file

Copy the current line to the hold space (h), wrap the 2nd whitespace-delimited field in newlines, and delete the line if that field contains a digit or a dash; otherwise swap the original line back in (x) so it is printed unchanged.

N.B. This prints the original line. If you only want the 2nd field, use:

sed -E 's/\S+/\n&\n/2;s/.*\n(.*)\n.*/\1/;/[-0-9]/d' file
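
To see how the two commands above differ, here is a minimal sketch on made-up sample input. The file name sample.txt and its contents are assumptions for illustration only, and GNU sed is assumed for \S and \n. The last command is not from the answer: it is a plain whole-line variant, which may be closer to the "anywhere in the line" wording of the question.

```shell
# Hypothetical sample data: whitespace-separated, 2nd field is the one being tested.
printf '%s\n' 'x apple y' 'a1 pear b' 'c re-do d' 'e item7 f' > sample.txt

# Keep whole lines whose 2nd field has no digit or dash
# (the digit in the 1st field of "a1 pear b" is ignored).
keep_lines=$(sed -E 'h;s/\S+/\n&\n/2;/\n.*[-0-9].*\n/d;x' sample.txt)

# Keep only the 2nd field of those same lines.
keep_fields=$(sed -E 's/\S+/\n&\n/2;s/.*\n(.*)\n.*/\1/;/[-0-9]/d' sample.txt)

# Whole-line variant: drop any line with a digit or dash anywhere in it.
keep_any=$(sed '/[-0-9]/d' sample.txt)

printf '%s\n' "$keep_lines" "$keep_fields" "$keep_any"
```

With this input, the first command keeps "x apple y" and "a1 pear b", the second prints "apple" and "pear", and the whole-line variant keeps only "x apple y".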

Upvotes: 1

karakfa

Reputation: 67497

You can combine the cut and the filter into a single awk script and sort afterwards:

... get and unzip file
$ awk -F, '$2!~/[-0-9]/{print $2}' file | sort

This prints field 2 only if it contains no digits and no hyphen.
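
A small worked example of the awk filter above, as a sketch: the file name sample.csv and its rows are invented for illustration and stand in for the unzipped CSV from the question.

```shell
# Hypothetical CSV: only column 2 is tested and printed.
printf '%s\n' 'id1,pear,x' 'id2,re-do,y' 'id3,item7,z' 'id4,apple,w' > sample.csv

# Print column 2 when it contains neither a digit nor a hyphen, then sort the result.
result=$(awk -F, '$2!~/[-0-9]/{print $2}' sample.csv | sort)

printf '%s\n' "$result"
```

Here "re-do" and "item7" are dropped, while the digits in column 1 have no effect because only $2 is matched; the output is "apple" followed by "pear".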

Upvotes: 2
