Sigularity

Reputation: 967

How to remove spaces and specific characters from a string - awk

Below is the input.

!{ID=34, ID2=35}
> 
!{ID=99, ID2=23}
> 
!{ID=18, ID2=87}
< 

I am trying to produce the following result. That is, I want to remove the spaces and the '{' and '}' characters, and check whether the next line is '>' or '<'. The input above repeats in the same pattern. I also need to parse the '>' and '<' characters so that I can put the parsed string (YES or NO) into a database.

ID=34,ID=35#YES#NO
ID=99,ID=23#YES#NO
ID=18,ID=87#NO#YES

I thought I could use the 'sub' function to replace the spaces with an empty string, but the result shows:

1#YES#NO

Can you tell me what is wrong? If possible, please show me how to remove '{' and '}' as well. I would appreciate an awk script file version rather than a one-liner.

BEGIN {
VALUES       = ""    
L_EXIST = "NO"           
R_EXIST = "NO"           

}

/!/       { VALUES = gsub(" ", "", $0);
            getline;

            if ($1 == ">") L_EXIST = "YES";
            else if ($1 == "<") R_EXIST = "YES";

            print VALUES"#"L_EXIST"#"R_EXIST

           }

END {

}

Upvotes: 0

Views: 248

Answers (2)

Tom Fenech

Reputation: 74695

Given your sample input:

$ cat file
!{ID=34, ID2=35}
>
!{ID=99, ID2=23}
>
!{ID=18, ID2=87}
<

This script produces the desired output:

BEGIN { FS="[}{=, \n]+"; RS="!" }
NR > 1 { printf "ID=%d,ID=%d#%s\n", $3, $5, ($6==">"?"YES#NO":"NO#YES") }

The Field Separator is set to consume the spaces and other characters between the parts of the line that you're interested in. The Record Separator is set to !, so that each pair of lines is treated as a single record.

The first record is empty (the start of the first line, up to the first !), so we only process the ones after that. The output is constructed using printf, with a ternary to determine the last part (I assume that there are only two options, > or <).
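As for the original script: gsub returns the number of substitutions it made, not the modified string, which is why VALUES ended up as 1. It edits its target ($0 when no third argument is given) in place. Below is a minimal sketch that keeps the structure of the original script and fixes that, written as a shell snippet so it can be run as-is. The file names file and parse.awk are placeholders, and it assumes every ! line is followed by a > or < line.

```shell
# Sample input from the question ("file" is a placeholder name).
cat > file <<'EOF'
!{ID=34, ID2=35}
>
!{ID=99, ID2=23}
>
!{ID=18, ID2=87}
<
EOF

# parse.awk is a hypothetical name for the script file.
cat > parse.awk <<'EOF'
/^!/ {
    gsub(/[!{} ]/, "")        # strip "!", "{", "}" and spaces from $0 in place
    values = $0               # keep the cleaned line; gsub's return value is only a count
    l_exist = r_exist = "NO"  # reset both flags for every record

    # Read the marker line that follows; skip the record if there is none.
    if ((getline) > 0) {
        if ($1 == ">")      l_exist = "YES"
        else if ($1 == "<") r_exist = "YES"
        print values "#" l_exist "#" r_exist
    }
}
EOF

awk -f parse.awk file
```

Unlike the printf version above, this keeps the ID2 key exactly as it appears in the input, so the first line comes out as ID=34,ID2=35#YES#NO. Resetting the flags inside the rule matters: set only in BEGIN, a YES from an earlier record would carry over into later ones.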

Upvotes: 4

hek2mgl

Reputation: 158220

Let's say you have this input:

input.txt

!{ID=34, ID2=35}
!{ID=36, ID2=37}
>

You can use the following awk command

awk -F'[!{}, ]' 'NR>1{yn="NO";if($1==">")yn="YES";print l"#"yn}{l=$3","$5}' input.txt

to produce this output:

ID=34,ID2=35#NO
ID=36,ID2=37#YES
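Since the question asked for a script file rather than a one-liner, here is a sketch of the same command spread out with comments. The logic is meant to match the one-liner; the file name delayed.awk is a placeholder.

```shell
# Sample input from this answer.
cat > input.txt <<'EOF'
!{ID=34, ID2=35}
!{ID=36, ID2=37}
>
EOF

# delayed.awk is a hypothetical name for the script file.
cat > delayed.awk <<'EOF'
# Every single "!", "{", "}", "," or space is a separator, so a line like
# "!{ID=34, ID2=35}" has empty $1, $2 and $4, with the pairs in $3 and $5.
BEGIN { FS = "[!{}, ]" }

# From the second line on, finish the record saved from the previous line:
# YES if the current line is ">", otherwise NO.
NR > 1 {
    yn = ($1 == ">") ? "YES" : "NO"
    print l "#" yn
}

# Save this line's pairs so that the next line can complete them.
{ l = $3 "," $5 }
EOF

awk -f delayed.awk input.txt
```

Printing is delayed by one line because the YES/NO for a record is only known once the following line has been read. Note that a marker line also overwrites l (with just a comma), so this form fits input like the sample here; on input where every ! line is followed by its own marker, the line after each marker would print ,#NO.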

Upvotes: 1
