Reputation: 32306
Format:
[Headword]{}"UC(icl>restriction)"(Attributes);(gloss)
The testme.txt file has 2 lines:
[testme] {} "acetify" (V,lnk,CJNCT,AJ-V,VINT,VOO,VOO-CHNG,TMP,Vo) <H,0,0>;
[newtest] {} "acid-fast" (ADJ,DES,QUAL,TTSM) <H,0,0>;
The expected output is this:
testme = acetify
newtest = acid-fast
What I have achieved so far is:
cat testme.txt | sed 's/\[//g' | sed 's/]//g' | sed 's/{}/=/g' | sed 's/\"//'
testme = acetify" (V,lnk,CJNCT,AJ-V,VINT,VOO,VOO-CHNG,TMP,Vo) <H,0,0>;
newtest = acid-fast" (ADJ,DES,QUAL,TTSM) <H,0,0>;
How do I remove all the text from the second " to the end of the line?
Upvotes: 1
Views: 312
Reputation: 359905
Your whole sequence of multiple calls to sed can be replaced by:
sed 's/\[\([^]]*\)][^"]*"\([^"]*\).*/\1 = \2/' inputfile
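To see what each part of that expression matches, here is a small sketch that runs the two sample lines from the question through the same command. The function name extract_pairs and the inline sample data are illustrative only and are not part of the original answer.

```shell
# Hypothetical helper wrapping the one-line sed command.
#   \[            literal opening bracket
#   \([^]]*\)     group 1: everything up to the closing bracket (the headword)
#   ]             literal closing bracket
#   [^"]*"        skip ahead to, and past, the first double quote
#   \([^"]*\)     group 2: everything up to the second double quote
#   .*            the rest of the line, dropped by the replacement
extract_pairs() {
    sed 's/\[\([^]]*\)][^"]*"\([^"]*\).*/\1 = \2/'
}

# Sample input copied from the question.
result=$(printf '%s\n' \
    '[testme] {} "acetify" (V,lnk,CJNCT,AJ-V,VINT,VOO,VOO-CHNG,TMP,Vo) <H,0,0>;' \
    '[newtest] {} "acid-fast" (ADJ,DES,QUAL,TTSM) <H,0,0>;' | extract_pairs)

printf '%s\n' "$result"
```

Because the two captured groups are the only parts referenced in the replacement, everything else on the line disappears in a single pass.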
Upvotes: 1
Reputation: 342303
This is how you do it with awk instead of all those sed commands, which are unnecessary. What you want is field 1 and field 3. Use gsub() to remove the quotes and brackets:
$ awk '{gsub(/\"/,"",$3);gsub(/\]|\[/,"",$1);print $1" = "$3}' file
testme = acetify
newtest = acid-fast
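Splitting on whitespace works for the sample lines because neither quoted word contains a space. If that cannot be guaranteed (an assumption; the question does not show such a line), a sketch that splits on the double quote instead might look like this. The function name extract_by_quote and the "acid rain" test line are hypothetical.

```shell
# Hypothetical variant: use the double quote as the field separator, so
# $1 is everything before the first quote and $2 is the quoted word itself.
extract_by_quote() {
    awk -F'"' 'NF >= 3 {
        name = $1
        sub(/^\[/, "", name)     # drop the leading bracket
        sub(/\].*/, "", name)    # drop the closing bracket and what follows
        print name " = " $2
    }'
}

# The first line is from the question; the second is made up to show a
# quoted word that contains a space.
result=$(printf '%s\n' \
    '[testme] {} "acetify" (V,lnk,CJNCT,AJ-V,VINT,VOO,VOO-CHNG,TMP,Vo) <H,0,0>;' \
    '[multi] {} "acid rain" (N) <H,0,0>;' | extract_by_quote)

printf '%s\n' "$result"
```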
Upvotes: 1
Reputation: 131550
The whole process might be a little quicker with awk:
awk 'NF > 0 { print $1 " = " $3 }' testme.txt | tr -d '[]"'
Upvotes: 1
Reputation: 39763
Remove everything from the double-quote, space, open-parenthesis sequence " ( to the end of the line by adding one more step to your pipeline:
sed 's/" (.*//g'
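Put together with the steps from the question, the full chain could be sketched as a single sed invocation with several -e expressions. The function name strip_attributes is illustrative, and the opening bracket is written as \[ here because an unescaped [ starts a bracket expression in sed.

```shell
# Hypothetical helper: the question's substitutions plus the extra step,
# combined into one sed call instead of a chain of pipes.
strip_attributes() {
    sed -e 's/\[//g' \
        -e 's/]//g' \
        -e 's/{}/=/g' \
        -e 's/"//' \
        -e 's/" (.*//'
}

# Sample input copied from the question.
result=$(printf '%s\n' \
    '[testme] {} "acetify" (V,lnk,CJNCT,AJ-V,VINT,VOO,VOO-CHNG,TMP,Vo) <H,0,0>;' \
    '[newtest] {} "acid-fast" (ADJ,DES,QUAL,TTSM) <H,0,0>;' | strip_attributes)

printf '%s\n' "$result"
```

The fourth expression removes only the first double quote on each line, so the fifth one then matches the remaining quote followed by the space and parenthesis.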
Upvotes: 1