Reputation: 199
I have this input file
gb|KY798440.1|
gb|KY842329.1|
MG082893.1
MG173246.1
and I want to extract the characters between the "|" characters, or the full line if there is no "|". The desired output looks like this:
KY798440.1
KY842329.1
MG082893.1
MG173246.1
I wrote:
while IFS= read -r line; do
    if [[ $line == *\|* ]] ; then
        sed 's/.*\|\(.*\)\|.*/\1/' <<< $line >> output_file
    else echo $line >> output_file
    fi
done < input_file
Which gives me
empty line
empty line
MG082893.1
MG173246.1
(Note: "empty line" means an actual empty line; it doesn't literally write "empty line".)
The sed command works on a single example (i.e. sed 's/.*\|\(.*\)\|.*/\1/' <<< "gb|KY842329.1|" outputs KY842329.1), but within the loop it just prints an empty line. The else branch (echo $line >> output_file) seems to work.
Upvotes: 1
Views: 35
Reputation: 212634
You could do
sed '/|/s/[^|]*|\([^|]*\)|.*/\1/' input
or
awk 'NF>1 {print $2} NF < 2 { print $1}' FS=\| input
or
sed -e 's/[^|]*|//' -e 's/|.*//' input
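A hedged aside on why the loop in the question probably prints empty lines: assuming GNU sed, \| in a basic regular expression means alternation rather than a literal pipe, so the first alternative .* matches the whole line and \1 is empty. If you would rather keep the loop structure, a minimal sketch using only shell parameter expansion might look like this (the function name extract_ids is made up for illustration):

```shell
# Hypothetical helper: reads lines on stdin, prints the field between the
# first and second "|" or the whole line when it contains no "|".
extract_ids() {
    while IFS= read -r line; do
        case $line in
            *"|"*)
                rest=${line#*"|"}              # drop everything up to and including the first |
                printf '%s\n' "${rest%%"|"*}"  # drop everything from the next | onward
                ;;
            *)
                printf '%s\n' "$line"          # no pipe: keep the line unchanged
                ;;
        esac
    done
}
```

It would be called as extract_ids < input_file > output_file, which also avoids starting one sed process per line.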
Upvotes: 0
Reputation: 37464
Bare sed:
$ sed 's/^[^|]*|\||[^|]*$//g' file
Output:
KY798440.1
KY842329.1
MG082893.1
MG173246.1
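Note that this one-liner relies on \| meaning alternation in a basic regular expression, which is a GNU extension. A sketch of the same idea with extended regular expressions (sed -E, which both GNU and BSD sed accept), where | is alternation and \| is a literal pipe; the wrapper name strip_affixes is hypothetical:

```shell
# Hypothetical wrapper: strips a leading "prefix|" and a trailing "|suffix"
# from each line on stdin; lines without "|" pass through unchanged.
strip_affixes() {
    # ERE: first alternative removes everything up to the first |,
    # second alternative removes the last | and whatever follows it.
    sed -E 's/^[^|]*\||\|[^|]*$//g'
}
```

It would be used as strip_affixes < file.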
Upvotes: 2