Reputation: 1117
I have a file with a list of IDs, as follows:
OG1: apple|fruits_1 cucumber|veg_1 apple|fruits_1 carrot|veg_2
OG2: apple|fruits_5 cucumber|veg_1 apple|fruits_1 pineapple|fruit_2
OG3: cucumber|veg_1 apple|fruits_9 carrot|veg_2
OG4: apple|fruits_3 cucumber|veg_1 apple|fruits_4 pineapple|fruit_7
OG5: pineapple|fruit_2 pineapple|fruit_2 apple|fruits_1 pineapple|fruit_2
OG6: apple|fruits_5 apple|fruits_1 apple|fruits_6 apple|fruits_7
Now I want to extract the first occurrence of apple| on each line, to give me:
OG1: apple|fruits_1
OG2: apple|fruits_5
OG3: apple|fruits_9
OG4: apple|fruits_3
OG5: apple|fruits_1
OG6: apple|fruits_5
I tried
grep -w -m 1 "apple" sample.txt
which only gives me
OG1: apple|fruits_1 cucumber|veg_1 apple|fruits_1 carrot|veg_2
Upvotes: 2
Views: 192
Reputation: 10039
Sed version
sed 's/\([[:blank:]]apple|[^[:blank:]]*\).*/\1/;s/:.*[[:blank:]]apple/: apple/;/apple/!d' YourFile
# assuming the blanks are plain spaces
sed 's/\( apple|[^ ]*\).*/\1/;s/:.* apple/: apple/;/apple/!d' YourFile
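As a quick check, the space-only variant can be run against the sample data from the question. This is only a minimal sketch: it assumes fields are separated by single spaces, and the temporary file and the `result` variable are illustrative names, not part of the original command.

```shell
# Recreate the question's sample data in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
OG1: apple|fruits_1 cucumber|veg_1 apple|fruits_1 carrot|veg_2
OG2: apple|fruits_5 cucumber|veg_1 apple|fruits_1 pineapple|fruit_2
OG3: cucumber|veg_1 apple|fruits_9 carrot|veg_2
OG4: apple|fruits_3 cucumber|veg_1 apple|fruits_4 pineapple|fruit_7
OG5: pineapple|fruit_2 pineapple|fruit_2 apple|fruits_1 pineapple|fruit_2
OG6: apple|fruits_5 apple|fruits_1 apple|fruits_6 apple|fruits_7
EOF

# 1st s///  : keep the first " apple|..." field and cut everything after it.
# 2nd s///  : drop whatever sits between the "OGn:" label and that field.
# /apple/!d : delete lines that do not contain "apple" at all.
result=$(sed 's/\( apple|[^ ]*\).*/\1/;s/:.* apple/: apple/;/apple/!d' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

One caveat: the final /apple/!d also keeps a line whose only match is pineapple|, because that string contains apple. Tightening it to / apple|/!d should avoid that, if such lines can occur in the input.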
Upvotes: 1
Reputation: 31915
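If awk is okay for you:
Save the input lines into a file named sample.csv.
awk '{for(x=1;x<=NF;x++){if(substr($x,1,6)=="apple|"){print $1, $x; next}}}' sample.csv
The loop checks each field to see whether substr($x, 1, 6) equals "apple|". awk strings are indexed from 1; a start index of 0 is not portable and in some awk implementations returns only the first five characters, so the comparison would fail. If the field matches, print $1, $x prints the line ID and the matching field, and next skips the remaining fields of the current line.
Output: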
OG1: apple|fruits_1
OG2: apple|fruits_5
OG3: apple|fruits_9
OG4: apple|fruits_3
OG5: apple|fruits_1
OG6: apple|fruits_5
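A variation on the same idea, as a rough sketch: index($x, "apple|") == 1 tests whether a field starts with apple| without relying on substr offsets, so pineapple| fields are skipped (there the match position is 5, not 1). The temporary file and the `result` variable are illustrative assumptions, and fields are assumed to be whitespace-separated.

```shell
# Recreate the question's sample data in a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
OG1: apple|fruits_1 cucumber|veg_1 apple|fruits_1 carrot|veg_2
OG2: apple|fruits_5 cucumber|veg_1 apple|fruits_1 pineapple|fruit_2
OG3: cucumber|veg_1 apple|fruits_9 carrot|veg_2
OG4: apple|fruits_3 cucumber|veg_1 apple|fruits_4 pineapple|fruit_7
OG5: pineapple|fruit_2 pineapple|fruit_2 apple|fruits_1 pineapple|fruit_2
OG6: apple|fruits_5 apple|fruits_1 apple|fruits_6 apple|fruits_7
EOF

result=$(awk '{
  for (x = 2; x <= NF; x++) {          # field 1 is the "OGn:" label
    if (index($x, "apple|") == 1) {    # field begins with "apple|"
      print $1, $x                     # label plus the first matching field
      next                             # ignore the rest of this line
    }
  }
}' "$sample")
printf '%s\n' "$result"

rm -f "$sample"
```

Lines with no apple| field print nothing, since the loop finishes without reaching print.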
Upvotes: 3