Paul

Reputation: 1117

Grep first occurrence in each line

I have a file with a list of IDs, as follows:

OG1: apple|fruits_1 cucumber|veg_1 apple|fruits_1  carrot|veg_2
OG2: apple|fruits_5 cucumber|veg_1 apple|fruits_1  pineapple|fruit_2
OG3: cucumber|veg_1 apple|fruits_9  carrot|veg_2
OG4: apple|fruits_3 cucumber|veg_1 apple|fruits_4  pineapple|fruit_7
OG5: pineapple|fruit_2 pineapple|fruit_2 apple|fruits_1 pineapple|fruit_2
OG6: apple|fruits_5 apple|fruits_1 apple|fruits_6  apple|fruits_7

Now, I want to extract the first occurrence of apple| in each line to give me

 OG1: apple|fruits_1
 OG2: apple|fruits_5
 OG3: apple|fruits_9
 OG4: apple|fruits_3
 OG5: apple|fruits_1
 OG6: apple|fruits_5

I tried

  grep -w -m 1 "apple" sample.txt

which only gives me

  OG1: apple|fruits_1 cucumber|veg_1 apple|fruits_1  carrot|veg_2

Upvotes: 2

Views: 192

Answers (2)

NeronLeVelu

Reputation: 10039

Sed version

sed 's/\([[:blank:]]apple|[^[:blank:]]*\).*/\1/;s/:.*[[:blank:]]apple/: apple/;/[[:blank:]]apple|/!d' YourFile

# assuming blanks are spaces
sed 's/\( apple|[^ ]*\).*/\1/;s/:.* apple/: apple/;/ apple|/!d' YourFile
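
To make the three sed commands easier to follow, here is a commented sketch of the same idea, wrapped in a shell function. The function name `first_apple` is made up for illustration, and the script assumes the label ends in a colon and that tokens are separated by blanks, as in the question's sample. The final filter tests for the blank-preceded `apple|` token, so lines that contain only `pineapple|...` are dropped.

```shell
#!/bin/sh
# Hypothetical helper: reads lines on stdin, prints "label: first apple|... token".
first_apple() {
    sed '
        # 1. Cut the line right after the first blank-preceded "apple|..." token.
        s/\([[:blank:]]apple|[^[:blank:]]*\).*/\1/
        # 2. Drop whatever sits between the label colon and that token.
        s/:.*[[:blank:]]apple/: apple/
        # 3. Delete lines that have no blank-preceded "apple|" token at all.
        /[[:blank:]]apple|/!d
    '
}

# Try it on two lines from the question.
first_apple <<'EOF'
OG3: cucumber|veg_1 apple|fruits_9  carrot|veg_2
OG5: pineapple|fruit_2 pineapple|fruit_2 apple|fruits_1 pineapple|fruit_2
EOF
```

Step 1 works because a regex match always starts at the leftmost possible position, so the first `apple|` token on the line is the one that is kept.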

Upvotes: 1

Haifeng Zhang

Reputation: 31915

If awk is okay for you:

Save the input lines to a file, here sample.csv.

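 awk '{for(x=1;x<=NF;x++){if(substr($x,1,6)=="apple|"){print $1, $x; next}}}' sample.csv
  • The for loop iterates over the fields of each line.
  • substr($x, 1, 6) takes the first six characters of the field and compares them with "apple|". awk strings are 1-indexed; with a start of 0, some awks (gawk among them) return only the first five characters, so the test would never match. On a match, print $1, $x prints the label and the field, and next skips the remaining fields of the current line.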

Output:

OG1: apple|fruits_1
OG2: apple|fruits_5
OG3: apple|fruits_9
OG4: apple|fruits_3
OG5: apple|fruits_1
OG6: apple|fruits_5
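
The same loop can be sketched with the prefix passed in as a parameter, which avoids hard-coding the length 6. The function name `first_match` and the `prefix` variable are assumptions for illustration, not part of the answer above. The loop starts at field 2 on the assumption that field 1 is always the `OGn:` label, and `index()` compares literally, so the `|` is not treated as a regex operator.

```shell
#!/bin/sh
# Hypothetical helper: $1 is the prefix to look for; lines are read from stdin.
first_match() {
    awk -v prefix="$1" '
    {
        for (i = 2; i <= NF; i++) {
            # index() returns 1 only when the field starts with the prefix
            if (index($i, prefix) == 1) {
                print $1, $i
                next    # stop at the first hit and move on to the next line
            }
        }
    }'
}

# Try it on three lines from the question.
first_match 'apple|' <<'EOF'
OG1: apple|fruits_1 cucumber|veg_1 apple|fruits_1  carrot|veg_2
OG3: cucumber|veg_1 apple|fruits_9  carrot|veg_2
OG5: pineapple|fruit_2 pineapple|fruit_2 apple|fruits_1 pineapple|fruit_2
EOF
```

Lines without any matching field print nothing, and `pineapple|...` fields are skipped because the prefix has to sit at the start of the field.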

Upvotes: 3
