Reputation: 411
$ echo "ABC XYZ 12/123/52/ ABBDBDAD 562.4224.32 02381831522" | sed 's/[^a-zA-Z]//g' > raw.tmp
Using the command above, I am trying to extract ABC XYZ from the line, with the space preserved. Instead, my regex returns ABCXYZABBDBDAD, because it deletes every non-letter character, spaces included. I am new to regex and still have a lot to learn.
In summary, how do I extract the leading substring ABC XYZ, that is, the text that comes before the first number preceded by whitespace?
Upvotes: 1
Views: 440
Reputation: 730
You need to write the following:
echo "ABC XYZ 12/123/52/ ABBDBDAD 562.4224.32 02381831522" | sed 's/.*\(ABC XYZ\).*/\1/g'
Output
ABC XYZ
The point is that I believe you are trying to extract exactly 'ABC XYZ', so this matches that text and replaces the entire line with it.
Edit: I think I missed the point. You basically want 'Str1 Str2 '.
In that case, the following works (note that the result keeps the trailing space):
echo "ABC XYZ 12/123/52/ ABBDBDAD 562.4224.32 02381831522" | sed 's/\([a-zA-Z ][a-zA-Z ]*\).*/\1/g'
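To make the trailing space visible, here is a small sketch that runs the expression above and wraps the result in brackets. The second command adds an extra `s/ *$//` expression to strip the space; that extra step and the variable names are my own additions, not part of the answer.

```shell
line='ABC XYZ 12/123/52/ ABBDBDAD 562.4224.32 02381831522'

# Expression from the answer: the captured group ends with the space before "12".
with_space=$(printf '%s\n' "$line" | sed 's/\([a-zA-Z ][a-zA-Z ]*\).*/\1/')

# Same expression, followed by a second substitution that removes trailing spaces.
trimmed=$(printf '%s\n' "$line" | sed -e 's/\([a-zA-Z ][a-zA-Z ]*\).*/\1/' -e 's/ *$//')

# Brackets make the trailing space visible.
printf '[%s]\n' "$with_space" "$trimmed"
```

This should print `[ABC XYZ ]` followed by `[ABC XYZ]`.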
Upvotes: 0
Reputation: 290045
This should do it:
$ echo "ABC XYZ 12/123/52/ ABBDBDAD 562.4224.32 02381831522" | sed -n 's/\([A-Z]* [A-Z]*\) [0-9]*.*/\1/p'
ABC XYZ
sed -n 's/\([A-Z]* [A-Z]*\) [0-9]*.*/\1/p'
\([A-Z]* [A-Z]*\) == capture WORD + space + WORD
 [0-9]*.* == space + some digits + rest of string
/\1/p == replace the line with the captured string and print it
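Because every `*` in that pattern can also match nothing, it accepts empty words and does not actually require a digit. A stricter variant, as a sketch only: the `extract` function name is hypothetical, and the anchoring with `^` plus the "at least one capital letter, then a space and a digit" rule are assumptions about what the input looks like.

```shell
extract() {
  # Anchored at the start of the line; each word needs at least one capital
  # letter, and the second word must be followed by a space and a digit.
  sed -n 's/^\([A-Z][A-Z]* [A-Z][A-Z]*\) [0-9].*/\1/p'
}

match=$(printf '%s\n' 'ABC XYZ 12/123/52/ ABBDBDAD 562.4224.32 02381831522' | extract)
no_match=$(printf '%s\n' 'no leading capitals here 42' | extract)

printf '[%s] [%s]\n' "$match" "$no_match"
```

With `-n` and the `p` flag, lines that do not match produce no output at all, so the second result is empty.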
Upvotes: 3
Reputation: 75548
Perhaps this one
echo "ABC XYZ 12/123/52/ ABBDBDAD 562.4224.32 02381831522" | sed -n 's/^\([a-zA-Z ]\+\).*/\1/gp' > raw.tmp
Or more accurately
echo "ABC XYZ 12/123/52/ ABBDBDAD 562.4224.32 02381831522" | sed -n 's/^\([a-zA-Z][a-zA-Z ]\+[a-zA-Z]\).*/\1/gp'
This restricts the match so that it begins with a letter and ends with a letter, which means no trailing space is captured.
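Note that `\+` in a basic regular expression is a GNU extension and may not work in other sed implementations. A roughly equivalent sketch using extended regular expressions, assuming a sed that supports `-E` (GNU and BSD sed both do); the variable names are only for illustration:

```shell
line='ABC XYZ 12/123/52/ ABBDBDAD 562.4224.32 02381831522'

# One word of letters, optionally followed by more words, each preceded by
# exactly one space. The match cannot end on a space, so nothing needs trimming.
result=$(printf '%s\n' "$line" | sed -E 's/^([A-Za-z]+( [A-Za-z]+)*).*/\1/')

printf '%s\n' "$result"
```

Unlike the two-letter-minimum pattern above, this also accepts a line that starts with a single word.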
Upvotes: 2