j4nd3r53n

Reputation: 840

How can I pick out a token from a string, in ksh/bash?

I imagine this must have been asked several times, but I simply can't find a good match; sometimes you just can't think of the right search, I suppose.

So, this is my problem - I want to analyse SQL strings like the following in a script:

select * from my_table where col1 = #ABC# and col2 like "%#DEF#%";

I want to, somehow, fish out the tokens ABC and DEF by parsing for the delimiting #s. Trying with sed, I get something like:

# echo "something#ABC#else" | sed 's/.*\(#..*#\).*/\1/g'
#ABC#

but that only catches one, if there are more:

# echo "something#ABC#else something#DEF#else" | sed 's/.*\(#..*#\).*/\1/g'
#DEF#

It seems I'm pursuing the wrong lines here - is there a better way?

Upvotes: 0

Views: 371

Answers (2)

markp-fuso

Reputation: 34876

Assumptions:

  • there is an even number of delimiters (#), e.g. '#ABC#def#GHI#' would be valid but '#ABC#def#' would not
  • the output should not include the delimiter
  • each parsed token is placed on a separate/new line
  • we're interested in ALL characters (not just letters/numbers) that fall between a pair of delimiters

With an even number of delimiters we can have awk display the even numbered fields, eg:

$ echo "something#ABC#else something#DEF#else" | awk -F"#" '{ for (i=2; i<=NF; i+=2) { print $i } }'
ABC
DEF
  • -F"#" - designate # as awk's input field delimiter
  • for (i=2; i<=NF; i+=2) - loop through our even numbered fields using i as our index
  • print $i - print the ith field

Or, if you want to avoid the pipeline (and the subshell it creates for echo ... |), you could use a here-string, which bash, ksh93 and zsh all support:

$ awk -F"#" '{ for (i = 2; i <= NF; i+=2) { print $i } }' <<< "something#ABC#else something#DEF#else"
ABC
DEF
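
To check the "even number of delimiters" assumption rather than rely on it, the same awk loop can be wrapped in a small function. This is only a sketch: extract_tokens is a made-up name, and treating an odd number of # as an error is my assumption about what you would want. It feeds the string to awk with printf and a pipe so that it also runs in a plain POSIX sh, not only in bash/ksh.

```shell
# extract_tokens: hypothetical helper name, not from the original answer.
# Prints every token found between a pair of # delimiters, one per line.
# Returns non-zero if any input line has an odd number of # delimiters.
extract_tokens() {
    printf '%s\n' "$1" | awk -F'#' '
        # NF fields means NF-1 delimiters, so an even NF is an odd # count
        NF > 0 && NF % 2 == 0 { bad = 1; next }
        # otherwise print the even-numbered fields, as in the one-liner above
        { for (i = 2; i <= NF; i += 2) print $i }
        END { exit (bad ? 1 : 0) }
    '
}

sql='select * from my_table where col1 = #ABC# and col2 like "%#DEF#%";'
extract_tokens "$sql" || echo "unbalanced # delimiters" >&2
```

For the SQL string from the question this prints ABC and DEF on separate lines. For a string such as '#ABC#def#' it prints nothing and returns a non-zero status.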

Upvotes: 1

iamauser

Reputation: 11479

$ echo "something#ABC#else something#DEF#else" | grep -oP '(?<=#)[A-Z0-9a-z]+(?=#)'
ABC
DEF

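This uses grep -o with a Perl-style regex (-P, available in GNU grep): a lookbehind and a lookahead require a # on either side of the match without including it in the output. Note that the lookarounds do not consume the delimiters, so alphanumeric text sitting between two tokens is matched as well, e.g. '#ABC#def#GHI#' yields ABC, def and GHI. The character class also limits tokens to letters and digits.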
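
If -P is not available, or if each # should belong to exactly one token, a variant that matches the delimiters and then strips them may work. This is a sketch under assumptions: extract_tokens_grep is a hypothetical name, it relies on grep -o (present in GNU and BSD grep, but not required by POSIX), and it accepts any characters except # inside a token.

```shell
# extract_tokens_grep: hypothetical helper name used for illustration.
# Prints every "#token#" occurrence in $1 without the delimiters, one per line.
extract_tokens_grep() {
    # grep -o prints each "#...#" match on its own line; because the match
    # includes both delimiters, a # is never shared between two tokens.
    # tr -d then deletes the # characters from every match.
    printf '%s\n' "$1" | grep -o '#[^#]*#' | tr -d '#'
}

extract_tokens_grep 'select * from my_table where col1 = #ABC# and col2 like "%#DEF#%";'
```

This prints ABC and DEF on separate lines, and for '#ABC#def#GHI#' it prints only ABC and GHI.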

Upvotes: 1
