Leo

Reputation: 123

Bash Shell - Return substring after second occurrence of certain character

I need to return everything after a delimiter of my choosing, but I still don't fully know how to use sed. What I need to do is:

$ echo "ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,," \
  | sed <some regexp>

For this example, the result should be the substring after the second comma:

123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,

I can do this with cut like this: echo "ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,," | cut -d',' -f 3-

but I've been told cut is slower than sed...

Can some guru who has a few minutes to spare (and wants to... :) ) advise me, please? Thanks! Leo

Upvotes: 4

Views: 9034

Answers (3)

Vijay Nirmal

Reputation: 5837

This method finds the byte offset of the second occurrence of the delimiter and then uses bash substring expansion to get the required result:

input="ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,"
# Quote "$input" so repeated spaces are preserved and the offsets stay correct.
index=$(($(echo "$input" | grep -aob ',' | grep -oE '^[0-9]+' | awk 'NR==2') + 1))
result=${input:$index}
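
As an alternative to computing an index, here is a minimal sketch that uses only bash prefix removal (`${var#pattern}`), applied once per delimiter. This is a swapped-in technique, not the method of the answer above, and the variable name `rest` is just an illustration.

```shell
input='ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,'

# "#*," removes the shortest prefix that ends in a comma,
# i.e. everything up to and including the first comma.
rest=${input#*,}
# Apply it a second time to drop the text up to the second comma.
rest=${rest#*,}

printf '%s\n' "$rest"
```

No external process is started, and quoting `"$rest"` keeps any repeated spaces in the remaining text intact.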

Upvotes: 0

chepner

Reputation: 531055

You could also try doing the extraction in bash without spawning an external process at all:

$ [[ 'ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,' =~ [^,]*,[^,]*,(.*) ]]
$ echo "${BASH_REMATCH[1]}"
123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,

or

$ FOO='ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,'
$ shopt -s extglob
$ echo "${FOO/+([^,]),+([^,]),}"

or

$ IFS=, read -a FOO <<< 'ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,'
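$ (IFS=,; echo "${FOO[*]:2}")

(Note that read drops the final empty field, so this prints one trailing comma fewer than the original string.)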

(Assuming this is for a one-off match, not iterating over the contents of a file.)
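
If the input is a file rather than a single string, a rough sketch of the usual approach is to let one external process handle every line. The sample data and the temporary file below are made up for illustration.

```shell
# Hypothetical sample data written to a temporary file.
tmp=$(mktemp)
printf '%s\n' 'a,b,c,d' \
  'ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,' > "$tmp"

# A single cut process reads the whole file and prints
# fields 3 onward of each comma-delimited line.
result=$(cut -d',' -f3- "$tmp")
printf '%s\n' "$result"

rm -f "$tmp"
```

Starting one process for the whole file avoids paying the process start-up cost once per line.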

Upvotes: 0

Thor

Reputation: 47099

In my experience cut is always faster than sed.

To do what you want with sed, you could use a repeated group:

echo 'ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,' |
  sed -r 's/([^,]*,){2}//'

This removes the first two fields (provided the fields do not themselves contain commas) by deleting two repetitions, {2}, of a run of non-comma characters, [^,]*, followed by a comma.

Output:

123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,
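
The same substitution could be wrapped in a small helper so that the delimiter and the count are parameters. This is only a sketch: the function name `after_nth` is hypothetical, it uses `-E` (accepted by both GNU and BSD sed) instead of `-r`, and it assumes a single-character delimiter that is not special to sed, such as `/`, `\`, `]` or `^`.

```shell
# after_nth DELIM N: read lines on stdin and print what follows
# the Nth occurrence of DELIM on each line.
# Assumption: DELIM is one plain character (not /, \, ] or ^).
after_nth() {
  local delim=$1 n=$2
  # Anchored at the start of the line: delete N repetitions of
  # "non-delimiter characters followed by one delimiter".
  sed -E "s/^([^${delim}]*${delim}){${n}}//"
}

echo 'ABC  DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,' | after_nth , 2
```

Lines that contain fewer than N delimiters do not match the pattern and are printed unchanged.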

Upvotes: 3
