Reputation: 123
I need to return everything after a delimiter of my choosing, but I still don't fully know how to use sed. What I need to do is:
$ echo "ABC DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,," \
| sed <some regexp>
For this example, the result should be the substring containing everything after the second comma:
123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,
I can do this with cut like this:
echo "ABC DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,," | cut -d',' -f 3-
but I've been told cut is slower than sed...
Could some guru who has a few minutes to spare (and wants to... :) ) advise me, please? Thanks! Leo
Upvotes: 4
Views: 9034
Reputation: 5837
This method finds the index of the second occurrence of the delimiter and then uses bash substring expansion to get the required result:
input="ABC DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,"
index=$(($(echo "$input" | grep -aob ',' | grep -oE '[0-9]+' | awk 'NR==2') + 1))
result=${input:$index}
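As a sketch of an alternative (not part of the approach above), the same substring can be obtained with prefix-removal parameter expansion alone, stripping the shortest leading match of "*," twice. The variable name rest is just an illustration:

```shell
#!/usr/bin/env bash
input="ABC DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,"

# ${var#*,} removes the shortest prefix ending in a comma,
# i.e. the first field together with its delimiter.
rest=${input#*,}   # drop the first field
rest=${rest#*,}    # drop the second field

printf '%s\n' "$rest"
```

If the input contains fewer than two commas, the expansion simply leaves whatever could not be matched, so a real script may want to check for that case first.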
Upvotes: 0
Reputation: 531055
You could also try doing the extraction in bash itself, without spawning an external process at all:
$ [[ 'ABC DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,' =~ [^,]*,[^,]*,(.*) ]]
$ echo "${BASH_REMATCH[1]}"
123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,
or
$ shopt -s extglob   # the +(...) pattern needs extended globbing
$ FOO='ABC DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,'
$ echo "${FOO/+([^,]),+([^,]),}"
or
$ IFS=, read -a FOO <<< 'ABC DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,'
$ echo ${FOO[@]:2}
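(Assuming this is for a one-off match, not iterating over the contents of a file. Note that the last variant prints the remaining fields separated by spaces rather than commas, because the array elements are expanded as separate words.)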
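If you did need to run the regex match over several lines, a minimal sketch could look like the following. The function name after_second_comma and the sample input are made up for illustration, and for large files cut or sed will most likely be quicker than a bash loop:

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads lines on stdin and prints everything
# after the second comma. Lines with fewer than two commas are skipped.
after_second_comma() {
    local line
    local re='^[^,]*,[^,]*,(.*)$'
    while IFS= read -r line; do
        if [[ $line =~ $re ]]; then
            printf '%s\n' "${BASH_REMATCH[1]}"
        fi
    done
}

# Sample input (an assumption, not taken from a real file).
out=$(printf '%s\n' \
    'ABC DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,' \
    'a,b,c,d' | after_second_comma)

printf '%s\n' "$out"
```

Keeping the regex in a variable avoids quoting surprises inside [[ ... =~ ... ]].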
Upvotes: 0
Reputation: 47099
In my experience cut is always faster than sed.
To do what you want with sed, you could use a repeated group:
echo 'ABC DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,' |
sed -r 's/([^,]*,){2}//'
This removes the first two fields (provided the fields do not contain commas themselves) by matching a run of non-comma characters [^,]* followed by a comma, twice ({2}).
Output:
123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,
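To convince yourself that the two tools agree on this input, here is a small sketch that runs both and prints the results side by side. It uses sed -E (the more portable spelling of -r) and anchors the pattern with ^; the variable names are only illustrative:

```shell
#!/usr/bin/env bash
line='ABC DE,FG_HI J,123.XYZ-A1,DD/MM/YYYY HH24:MI:SS,,,'

# sed: delete two leading "field," groups.
with_sed=$(printf '%s\n' "$line" | sed -E 's/^([^,]*,){2}//')

# cut: keep field 3 through the end of the line.
with_cut=$(printf '%s\n' "$line" | cut -d',' -f3-)

printf 'sed: %s\ncut: %s\n' "$with_sed" "$with_cut"
```

To compare speed on your own data, you could wrap each pipeline in time and feed it a large file instead of a single line.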
Upvotes: 3