Reputation: 55
I have a file such as
1,ab012a800,20141205
2,ab023a801,20141205
3,ab012a802,20141205
1,ab024a803,20141205
1,ab012a804,20141205
I want to extract the 'ab012a' part and append that to the end of the line.
1,ab012a800,20141205,ab012a
2,ab023a801,20141205,ab023a
3,ab012a802,20141205,ab012a
1,ab024a803,20141205,ab024a
1,ab012a804,20141205,ab012a
I can extract with grep :
grep -o '[a-z][a-z][0-9]*[a-z]' file
and append to a line with sed :
sed "s/$/,whatever/"
or even append only to the lines that match the pattern with sed :
sed '/[a-z][a-z][0-9]*[a-z]/ s/$/something/' file
but how would I append the matching pattern to the end of the line?
Many thanks
Upvotes: 4
Views: 1481
Reputation: 1517
awk '{print $1"," substr($0,3,6)}' file
1,ab012a800,20141205,ab012a
2,ab023a801,20141205,ab023a
3,ab012a802,20141205,ab012a
1,ab024a803,20141205,ab024a
1,ab012a804,20141205,ab012a
Upvotes: 0
Reputation:
GAWK way
awk 'match($0,/[a-z][a-z][0-9]+[a-z]/,a){print $0","a[0]}' file
match() stores the matched text in a[0] (the array argument is a gawk extension); the line is then printed, followed by a comma and the matched string.
Alternative portable awk way (courtesy of EdMorton)
awk 'match($0,/[a-z][a-z][0-9]+[a-z]/){$0=$0","substr($0,RSTART,RLENGTH)}1' file
And with character classes for maximum portability
awk 'match($0,/[[:lower:]][[:lower:]][[:digit:]]+[[:lower:]]/){
$0=$0","substr($0,RSTART,RLENGTH)}1' file
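To see what the portable variants rely on, here is a small sketch that prints RSTART, RLENGTH and the extracted substring for one line. The sample line is taken from the question; the `line` and `result` variable names are only for illustration, and the plain bracket ranges are used rather than the character classes.

```shell
# One sample line from the question (illustrative variable name)
line='1,ab012a800,20141205'

# match() sets RSTART (1-based start of the match) and RLENGTH (its length);
# substr() then pulls out exactly the matched text.
result=$(printf '%s\n' "$line" |
  awk 'match($0, /[a-z][a-z][0-9]+[a-z]/) { print RSTART, RLENGTH, substr($0, RSTART, RLENGTH) }')

printf '%s\n' "$result"
```

The match starts at the third character and is six characters long, which is why substr($0,RSTART,RLENGTH) yields ab012a without hard-coding any positions.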
Upvotes: 2
Reputation: 5092
You can use this GNU awk:
awk -F"," '{print $1","$2","$3"," gensub(/(.*)(...$)/, "\\1", "g", $2)}' FileName
Output :
1,ab012a800,20141205,ab012a
2,ab023a801,20141205,ab023a
3,ab012a802,20141205,ab012a
1,ab024a803,20141205,ab024a
1,ab012a804,20141205,ab012a
Upvotes: 1
Reputation: 10039
sed 's/.\(.\{7\}\).*/&\1/' YourFile
This relies purely on character positions (skip one character, capture the next seven), so it only works for input laid out exactly like this sample.
Upvotes: 0
Reputation: 784898
You can use:
sed -i.bak 's/\(,[a-z][a-z][0-9]*[a-z]\).*$/&\1/' file
1,ab012a800,20141205,ab012a
2,ab023a801,20141205,ab023a
3,ab012a802,20141205,ab012a
1,ab024a803,20141205,ab024a
1,ab012a804,20141205,ab012a
& is a special symbol in the replacement that represents the full string matched by the regex, and \1 represents capture group #1.
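As a rough check of how & and \1 combine, the same substitution can be run on a pipe instead of editing the file in place. This is only a sketch: the two input lines come from the question, and the `input` and `result` variable names are made up for the example.

```shell
# Two sample lines from the question (illustrative variable names)
input='1,ab012a800,20141205
2,ab023a801,20141205'

# & re-emits everything the regex matched (from the first comma to end of line);
# \1 re-emits only the captured ",ab012a"-style prefix, which lands at the end.
result=$(printf '%s\n' "$input" | sed 's/\(,[a-z][a-z][0-9]*[a-z]\).*$/&\1/')

printf '%s\n' "$result"
```

Because the capture group includes the leading comma, no separate comma has to be written in the replacement.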
Upvotes: 5
Reputation: 184965
With capture groups :
$ sed -r 's@^([0-9]+,)(ab[0-9]+[a-z]+)(.*)@\1\2\3,\2@g' file
1,ab012a800,20141205,ab012a
2,ab023a801,20141205,ab023a
3,ab012a802,20141205,ab012a
1,ab024a803,20141205,ab024a
1,ab012a804,20141205,ab012a
Upvotes: 0