Reputation: 77
I'm after something I guess is pretty complex, and I'm pretty bad with regexes, so you might be able to help.
See this data source:
User ID:
a123456
a12345f
a1234e6
d123d56
b12c456
c1b3456
ba23456
Basically, what I want to do is use a regex/sed to replace all occurrences of letters with numbers, EXCEPT the first letter. Letters will always match their alphabet position, e.g. a = 1, b = 2, c = 3, etc.
So the result set should look like this:
User ID:
a123456
a123456
a123456
d123456
b123456
c123456
b123456
There will also never be any letters other than a-j, and the string will always be 7 chars long.
Can anyone shed some light? Thanks! :)
Upvotes: 0
Views: 4450
Reputation: 15877
I don't see the complexity. Your samples look like you just want to replace six of seven characters with the numbers 1-6:
s/^\([a-j0-9]\)[a-j0-9]\{6\}/\1123456/
Since the numbers to put there are defined by position, we don't care what the letter was (or even if it was a letter). The downside here is that we don't preserve the numbers, but they never varied in your sample data.
If we want to replace only letters, the first method I can think of involves simply using multiple substitutions:
s/^\([a-j0-9]\{1\}\)[a-j]/\11/
s/^\([a-j0-9]\{2\}\)[a-j]/\12/
s/^\([a-j0-9]\{3\}\)[a-j]/\13/
s/^\([a-j0-9]\{4\}\)[a-j]/\14/
s/^\([a-j0-9]\{5\}\)[a-j]/\15/
s/^\([a-j0-9]\{6\}\)[a-j]/\16/
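To try the six substitutions above in one go, they can be passed to a single sed call. This is only a sketch: the function name fix_by_position is made up for illustration, and it relies on the same assumption as the commands themselves, namely that the digit to insert is determined by its position.

```shell
# Hypothetical helper: apply the six position-based substitutions to stdin.
# Assumes the digit at position N (after the first character) is always N.
fix_by_position() {
  sed -e 's/^\([a-j0-9]\{1\}\)[a-j]/\11/' \
      -e 's/^\([a-j0-9]\{2\}\)[a-j]/\12/' \
      -e 's/^\([a-j0-9]\{3\}\)[a-j]/\13/' \
      -e 's/^\([a-j0-9]\{4\}\)[a-j]/\14/' \
      -e 's/^\([a-j0-9]\{5\}\)[a-j]/\15/' \
      -e 's/^\([a-j0-9]\{6\}\)[a-j]/\16/'
}

printf '%s\n' 'User ID:' a12345f d123d56 ba23456 | fix_by_position
# prints:
# User ID:
# a123456
# d123456
# b123456
```

The header line passes through unchanged here because it starts with U, which is not in [a-j0-9], so none of the anchored patterns match it.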
Replacing letters with specific digits, excluding the first letter:
s/\(.\)a/\11/g
This pattern replaces two-character sequences, preserving the first character, so it would have to be run twice for each letter. Using the hold space, we could store the first character and use a simple transliteration. The tricky part is joining the two sections, because G inserts an unwanted newline between them.
# Store in hold space
h
# Remove the first character
s/^.//
# Transliterate letters
y/jabcdefghi/0123456789/
# Exchange pattern and hold space
x
# Keep the first character
s/^\(.\).*$/\1/
# Print it
#P
# Join
G
# Remove the newline
s/^\(.\)./\1/
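One way to run the script above from the command line is sketched below. Two things here are my own assumptions rather than part of the script: the wrapper name translate_ids, and the address that limits the commands to lines shaped like an ID (one letter a-j followed by six characters from a-j0-9), which leaves a header line alone. The final step removes the newline with s/\n// instead of matching it with a dot; the effect should be the same.

```shell
# Hypothetical wrapper around the hold-space approach.
# The address restricts the commands to ID-shaped lines (an assumption).
translate_ids() {
  sed '/^[a-j][a-j0-9]\{6\}$/{
    # Save the whole line in the hold space
    h
    # Drop the first character, then transliterate the rest (j=0, a=1, ...)
    s/^.//
    y/jabcdefghi/0123456789/
    # Swap: pattern space = original line, hold space = translated tail
    x
    # Keep only the first character
    s/^\(.\).*$/\1/
    # Append the translated tail, then remove the newline that G adds
    G
    s/\n//
  }'
}

printf '%s\n' 'User ID:' a12345f c1b3456 | translate_ids
# prints:
# User ID:
# a123456
# c123456
```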
Still learning about sed's capabilities :)
Upvotes: 0
Reputation: 203129
$ awk 'BEGIN{FS=OFS=""} NR>1{for (i=2;i<=NF;i++) if(p=index("jabcdefghi",$i)) $i=p-1} 1' file
User ID:
a123456
a123456
a123456
d123456
b123456
c123456
b123456
Note that the above reproduces the header line "User ID:" as-is. So far, best I can tell, all of the other posted solutions would change the header line to "Us5r ID:" since they would do the letter-to-number translation on it just like on all of the subsequent lines.
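Splitting a record into single characters with an empty FS is, as far as I know, not guaranteed by POSIX, so the command above may depend on the awk in use. A rough equivalent that walks the line with substr instead is sketched below; the function name translate_ids_awk is made up for illustration, and the mapping (j=0, a=1, ...) and the header skip are taken from the command above.

```shell
# Hypothetical helper: same idea as the one-liner, without relying on FS="".
translate_ids_awk() {
  awk 'NR > 1 {
    # Keep the first character untouched
    out = substr($0, 1, 1)
    for (i = 2; i <= length($0); i++) {
      c = substr($0, i, 1)
      # index() returns 0 for characters outside j,a..i; those are kept as-is
      p = index("jabcdefghi", c)
      out = out "" (p ? p - 1 : c)
    }
    $0 = out
  } 1'
}

printf '%s\n' 'User ID:' a12345f ba23456 | translate_ids_awk
# prints:
# User ID:
# a123456
# b123456
```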
Upvotes: 1
Reputation: 56
To replace a-j letters in a line by the corresponding digits, except the first letter, using perl:
$ perl -pe 'substr($_, 1) =~ tr/a-j/0-9/' input_file
This maps a=0, not a=1, because j would be 10 (two digits) otherwise.
J = 0, and no, only numbers 0-9 are used, and letters simply replace their number counterpart, so there will never be a letter greater than j.
To make j=0 and a=1:
$ perl -pe 'substr($_, 1) =~ tr/ja-i/0-9/' input_file
Upvotes: 3
Reputation: 10039
sed '/[a-j][0-9a-j]\{6\}$/{h;y/abcdefghij/1234567890/;G;s/.\(.\{6\}\).\(.\).*/\2\1/;}' YourFile
Upvotes: 2
Reputation: 74596
Here's one way you could do it using the standard tools cut, paste and tr:
$ paste -d'\0' <(cut -c1 file) <(cut -c2- file | tr 'abcdefghij' '1234567890')
a123456
a123456
a123456
d123456
b123456
c123456
b123456
This joins the first character of the line with the result of tr on the rest of the line, using the empty string as the delimiter. tr replaces each character found in the first list with the corresponding character of the second list.
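The command above uses process substitution, which needs a shell such as bash. The same split, translate and rejoin idea can be sketched in plain POSIX sh as below. The function name translate_ids_tr and the case pattern that skips anything not shaped like a 7-character ID are my own assumptions, and the mapping uses j=0 as clarified in the question's comments. It starts one tr per line, so it is only reasonable for small inputs.

```shell
# Hypothetical helper: split off the first character, translate the rest with
# tr, and join the two parts again.
translate_ids_tr() {
  while IFS= read -r line; do
    case $line in
      [a-j]??????)
        rest=${line#?}             # everything after the first character
        first=${line%"$rest"}      # the first character only
        printf '%s%s\n' "$first" \
          "$(printf '%s' "$rest" | tr 'jabcdefghi' '0123456789')"
        ;;
      *)
        # Not ID-shaped (for example a header line): pass through unchanged
        printf '%s\n' "$line"
        ;;
    esac
  done
}

printf '%s\n' 'User ID:' a12345f c1b3456 | translate_ids_tr
# prints:
# User ID:
# a123456
# c123456
```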
Upvotes: 3