Reputation: 3784
The input always starts with 0, followed by a sequence of c1 + occurrence pairs, where c1 is a character and occurrence is the number of times that character repeats consecutively.
For example, aabbaaacccc becomes 0a2b2a3c4. Characters will always be lowercase a-z.
My issue is with the following input:
0x1k1c4t11g3d1m1d1j10f1v1n3e2r3i1e2a1h4a2e1y1z2e1s1a1q1j2r1k2t3h1i1f4j1d2m4p3
However, when I use String.split() and iterate through the results, I get empty strings. I tried both split("[0-9]") and split("[^a-z]"), but the result does not change.
The iteration result for my example is:
x
k
c
t
g
d
m
d
j
f
v
n
e
r
i
e
a
h
a
e
y
z
e
s
a
q
j
r
k
t
h
i
f
j
d
m
p
Is this a bug in the JDK, or is there something wrong with my regex?
Upvotes: 0
Views: 141
Reputation: 16373
You need to split on a zero-length pattern. A lookahead assertion, which matches a position without consuming any characters, is the way to go:
String str = "0d1j10f1";
// Split at every position that is immediately followed by a lowercase letter.
String[] parts = str.split("(?=[a-z])");
// parts: ["0", "d1", "j10", "f1"]
Also, as pointed out in other answers, keep in mind numbers can be multiple-digit.
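To make the multi-digit point concrete, here is a minimal sketch that splits with the lookahead and then reads each token as a letter followed by its count. The class name LookaheadSplitDemo and the decode helper are hypothetical names chosen for illustration, and the sketch assumes the input follows the format described in the question.

```java
public class LookaheadSplitDemo {

    // Hypothetical helper: expands an encoded string such as "0a2b2a3c4"
    // back into "aabbaaacccc". Assumes well-formed input (letter + count).
    public static String decode(String encoded) {
        StringBuilder out = new StringBuilder();
        // Zero-width split: each token starts with a letter, except the leading "0".
        for (String token : encoded.split("(?=[a-z])")) {
            if (token.isEmpty() || !Character.isLetter(token.charAt(0))) {
                continue; // skip the leading "0" prefix (or an empty input)
            }
            char letter = token.charAt(0);
            // The rest of the token is the count, which may have several digits.
            int count = Integer.parseInt(token.substring(1));
            for (int i = 0; i < count; i++) {
                out.append(letter);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("0a2b2a3c4")); // prints aabbaaacccc
    }
}
```

Because the count is taken with substring(1), a token such as "t11" is read as the letter t with a count of 11 rather than being cut after the first digit.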
Upvotes: 0
Reputation: 21
I don't think it is a good idea to use String.split() in this case. It would be better to use substring() and charAt(), together with a loop that reads each character and its number of occurrences.
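One way this suggestion might look in practice is sketched below. The class name CharAtScanDemo, the parseRuns helper, and the "letter:count" output format are assumptions made for illustration, not something from the question; the sketch also assumes the input is well formed.

```java
import java.util.ArrayList;
import java.util.List;

public class CharAtScanDemo {

    // Hypothetical helper: walks the string with charAt() and collects
    // each run as "letter:count", e.g. "t:11". No regex involved.
    public static List<String> parseRuns(String encoded) {
        List<String> runs = new ArrayList<>();
        int i = 0;
        // Skip the leading digit prefix (the initial "0").
        while (i < encoded.length() && Character.isDigit(encoded.charAt(i))) {
            i++;
        }
        while (i < encoded.length()) {
            char letter = encoded.charAt(i++);
            int start = i;
            // Advance over every digit of the count, however many there are.
            while (i < encoded.length() && Character.isDigit(encoded.charAt(i))) {
                i++;
            }
            runs.add(letter + ":" + encoded.substring(start, i));
        }
        return runs;
    }

    public static void main(String[] args) {
        System.out.println(parseRuns("0t11g3d1")); // prints [t:11, g:3, d:1]
    }
}
```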
Upvotes: 0
Reputation: 26
The problem here seems to be that you split on a regex that matches exactly one digit or character. When a count has two digits, for example t11, each digit is treated as a separate delimiter, and the empty space between the two digits comes back as an empty string. If you want the whole number to act as a single delimiter, put + after the character class. In this case you should use split("[0-9]+"), which matches the whole number no matter how many digits it has.
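A small sketch comparing the two patterns on a shortened input; the class and method names are hypothetical. Note that because the input starts with a digit, the first element of the result is still an empty string, even with the + quantifier.

```java
import java.util.Arrays;

public class SplitOnDigitRunsDemo {

    // Splits on single digits: consecutive digits leave empty strings behind.
    public static String[] splitSingleDigit(String encoded) {
        return encoded.split("[0-9]");
    }

    // Splits on runs of digits: "11" counts as one delimiter.
    public static String[] splitDigitRuns(String encoded) {
        return encoded.split("[0-9]+");
    }

    public static void main(String[] args) {
        String s = "0x1k1c4t11g3";
        System.out.println(Arrays.toString(splitSingleDigit(s))); // prints [, x, k, c, t, , g]
        System.out.println(Arrays.toString(splitDigitRuns(s)));   // prints [, x, k, c, t, g]
    }
}
```

The leading empty element comes from the 0 at the start of the input, so the caller would still need to skip index 0.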
Upvotes: 1
Reputation: 4111
As Vulcan states in his comment, this is caused by two or more consecutive digits, which produce an empty String between them. You could either remove the digits from the String first or remove the empty Strings from the resulting array. For instance:
String s = "0x1k1c4t11g3d1m1d1j10f1v1n3e2r3i1e2a1h4a2e1y1z2e1s1a1q1j2r1k2t3h1i1f4j1d2m4p3";
// Remove every digit, leaving only the letters.
s = s.replaceAll("\\d", "");
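Both options mentioned above can be sketched as follows. The class and method names are hypothetical, and the stream-based filter is just one possible way to drop the empty Strings, assuming Java 8 or later.

```java
import java.util.Arrays;

public class RemoveEmptiesDemo {

    // Option 1: strip the digits first, leaving one String of letters.
    public static String lettersOnly(String encoded) {
        return encoded.replaceAll("\\d", "");
    }

    // Option 2: split on single digits, then filter out the empty Strings.
    public static String[] splitWithoutEmpties(String encoded) {
        return Arrays.stream(encoded.split("\\d"))
                .filter(part -> !part.isEmpty())
                .toArray(String[]::new);
    }

    public static void main(String[] args) {
        System.out.println(lettersOnly("0t11g3"));                          // prints tg
        System.out.println(Arrays.toString(splitWithoutEmpties("0t11g3"))); // prints [t, g]
    }
}
```

Either way the counts are discarded, so this only fits if the letters alone are needed.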
Upvotes: 0