Reputation: 65
Regex isn't my strongest point. Let's say I need a custom parser that strips a string of all letters and of any decimal points after the first.
For example, given the input string "--1-2.3-gf5.47", the parser would return "-12.3547". I could only come up with variations of this:
string.replaceAll("[^(\\-?)(\\.?)(\\d+)]", "")
which removes the letters but retains everything else. Any pointers?
More examples: Input: -34.le.78-90 Output: -34.7890
Input: df56hfp.78 Output: 56.78
Some rules:
Upvotes: 2
Views: 279
Reputation: 1546
In terms of regex, the second and later decimal points seem tough to remove. However, this one should remove the additional dashes and letters: (?<=.)-|[a-zA-Z]. (Hopefully the syntax is the same in Java; this is a Python regex, but my understanding is that regex syntax is relatively uniform across languages.)
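As a quick check on the Java side, here is a minimal sketch that feeds the same pattern to String.replaceAll. The class and method names (RegexStrip, strip) are made up for illustration. It suggests the pattern carries over unchanged, and it also shows the limitation mentioned above: the extra decimal points are left in place.

```java
// Hypothetical helper, not part of the original question: applies the
// lookbehind pattern from this answer using Java's String.replaceAll.
class RegexStrip {
    static String strip(String input) {
        // (?<=.)-    a hyphen preceded by any character, i.e. every hyphen except one at index 0
        // [a-zA-Z]   any ASCII letter
        return input.replaceAll("(?<=.)-|[a-zA-Z]", "");
    }

    public static void main(String[] args) {
        System.out.println(strip("--1-2.3-gf5.47")); // -12.35.47 (second dot survives)
        System.out.println(strip("-34.le.78-90"));   // -34..7890 (second dot survives)
        System.out.println(strip("df56hfp.78"));     // 56.78
    }
}
```

So the regex alone covers the hyphens and letters, and a second step is still needed for the decimal points.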
That being said, it seems like you could just run a pretty short "finite state machine"-type piece of code to scan the string and rebuild the reduced string yourself like this:
a = "--1-2.3-gf5.47"
new_a = ""
dash = False  # True once a dash or a digit has been seen; later dashes are dropped
dot = False   # True once the first dot has been kept
nums = '0123456789'
for char in a:
    if char in nums:
        new_a = new_a + char  # record a match to nums
        dash = True  # we saw a number first, so we won't use any dashes from now on
    elif char == '-' and not dash:
        new_a = new_a + char  # a dash before any digit or other dash is kept
        dash = True  # activate the flag
    elif char == '.' and not dot:
        new_a = new_a + char  # take the first dot
        dot = True  # put up the dot flag
print(new_a)  # -12.3547
(Again, sorry for the syntax; I think you need some curly brackets around the statements, versus Python's indentation-only style.)
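For reference, the scan above might translate to Java roughly as follows. This is only a sketch under the same rules as the Python loop: the class name StateMachineStrip, the method name strip, and the flag names are hypothetical, and a StringBuilder stands in for the repeated string concatenation.

```java
// Hypothetical Java port of the Python loop above; names are illustrative only.
class StateMachineStrip {
    static String strip(String input) {
        StringBuilder out = new StringBuilder();
        boolean signClosed = false; // true once a digit or a dash has been kept
        boolean dotSeen = false;    // true once the first dot has been kept
        for (char c : input.toCharArray()) {
            if (c >= '0' && c <= '9') {
                out.append(c);      // digits are always kept
                signClosed = true;  // no dash is accepted after a digit
            } else if (c == '-' && !signClosed) {
                out.append(c);      // keep only a dash that precedes every digit
                signClosed = true;
            } else if (c == '.' && !dotSeen) {
                out.append(c);      // keep only the first dot
                dotSeen = true;
            }
            // everything else (letters, extra dashes, extra dots) is dropped
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(strip("--1-2.3-gf5.47")); // -12.3547
        System.out.println(strip("-34.le.78-90"));   // -34.7890
        System.out.println(strip("df56hfp.78"));     // 56.78
    }
}
```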
Upvotes: 0
Reputation: 2449
Just tested this on ideone and it seemed to work. The comments should explain the code well enough. You can copy/paste this into Ideone.com and test it if you'd like.
It might be possible to write a single regex pattern for it, but you're probably better off implementing something simpler/more readable like below.
The three examples you gave print out:
--1-2.3-gf5.47 -> -12.3547
-34.le.78-90 -> -34.7890
df56hfp.78 -> 56.78
import java.util.*;
import java.lang.*;
import java.io.*;

/* Name of the class has to be "Main" only if the class is public. */
class Ideone
{
    public static void main (String[] args) throws java.lang.Exception
    {
        System.out.println(strip_and_parse("--1-2.3-gf5.47"));
        System.out.println(strip_and_parse("-34.le.78-90"));
        System.out.println(strip_and_parse("df56hfp.78"));
    }

    public static String strip_and_parse(String input)
    {
        //remove anything not a period or digit (including hyphens) for output string
        String output = input.replaceAll("[^\\.\\d]", "");

        //add a hyphen to the beginning of 'output' if the original string started with one
        if (input.startsWith("-"))
        {
            output = "-" + output;
        }

        //if the string contains a decimal point, remove all but the first one by splitting
        //the output string into two strings and removing all the decimal points from the
        //second half
        if (output.indexOf(".") != -1)
        {
            output = output.substring(0, output.indexOf(".") + 1)
                   + output.substring(output.indexOf(".") + 1, output.length()).replaceAll("[^\\d]", "");
        }

        return output;
    }
}
Upvotes: 1