Juns
Juns

Reputation: 521

Performance issue in String to Number conversion

I have space separated string containing numbers in between like:

"abc123 ws32wd3 y3tg43 5tga89 a1a"

I have to parse the string to get the numbers from each token and then sum up all the digits extracted from tokens. I have written below code, but what I think is, if there is huge string, then there might be performance issue.

So, my questions are:

  1. How can we improve the performance in below code?

  2. Do we have another way to write the below code to solve the problem?

Code:

public class TestSum {

    public static int doSum(String str){
        String[] sArray = str.split(" ");
        char[] chr = null;
        String temp;
        String number = "";
        int sum=0;
        for(String s : sArray){
            chr = s.toCharArray();
            for(char c : chr){
                temp = String.valueOf(c);
                if(isNum(temp)){
                    number = number + temp;
                }           
            }
            sum = sum + Integer.parseInt(number);
            number="";
        }       
        return sum;
    }

    public static boolean isNum(String nStr){   
        try{
            Integer.parseInt(nStr);
            return true;
        }catch(NumberFormatException nfe){
            return false;
        }       
    }

    public static void main(String[] args) {
        System.out.println("Sum is "+ TestSum.doSum("abc123 ws32wd3 y3tg43 5tga89 a1a"));
    }
} 

Upvotes: 2

Views: 391

Answers (6)

Adam Stelmaszczyk
Adam Stelmaszczyk

Reputation: 19837

This is the fastest I could think of:

public static int getSum(String str) 
{
    int sum = 0;
    int exp = 1;      
    for (int i = str.length() - 1; i >= 0; i--) 
    {
        final char c = str.charAt(i);
        if (c >= '0' && c <= '9')
        {
            sum += (c - '0') * exp;
            exp *= 10;
        }
        else
        {
            exp = 1;
        }
    }
    return sum;
}

It iterates through string from right to left. Thanks to that, when it "sees" a digit it can add appropriate value, depending on the decimal position "seen" in the number.

Benchmark using Caliper

Results are different than in davecom's benchmark:

AUTHOR       RUNTIME (NS)   HOW MANY TIMES FASTER THAN JUNS
-----------------------------------------------------------
Adam              66.221                                600
Old              579.873                                 70
Prabhakaran   20,012.750                                  2 (2x faster than Juns)
Juns          39,681.074                                  1

Upvotes: 5

davecom
davecom

Reputation: 1498

You can start improving the speed of the code by eliminating your isNum() method and using the built in Character.isDigit() method.

You may be able to further improve the speed by using a regular expression to extract the numbers out of each token instead of doing it with the loops.

Best of luck.

EDIT

Comparing the performance of some of the answers here, it would seem that @Prabhakaran's answer is slower than the original, while @OldCurmudgeon's is faster, and @Adam Stelmaszczyk's is the fastest :

import java.util.*;

public class TestSum {

    public static int doSum(String str){
        String[] sArray = str.split(" ");
        char[] chr = null;
        String temp;
        String number = "";
        int sum=0;
        for(String s : sArray){
            chr = s.toCharArray();
            for(char c : chr){
                temp = String.valueOf(c);
                if(isNum(temp)){
                    number = number + temp;
                }           
            }
            sum = sum + Integer.parseInt(number);
            number="";
        }       
        return sum;
    }

    public static boolean isNum(String nStr){   
        try{
            Integer.parseInt(nStr);
            return true;
        }catch(NumberFormatException nfe){
            return false;
        }       
    }

    public static void testSum1(){
        String str = "abc123 ws32wd3 y3tg43 5tga89 a1a";      
        str = str.replaceAll("[^0-9]+", " ");
        List<String> asList = Arrays.asList(str.trim().split(" "));
        int sum=0;      
        for (String string : asList) {
            sum+=Integer.parseInt(string);
        }
        System.out.println(sum);
    }

    public static int doSum2(String str) {
        int sum = 0;
        // -1 means not started.
        int start = -1;
        for ( int i = 0; i < str.length(); i++ ) {
            char ch = str.charAt(i);
            if ( Character.isDigit(ch)) {
                if ( start == -1 ) {
                    // Start of a number.
                    start = i;
                }
            } else {
                if ( start != -1 ) {
                    // End of a number.
                    sum += Integer.parseInt(str.substring(start, i));
                    start = -1;
                }
            }
        }
        if ( start != -1 ) {
            // A number at the end of the string.
            sum += Integer.parseInt(str.substring(start, str.length()));
        }
        return sum;
    }

    public static int getSum(String str) {
        int sum = 0;
        int exp = 1;      
        for (int i = str.length() - 1; i >= 0; i--) {
            final char c = str.charAt(i);
            if (c >= '0' && c <= '9'){
                sum += (c - '0') * exp;
                exp *= 10;
            }
            else{
                exp = 1;
            }
        }
        return sum;
    }

    public static void main(String[] args) {
        long startTime = System.nanoTime();
        TestSum.testSum1();
        long endTime = System.nanoTime();
        System.out.println("testSum1 took " + (endTime - startTime) + " nanoseconds");

        startTime = System.nanoTime();
        System.out.println(TestSum.doSum("abc123 ws32wd3 y3tg43 5tga89 a1a"));
        endTime = System.nanoTime();
        System.out.println("doSum took " + (endTime - startTime) + " nanoseconds");

        startTime = System.nanoTime();
        System.out.println(TestSum.doSum2("abc123 ws32wd3 y3tg43 5tga89 a1a"));
        endTime = System.nanoTime();
        System.out.println("doSum2 took " + (endTime - startTime) + " nanoseconds");

        startTime = System.nanoTime();
        System.out.println(TestSum.getSum("abc123 ws32wd3 y3tg43 5tga89 a1a"));
        endTime = System.nanoTime();
        System.out.println("getSum took " + (endTime - startTime) + " nanoseconds");
    }
} 

Here is the output

Davids-MacBook-Air:desktop dave$ javac TestSum.java
Davids-MacBook-Air:desktop dave$ java TestSum
299
testSum1 took 1790000 nanoseconds
1379
doSum took 373000 nanoseconds
299
doSum2 took 173000 nanoseconds
299
getSum took 45000 nanoseconds

Upvotes: 4

Prabhakaran Ramaswamy
Prabhakaran Ramaswamy

Reputation: 26094

    String str = "abc123 ws32wd3 y3tg43 5tga89 a1a";      
    str = str.replaceAll("[^0-9]+", " ");
    List<String> asList = Arrays.asList(str.trim().split(" "));
    int sum=0;      
    for (String string : asList) {
        sum+=Integer.parseInt(string);
    }
    System.out.println(asList);
    System.out.println(sum);

Output

str = [123, 32, 3, 3, 43, 5, 89, 1]

sum = 299

Upvotes: 3

OldCurmudgeon
OldCurmudgeon

Reputation: 65793

For maximum performance you could try something like this:

public static int doSum(String str) {
  int sum = 0;
  // -1 means not started.
  int start = -1;
  for ( int i = 0; i < str.length(); i++ ) {
    char ch = str.charAt(i);
    if ( Character.isDigit(ch)) {
      if ( start == -1 ) {
        // Start of a number.
        start = i;
      }
    } else {
      if ( start != -1 ) {
        // End of a number.
        sum += Integer.parseInt(str.substring(start, i));
        start = -1;
      }
    }
  }
  if ( start != -1 ) {
    // A number at the end of the string.
    sum += Integer.parseInt(str.substring(start, str.length()));
  }
  return sum;
}

prints 299 which my calculator confirms is 123+32+3+3+43+5+89+1

Upvotes: 3

Adrian
Adrian

Reputation: 5681

I think in order to speed up your conversion you can use the following trick:
int representation of number = character representation of number - '0'

So int 5 = char 5 - '0'
or in other words
int 5 = '5' - '0'

This is because of how the ASCII table is indexed.

Some (untested) code I wrote super fast to illustrate:

for(int i=0; i<str.length(); i++){
  if (!(str.charAt(i).isDigit()) continue;

  do {
    //now handle digit parsing into a number
    crtNumber= crtNumber*10 + str.charAt(i)-'0'
    i++
  } while(str.charAt(i).isDigit());

  queue.push(crtNumber);//save the number somewhere
  crtNumber= 0; //prepare for next round
}

Upvotes: 0

Eel Lee
Eel Lee

Reputation: 3543

Easier solution would be parse that string with regex \d finding digits, and then go through the new string (which contains only digits) and sum up every sign (digit) in that string.

You won't even have to check if you are summing up digits, because regex will do it for you.

Upvotes: 0

Related Questions