Gabriel
Gabriel

Reputation: 113

Java large for performance

I have to find the thousandth number (excluding numbers 3, 4, 7) but it's taking a long time, about 9 seconds to find out how to improve performance.

import java.sql.Date;
import java.text.DecimalFormat;
import java.text.SimpleDateFormat;

public class FirstChallange {

    private static int removedNumbers;
    private static int numberQuantity;
    private static int lastNumberFound; 
    private static int cont;
    private static int pos3;
    private static int pos4;
    private static int pos7;    

    public static void main(String[] args) 
    { 

        long inicio = System.currentTimeMillis();  

        removedNumbers = 0;
        numberQuantity = 10000000;


        for (cont = 1; removedNumbers <= numberQuantity; cont++) {      
            String str = new String(); 
            str = String.valueOf(cont);     

            pos3 = str.indexOf("3");
            pos4 = str.indexOf("4");
            pos7 = str.indexOf("7");

            if((pos3 == -1) && (pos4 == -1) && (pos7 == -1)) {
                removedNumbers++; 
                if(removedNumbers == numberQuantity){ // can not find numbers (3, 4, 7)
                    lastNumberFound = cont; 
                }
            } 
        }       

        DecimalFormat dfmt = new DecimalFormat("0");

        System.out.println(dfmt.format(lastNumberFound)); 

        long fim  = System.currentTimeMillis();   
        System.out.println(new SimpleDateFormat("ss.SSS").format(new Date(fim - inicio)));  


    }
}

Is converting numbers into string and removing them with indexOf the best mode? or is there anything better than indexOf like RabinKarp?

Expected result: 180999565 in 5/4 seconds

Upvotes: 0

Views: 125

Answers (4)

Thomas Bitonti
Thomas Bitonti

Reputation: 1234

Is there a need to search? Working on the comments above, that there are 7 digits, there is a much more efficient direct method, which is to generate the base 7 digits, then remapping them according to the exclusion scheme.

See below for the complete program text which implements three strategies: A faster "search" which uses a byte array instead of strings. This is about the same as the original algorithm, but should be much faster. A base 7 conversion using java formatting, and a direct base 7 conversion, which most likely best satisfies the original problem requirements.

Here is just the "Direct" implementation. This implementation generates the target number as a base 7 number, then remaps the digits according to the scheme to exclude digits 3, 4, and 7:

public static int findNumberDirect(int index) {
    int base = 7;

    byte[] digits = new byte[10];  // Storage for the base 7 digits.

    // Convert to base 7 by generating digits for each power of 7.

    for ( int nextDigit = 0; index > 0; nextDigit++ ) {
        int nextRem = index % base;
        index = index / base;

        digits[nextDigit] = (byte) nextRem;
    }

    // Remap the digits from base 7 to base 10 with exclusions.
    // This could be done in the prior loop.  I've kept this as
    // a separate step for clarity.

    int number = 0;

    int numDigits = digits.length;
    for ( int nextDigit = 0; nextDigit < numDigits; nextDigit++ ) {
        number *= 10;
        number += DIGIT_MAPPING[ digits[ numDigits - nextDigit - 1 ] ];
    }

    return number;
}

Here are timings as output by the program:

Index [        1 ] Search [        1 ] [   2100 (ns) ] Convert [        1 ] [   4500 (ns) ] Direct [        1 ] [   1300 (ns) ] 
Index [        4 ] Search [        6 ] [   1400 (ns) ] Convert [        6 ] [   1900 (ns) ] Direct [        6 ] [   1000 (ns) ] 
Index [       10 ] Search [       15 ] [   1900 (ns) ] Convert [       15 ] [   2000 (ns) ] Direct [     15 ] [   1000 (ns) ] 
Index [      100 ] Search [      202 ] [  26300 (ns) ] Convert [      202 ] [   2100 (ns) ] Direct [      202 ] [   1000 (ns) ] 
Index [     1000 ] Search [     2929 ] [  98300 (ns) ] Convert [     2929 ] [   2100 (ns) ] Direct [     2929 ] [   1000 (ns) ] 
Index [    10000 ] Search [    61106 ] [ 694300 (ns) ] Convert [    61106 ] [   2100 (ns) ] Direct [    61106 ] [    900 (ns) ]

Here is the complete program text:

package my.tests;

public class NumberCounter {

    // Find the thousandth number (excluding numbers 3, 4, 7).

    public static final String USAGE_TEXT = "Usage: " + NumberCounter.class.getName() + " index*";

    public static final int MAX_INDEX = 10000000;

    public static void main(String[] args) {
        for ( int argNo = 0; argNo < args.length; argNo++ ) {
            String indexText = args[argNo];
            int index = Integer.parseInt(indexText);
            if ( index <= 0 ) {
                System.out.println("Error: Index [ " + indexText + " ] is less than 1.");
                return;
            } else if ( index > MAX_INDEX ) {
                System.out.println("Error: Index [ " + indexText + " ] is greater than " + Integer.toString(MAX_INDEX) + ".");
                return;
            }

            long searchStart = System.nanoTime();
            int searchNumber = findNumberSearch(index);
            long searchEnd = System.nanoTime();

            long convertStart = System.nanoTime();
            int convertNumber = findNumberConvert(index);
            long convertEnd = System.nanoTime();

            long directStart = System.nanoTime();
            int directNumber = findNumberDirect(index);
            long directEnd = System.nanoTime();

            System.out.println("Index [ " + formatAmount((long) index) + " ]" +
                               " Search [ " + formatAmount(searchNumber) + " ] [ " + formatDuration(searchEnd, searchStart) + " ]" +
                               " Convert [ " + formatAmount(convertNumber) + " ] [ " + formatDuration(convertEnd, convertStart) + " ]" +
                               " Direct [ " + formatAmount(directNumber) + " ] [ " + formatDuration(directEnd, directStart) + " ]");                               
        }
    }

    public static String formatDuration(long end, long start) {
        return String.format("%6d (ns)", Long.valueOf(end - start));
    }

    public static String formatAmount(long amount) {
        return String.format("%8d", Long.valueOf(amount));
    }

    private final static byte[] DIGIT_MAPPING = { 0, 1, 2, 5, 6, 8, 9 };

    public static int findNumberSearch(int index) {
        byte[] digits = new byte[10];
        int numDigits = digits.length;

        for ( int nextNumber = 0; nextNumber < index; nextNumber++ ) {
            for ( int nextDigit = 0; nextDigit < numDigits; nextDigit++ ) {
                int digitOffset = numDigits - nextDigit - 1;
                byte digit = digits[digitOffset];
                if ( digit == 6 ) {
                    digit = 0;
                } else {
                    digit++;
                }
                digits[digitOffset] = digit;
                if ( digit != 0 ) {
                    break;
                }
            }
        }

        int number = 0;

        for ( int nextDigit = 0; nextDigit < numDigits; nextDigit++ ) {
            number *= 10;
            number += DIGIT_MAPPING[ digits[nextDigit] ];
        }

        return number;
    }

    public static int findNumberConvert(int index) {
        String numberText = Integer.toString(index, 7);

        int number = 0;

        int numDigits = numberText.length();
        for ( int nextDigit = 0; nextDigit < numDigits; nextDigit++ ) {
            number *= 10;
            number += DIGIT_MAPPING[ numberText.charAt(nextDigit) - '0' ];
        }

        return number;
    }

    public static int findNumberDirect(int index) {
        int base = 7;

        byte[] digits = new byte[10];

        for ( int nextDigit = 0; index > 0; nextDigit++ ) {
            int nextRem = index % base;
            index = index / base;

            digits[nextDigit] = (byte) nextRem;
        }

        int number = 0;

        int numDigits = digits.length;
        for ( int nextDigit = 0; nextDigit < numDigits; nextDigit++ ) {
            number *= 10;
            number += DIGIT_MAPPING[ digits[ numDigits - nextDigit - 1 ] ];
        }

        return number;
    }
}

Upvotes: 2

tucuxi
tucuxi

Reputation: 17955

Following on MrSmith's answer, you can observe that, once you translate the expected output, 180999565, using the lookup tables and function below ...

                       // 0, 1, 2, 3, 4, 5, 6, 7, 8, 9
static int b10_to_b7[] = {0, 1, 2,-1,-1, 3, 4,-1, 5, 6}; // avoids 3, 4 and 7
static int b7_to_b10[] = {0, 1, 2, 5, 6, 8, 9,-1,-1,-1}; // undoes above permutation 

// extracts digits in base b, replacing them according to p
static long permuteDigits(long input, long base, int[] p) {
    long output = 0;
    long shift = 1;
    while (input > 0) {
       int digit = (int)(input % base);
       input /= base;
       output += p[digit] * shift;
       shift *= base;
    }
    return output;
}

// converts input digits in base ibase into obase
static long changeBase(long input, long ibase, long obase) {
    long output = 0;
    long shift = 1;
    while (input > 0) {
       int digit = (int)(input % ibase);
       input /= ibase;
       output += digit * shift;
       shift *= obase;
    }
    return output;
}


permuteDigits(180999565L, 10, b10_to_b7); // 150666343
changeBase(150666343L, 10, 7);            // 10000000  !!

So, using the same functions, you can find your result directly by reversing the operations:

changeBase(10000000L, 7, 10);             // 150666343
permuteDigits(150666343L, 10, b7_to_b10); // 180999565

And this takes only O(log(n)) operations, which is much quicker than any iterative approach. Your original code was O(n log(n))

Note that you can optimize my code, by merging both functions. But that will win you only about a x2 speed improvement: nothing as drastic as the x10M improvement from avoiding iteration.

Upvotes: 0

MrSmith42
MrSmith42

Reputation: 10161

You should think about the problem in a different way.

"Generate all numbers below numberQuantity that do not contain 3, 4, 7"

Or with other words: "Create numbers only containing 0,1,2,5,6,8,9 as digits".

These are 7 different digits.

So one approach could be increment a counter starting with 1 and convert it to a represenattion base 7 where you map each digit like this:

0->0
1->1
2->2
3->5
4->6
5->8
6->9

Upvotes: 3

Francesc Recio
Francesc Recio

Reputation: 2235

Every time you do an indexOf, you are looping inside a String to see if it contains a character.

Then you are doing 3 "indexOf", so at each turn of loop your processing in that part is multiplied by 3.

We can consider putting these 3 checks together in a single loop, and thus improve performance, for example by creating a method that emulates "indexOf" but where you pass the 3 validations:

import java.sql.Date;
import java.text.DecimalFormat;
import java.text.SimpleDateFormat;

public class FirstChallange {

    private static int removedNumbers=0;
    private static int numberQuantity;
    private static int lastNumberFound; 
    private static int cont;

    private static char three = '3';
    private static char four = '4';
    private static char seven = '7';

    private static SimpleDateFormat sf = new SimpleDateFormat("ss.SSS");
    private static DecimalFormat dfmt = new DecimalFormat("0");

    public static void main(String[] args) 
    { 

        long inicio = System.currentTimeMillis();  

        numberQuantity = 10000000;

        String str; 

        for (cont = 1; removedNumbers <= numberQuantity; cont++) {      

            //Faster way to create String
            str = "" + cont;

            if(!contains(str.toCharArray())) {
                removedNumbers++; 
                if(removedNumbers == numberQuantity){ // can not find numbers (3, 4, 7)
                    lastNumberFound = cont; 
                }
            } 
        }

        System.out.println(dfmt.format(lastNumberFound)); 

        long fim  = System.currentTimeMillis();
        System.out.println(sf.format(new Date(fim - inicio)));  

    }

    public static boolean contains(char[] chars) {

        for (int i = 0; i < chars.length; i++) {
             if(chars[i] == three) {
                return true;
            } else if (chars[i] == four) {
                return true;
            } else if(chars[i] == seven) {
                return true;
            }
        }
        return false;
    }

}

The contains method I've created here, checks if the character we don't want is present, and if it is, we don't check anymore and we return true.

I've reduced the performance by 3 or 4 seconds.

Upvotes: 0

Related Questions