Ilja

Reputation: 46509

Convert set of ascii characters back to string

I currently have a situation where I am converting a string to the ASCII codes of its characters:

        String str = "are";  // or anything else

        StringBuilder sb = new StringBuilder();
        for (char c : str.toCharArray())
            sb.append((int)c);

        BigInteger mInt = new BigInteger(sb.toString());
        System.out.println(mInt);

where the output (in this case) is 97114101. I am struggling to find a way to reverse this, i.e. to convert the string of ASCII codes back to a string such as "are".

Upvotes: 5

Views: 2672

Answers (5)

Thomas

Reputation: 17422

As others have pointed out, this is not doable in general. However, as others have also argued, it is doable if you make certain limiting assumptions. In addition to the ones presented already, another assumption could be that the strings you're converting are all English words.

Then you would know that each character takes up either 2 or 3 digits in the integer. The following code exemplifies the use of a function that checks whether 2 digits are OK or whether you have to consider 3 digits:

public String convertBack(BigInteger bigInteger) {
    StringBuilder buffer = new StringBuilder();

    String digitString = bigInteger.toString();

    for (int to, from = 0; from + 2 <= digitString.length(); from = to) {
        // minimally extract two digits at a time
        to = from + 2;
        char c = (char) Integer.parseInt(digitString.substring(from, to));

        // if two digits are not enough, try 3
        if (!isLegalCharacter(c) && to + 1 <= digitString.length()) {
            to++;
            c = (char) Integer.parseInt(digitString.substring(from, to));
        }

        if (isLegalCharacter(c)) {
            buffer.append(c);
        } else {
            // error, can't convert
            break;
        }
    }

    return buffer.toString();
}

private boolean isLegalCharacter(char c) {
    return c == '\'' || Character.isLetter(c);
}

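This particular isLegalCharacter method is not very strong, but you can adapt it to your needs. For instance, it was not written with accented characters in mind, such as those in the word "naïveté", and any character whose code has more than three digits cannot be decoded at all.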

But if you know that you will never run into such cases, the above approach might work for you.

Upvotes: 1

Sergey Kalinichenko

Reputation: 726809

You cannot do it with decimal numbers, because the number of digits in each character's representation varies. Because of this, you wouldn't be able to distinguish the sequence 112 5 from 11 25 or 1 125.

You could force each character to occupy exactly three digits, however. In this case, you would be able to restore the characters by repeatedly dividing by 1000 and taking the remainder:

for (char c : str.toCharArray()) {
    String numStr = String.valueOf((int)c);
    while (numStr.length() != 3) numStr = "0"+numStr;
    sb.append(numStr);
}
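
The reverse direction is only described in words above, so here is a minimal sketch of it. The class name FixedWidthDecimal and the method names are made up for illustration, and the sketch assumes every character code is below 1000. Note that a leading NUL character (code 0) would be lost, because BigInteger drops leading zeros.

```java
import java.math.BigInteger;

public class FixedWidthDecimal {
    private static final BigInteger THOUSAND = BigInteger.valueOf(1000);

    // Encode each char as exactly three decimal digits (assumes codes 0..999).
    public static BigInteger encode(String str) {
        if (str.isEmpty()) {
            return BigInteger.ZERO;
        }
        StringBuilder sb = new StringBuilder();
        for (char c : str.toCharArray()) {
            if (c > 999) {
                throw new IllegalArgumentException("code does not fit in 3 digits: " + (int) c);
            }
            sb.append(String.format("%03d", (int) c));
        }
        return new BigInteger(sb.toString());
    }

    // Decode by peeling off the last three digits (remainder mod 1000) each round.
    public static String decode(BigInteger value) {
        StringBuilder sb = new StringBuilder();
        while (value.signum() > 0) {
            BigInteger[] quotientAndRemainder = value.divideAndRemainder(THOUSAND);
            sb.append((char) quotientAndRemainder[1].intValue());
            value = quotientAndRemainder[0];
        }
        // Characters were recovered last-to-first, so reverse them.
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        BigInteger encoded = encode("are");
        System.out.println(encoded);
        System.out.println(decode(encoded));
    }
}
```

The leading zero of the first group (097 for 'a') disappears when the digits become a BigInteger, but the division loop still recovers it, because the final remainder is simply 97.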

If you use only the ASCII section of the Unicode code points, this is somewhat wasteful, because the values that you need are for the most part two-digit. If you switch to hex, all ASCII code points would fit in two digits:

for (char c : str.toCharArray()) {
    String numStr = Integer.toString(c, 16);
    if (numStr.length() == 1) numStr = "0"+numStr;
    sb.append(numStr);
}
BigInteger mInt = new BigInteger(sb.toString(), 16);

Now you can use division by 256 instead of 1000.
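
As a rough sketch of that decoding step, assuming every character code is below 256 (the class name HexPairCodec and its method names are hypothetical, chosen only for this example):

```java
import java.math.BigInteger;

public class HexPairCodec {
    private static final BigInteger BASE = BigInteger.valueOf(256);

    // Encode each char as exactly two hex digits (assumes codes 0..255).
    public static BigInteger encode(String str) {
        if (str.isEmpty()) {
            return BigInteger.ZERO;
        }
        StringBuilder sb = new StringBuilder();
        for (char c : str.toCharArray()) {
            if (c > 0xFF) {
                throw new IllegalArgumentException("code does not fit in 2 hex digits: " + (int) c);
            }
            sb.append(String.format("%02x", (int) c));
        }
        return new BigInteger(sb.toString(), 16);
    }

    // Decode by taking the remainder mod 256 (two hex digits) each round.
    public static String decode(BigInteger value) {
        StringBuilder sb = new StringBuilder();
        while (value.signum() > 0) {
            BigInteger[] quotientAndRemainder = value.divideAndRemainder(BASE);
            sb.append((char) quotientAndRemainder[1].intValue());
            value = quotientAndRemainder[0];
        }
        // Characters were recovered last-to-first, so reverse them.
        return sb.reverse().toString();
    }

    public static void main(String[] args) {
        BigInteger encoded = encode("are");
        System.out.println(encoded.toString(16));
        System.out.println(decode(encoded));
    }
}
```

As with the three-digit variant, a leading character with code 0 cannot be recovered, since it contributes nothing to the numeric value.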

Upvotes: 3

Harry

Reputation: 1472

This could be doable if all the characters you use in the String have two-digit ASCII codes. For example, "ARE" would give 658269, and you would know to treat it two digits at a time to reverse it. The problem here is that you don't know whether each code is two or three digits long.

However, if the String contains only letters [a-zA-Z], you could check whether the next two digits lie in the range [65-90] or [97-99]; otherwise take three digits, which should lie in the range [100-122].
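
A minimal sketch of that range check, assuming the input really was produced from letters [a-zA-Z] only (the class name LetterRangeDecoder and the method name decode are invented for illustration):

```java
public class LetterRangeDecoder {
    // Decode a digit string made of concatenated codes of [a-zA-Z] only.
    public static String decode(String digits) {
        StringBuilder sb = new StringBuilder();
        int i = 0;
        while (i < digits.length()) {
            if (i + 2 > digits.length()) {
                throw new IllegalArgumentException("dangling digit at index " + i);
            }
            int two = Integer.parseInt(digits.substring(i, i + 2));
            // A-Z are 65..90 and a-c are 97..99: all two-digit codes.
            if ((two >= 65 && two <= 90) || (two >= 97 && two <= 99)) {
                sb.append((char) two);
                i += 2;
                continue;
            }
            if (i + 3 > digits.length()) {
                throw new IllegalArgumentException("incomplete code at index " + i);
            }
            int three = Integer.parseInt(digits.substring(i, i + 3));
            // d-z are 100..122: the only three-digit codes allowed here.
            if (three < 100 || three > 122) {
                throw new IllegalArgumentException("not a letter code at index " + i);
            }
            sb.append((char) three);
            i += 3;
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(decode("97114101"));
        System.out.println(decode("658269"));
    }
}
```

This works because no three-digit letter code (100 to 122) begins with a two-digit prefix that is itself a letter code: the prefixes are 10, 11 and 12.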

But it goes without saying that there are better ways of doing this.

Upvotes: 2

Suresh Atta

Reputation: 122018

The answer is a big no: you cannot get it back with your existing approach.

Instead, you could keep the codes in an integer array (if possible). You may get a better solution if you explain why you are actually doing this.

Upvotes: 2

Tim B

Reputation: 41208

The simple answer is that you can't, as you have lost data. You have no way of knowing how many digits each character had.

You need some sort of separator between the numbers.
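
One possible way to do that, sketched under the assumption that a comma-separated String is an acceptable replacement for the BigInteger (the class name SeparatedCodes and its methods are hypothetical):

```java
public class SeparatedCodes {
    // Encode each char as its decimal code, joined by commas.
    public static String encode(String str) {
        StringBuilder sb = new StringBuilder();
        for (char c : str.toCharArray()) {
            if (sb.length() > 0) {
                sb.append(',');
            }
            sb.append((int) c);
        }
        return sb.toString();
    }

    // Decode by splitting on the separator and parsing each code.
    public static String decode(String encoded) {
        if (encoded.isEmpty()) {
            return "";
        }
        StringBuilder sb = new StringBuilder();
        for (String part : encoded.split(",")) {
            sb.append((char) Integer.parseInt(part));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String encoded = encode("are");
        System.out.println(encoded);
        System.out.println(decode(encoded));
    }
}
```

The result is no longer a single number, which is the price of keeping the boundaries between the codes.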

Upvotes: 2
