Reputation: 19444
How would you store/represent a String into a long? Then jam it into an 8 byte array?
Things I've tried/working with
String eidString = "Awesome!";
ByteBuffer buf = ByteBuffer.allocate(8);
CharBuffer cbuf = buf.asCharBuffer();
cbuf.put(eidString);
byte[] eid = ByteBuffer.allocate(8).putLong(cbuf ??);
Attempt 2
Long l = Long.valueOf("Awesome!");
byte[] eid = ByteBuffer.allocate(8).putLong(l).array();
long p = ByteBuffer.wrap(eid).getLong();
System.out.println(p);
Attemp 3
String input = "hello long world";
byte[] bytes = input.getBytes();
LongBuffer tmpBuf = ByteBuffer.wrap(bytes).asLongBuffer();
long[] lArr = new long[tmpBuf.remaining()];
for (int i = 0; i < lArr.length; i++)
lArr[i] = tmpBuf.get();
System.out.println(input);
System.out.println(Arrays.toString(lArr));
// store longs...
// ...load longs
long[] longs = { 7522537965568945263L, 7955362964116237412L };
byte[] inputBytes = new byte[longs.length * 8];
ByteBuffer bbuf = ByteBuffer.wrap(inputBytes);
for (long l : longs)
bbuf.putLong(l);
System.out.println(new String(inputBytes));
Upvotes: 2
Views: 4226
Reputation: 5488
If you accept the limited character set of:
a,b,c,d,e,f,g,h,i,j,k,l,m,n,o,p,q,r,s,t,u,v,w,x,y,z,
A,B,C,D,E,F,G,H,I,J,K,L,M,N,O,P,Q,R,S,T,U,V,W,X,Y,Z,
0,1,2,3,4,5,6,7,8,9, , <-- space character
then you will have 63 symbols in your reduced alphabet R
. Each symbol can be mapped to a 6 bit representation (64 combinations unique combination of bits). There is an implicit 64th symbol which is the empty symbol used to mark the termination of the string, which should be represented by 0x00
. This leaves us with an alphabet R
that maps to 6 bits as a bijection.
A long
has 64 bits of information. Since 64/6 = 10
, this means that we can store a string with a length of up to 10 characters from alphabet R
in a Java long
variable.
It is worth noting that even though R
is a reduced alphabet and we have a string length limit of 10, we can still express meaningful English phrases, converted them to a long
, and back again!
Java Code (64 bit mapping):
public static long stringToLong(String s) {
if(s.length() >10) { throw new IllegalArgumentException("String is too long: "+s); }
long out = 0L;
for(int i=0; i<s.length(); ++i) {
long m = reducedMapping(s.codePointAt(i));
if (m==-1) { throw new IllegalArgumentException("Unmapped Character in String: "+s); }
m <<= ((9-i)*6)+4;
out |= m;
}
return out;
}
public static String longToString(long l) {
String out = "";
long m = 0xFC00000000000000L;
for(int i=0; i<10; ++i,m>>>=6) {
int x =(int)( (l&m) >>> (((9-i)*6)+4));
if(x==0) { break; }
out += mapping[x];
}
return out;
}
public static long reducedMapping(int x) {
long out=-1;
if(x >= 97 && x <= 122) { out = (long)(x-96); } // 'a' => 1 : 0x01
else if(x >= 65 && x <= 90) { out = (long)(x-37); } // 'A' => 27 : 0x1B
else if(x >= 48 && x <= 57) { out = (long)(x-+5); } // '0' => 53 : 0x35
else if(x == 32 ) { out = 63L; } // ' ' => 63 : 0x3F
return out;
}
public static char[] mapping = {
'\n', //<-- unused/empty character
'a','b','c','d','e','f','g','h','i','j','k','l','m',
'n','o','p','q','r','s','t','u','v','w','x','y','z',
'A','B','C','D','E','F','G','H','I','J','K','L','M',
'N','O','P','Q','R','S','T','U','V','W','X','Y','Z',
'0','1','2','3','4','5','6','7','8','9',' '
};
Java Code (32 bit mapping):
public static long stringToLong32(String s) {
if(s.length() >12) { throw new IllegalArgumentException("String is too long: "+s); }
long out = 0L;
for(int i=0; i<s.length(); ++i) {
long m = reducedMapping32(s.codePointAt(i));
if (m==-1) { throw new IllegalArgumentException("Unmapped Character in String: "+s); }
m <<= ((11-i)*5)+4;
out |= m;
}
return out;
}
public static String longToString32(long l) {
String out = "";
long m = 0xF800000000000000L;
for(int i=0; i<12; ++i,m>>>=5) {
int x =(int)( (l&m) >>> (((11-i)*5)+4));
if(x==0) { break; }
out += mapping32[x];
}
return out;
}
public static long reducedMapping32(int x) {
long out=-1;
if(x >= 97 && x <= 122) { out = (long)(x-96); } // 'a' => 1 : 0x01
else if(x >= 65 && x <= 90) { out = (long)(x-64); } // 'A' => 1 : 0x01
else if(x >= 32 && x <= 34) { out = (long)(x-5); } // ' ','!','"' => 27,28,29
else if(x == 44 ) { out = 30L; } // ',' => 30 : 0x1E
else if(x == 46 ) { out = 31L; } // '.' => 31 : 0x1F
return out;
}
public static char[] mapping32 = {
'\n', //<-- unused/empty character
'a','b','c','d','e','f','g','h','i','j','k','l','m',
'n','o','p','q','r','s','t','u','v','w','x','y','z',
' ','!','"',',','.'
};
}
EDIT:
Use this class for a more generalized conversions. It allows Strings
of any non empty set of characters to be uniquely mapped uniquely to a long
and converted back to the original String
again. Simply define any character set (char[]
) through the constructor and use the resulting object to convert back and forth using str2long
& long2str
.
Java StringAndLongConverter
Class:
import java.util.Arrays;
import java.util.regex.Pattern;
public class StringAndLongConveter {
/* .,.,.,.,.,.,.,.,.,.,.,.,.,.,.,.,., */
/* String <--> long ID Conversion */
/* `'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`'`' */
/* --<[ Class Members ]>-- */
// Don't re-arrange, order-dependent initializations
private final char[] CHAR_MAP;
public final int NUM_ACTUAL_CHARS;
public final int NUM_MAPPED_CHARS;
public final int BIT_COUNT;
public final int MAX_NUM_CHARS;
public final long ROLLING_MASK;
public final long FORMAT_MASK;
public final long MIN_VALUE;
public final long MAX_VALUE;
public final Pattern REGEX_CHAR_VALIDATOR;
public StringAndLongConveter(char[] chars) {
if(chars == null ) { throw new IllegalArgumentException("Cannot Pass in null reference"); }
if(chars.length==0) { throw new IllegalArgumentException("Cannot Pass in empty set" ); }
CHAR_MAP = setCharMap(chars);
NUM_ACTUAL_CHARS = CHAR_MAP.length;
NUM_MAPPED_CHARS = NUM_ACTUAL_CHARS+1;
BIT_COUNT = calcMinBitsNeeded();
MAX_NUM_CHARS = calcMaxPossibleChars();
ROLLING_MASK = calcRollingMask();
FORMAT_MASK = calcFormatMask();
MIN_VALUE = calcIDMinVal();
MAX_VALUE = calcIDMaxVal();
REGEX_CHAR_VALIDATOR = createRegExValidator();
}
/* --<[ Dynamic Initialization Calculation Helper Methods ]>-- */
//Remove duplicates
private final char[] setCharMap(final char[] chars) {
char[] tmp = new char[chars.length];
int dupes = 0;
for(int i=0; i<chars.length; ++i) {
boolean dupeFound = false;
for(int j=0; !dupeFound && j<i; ++j) {
if(chars[i]==chars[j]) {
++dupes;
dupeFound = true;
}
}
if(!dupeFound) { tmp[i-dupes] = chars[i]; }
}
char[] out = new char[chars.length-dupes];
if(dupes==0) { out = chars; }
else {
for(int i=0; i<out.length; ++i) out[i] = tmp[i];
}
return out;
}
// calculate minimum bits necessary to encode characters uniquely
private final int calcMinBitsNeeded() {
if(NUM_MAPPED_CHARS==0) { return 0; }
int val,tmp,log;
val = NUM_MAPPED_CHARS;
tmp = Integer.highestOneBit(val); // returns only the highest set bit
tmp = tmp | (tmp-1); // propagate left bits
log = Integer.bitCount(tmp); // count bits (logarithm base 2)
return ((val&(val-1))==0) ? log-1 : log;
//return one less then bit count if even power of two
}
//Calculate maximum number of characters that can be encoded in long
private final int calcMaxPossibleChars() {
return Long.SIZE/BIT_COUNT;
}
//Calculate rolling mask for str <--> long conversion loops
private final long calcRollingMask() {
long mask = 0x0000000000000001L;
for(int i=1; i<BIT_COUNT; ++i) { mask |= mask << 1; }
for(int i=1; i<MAX_NUM_CHARS; ++i) { mask <<= BIT_COUNT; }
return mask;
}
//Calculate format mask for long input format validation
private final long calcFormatMask() {
//propagate lest significant set bit in rolling mask & negate resulting value
return ~(ROLLING_MASK | (ROLLING_MASK-1));
}
//Calculate min value of long encoding
//doubles as format specification for unused bits
private final long calcIDMinVal() {
return 0xAAAAAAAAAAAAAAAAL & FORMAT_MASK;
}
//Calculate max value of long encoding
private final long calcIDMaxVal(){
char maxChar = CHAR_MAP[CHAR_MAP.length-1];
char[] maxCharArr = new char[MAX_NUM_CHARS];
Arrays.fill(maxCharArr, maxChar);
return str2long(new String(maxCharArr));
}
//Dynamically create RegEx validation string for invalid characters
private final Pattern createRegExValidator() {
return Pattern.compile("^["+Pattern.quote(new String(CHAR_MAP))+"]+?$");
}
/* --<[ Internal Helper Methods ]>-- */
private static boolean ulongLessThen(long lh, long rh) {
return (((lh ^ rh) >> 63) == 0) ? lh < rh : (0x8000000000000000L & lh)==0;
}
private long charMapping(final char c) {
for(int i=0; i<CHAR_MAP.length; ++i)
if(CHAR_MAP[i]==c)
return i+1;
return -1;
}
/* --<[ String <--> long Conversion Methods ]>-- */
public final String long2str(final long n) {
String out = "";
if (ulongLessThen(n,MIN_VALUE) || ulongLessThen(MAX_VALUE,n)) { throw new IllegalArgumentException("Long Outside of Formatted Range: "+Long.toHexString(n)); }
if ((FORMAT_MASK & n) != MIN_VALUE) { throw new IllegalArgumentException("Improperly Formatted long"); }
long m = ROLLING_MASK;
for(int i=0; i<MAX_NUM_CHARS; ++i,m>>>=BIT_COUNT) {
int x =(int)( (n&m) >>> ((MAX_NUM_CHARS-i-1)*BIT_COUNT));//10|10 0111
if(x >= NUM_MAPPED_CHARS) { throw new IllegalArgumentException("Invalid Formatted bit mapping: \nlong="+Long.toHexString(n)+"\n masked="+Long.toHexString(n&m)+"\n i="+i+" x="+x); }
if(x==0) { break; }
out += CHAR_MAP[x-1];
}
return out;
}
public final long str2long(String str) {
if(str.length() > MAX_NUM_CHARS) { throw new IllegalArgumentException("String is too long: "+str); }
long out = MIN_VALUE;
for(int i=0; i<str.length(); ++i) {
long m = charMapping(str.charAt(i));
if (m==-1) { throw new IllegalArgumentException("Unmapped Character in String: "+str); }
m <<= ((MAX_NUM_CHARS-i-1)*BIT_COUNT);
out += m; // += is more destructive then |= allowing errors to be more readily detected
}
return out;
}
public final boolean isValidString(String str) {
return str != null && !str.equals("") //null or empty String
&& str.length() <= MAX_NUM_CHARS //too long
&& REGEX_CHAR_VALIDATOR.matcher(str).matches(); //only valid chars in string
}
public final char[] getMappedChars() { return Arrays.copyOf(CHAR_MAP,CHAR_MAP.length); }
}
Have fun encoding & decoding String
s & long
s
Upvotes: 2
Reputation: 794
You don't need to do that. String has a method called getBytes(). It do the convert for you directly. Call the following method with parameter "Hallelujah"
public static void strToLong(String s) throws IOException
{
byte[] bArr = s.getBytes();
for( byte b : bArr)
{
System.out.print(" " + b);
}
System.out.println();
System.out.write(bArr);
}
The result is
72 97 108 108 101 108 117 106 97 104
Hallelujah
Upvotes: 0
Reputation: 533530
You need to encode your string as a number and reverse it.
Once you have done this you can encode the symbols in the same way you would parse a 10 or 36 base number. To turn back into a String you can do the reverse (like printing a base 10 or 36 number)
What is the range of possible characters/symbols? (you need to include a terminating symbol as the Strings can vary in length)
Upvotes: 4
Reputation: 12123
To parse a String into a Long, can use the Long wrapper class.
String myString = "1500000";
Long myLong = Long.parseLong(myString);
To stuff it into an 8-byte array...
long value = myLong.longValue();
byte[] bytes = new byte[8];
for (int i = 0; i < bytes.length; i++) {
long mask = 0xFF00000000000000 >> (i * 8);
bytes[i] = (byte) (value & mask);
}
This example is big endian.
If you're encoding a String into a long, then you can do something like:
String myString = "HELLO";
long toLong = 0;
for (int i = 0; i < myString.length(); i++) {
long c = (long) myString.charAt(i);
int shift = (myString.length() - 1 - i) * 8;
toLong += c << shift;
}
This hasn't been tested. There might be a few things wrong with it.
Upvotes: 1