Reputation: 511
I have a pipeline where I stream data from Python and connect to the stream in a Java application. The data records are matrices of complex numbers. I've now learned that json.dumps() can't deal with Python's complex type.
For the moment I've converted the complex values to strings and put them in a dictionary like this:
for entry in range(len(data_array)):
    # str() each row of complex values so that json.dumps() accepts it
    data_as_string = [str(i) for i in data_array[entry]["DATA"].tolist()]
    send({'data': data_as_string,
          'coords': data_array[entry]["UVW"].tolist()})
and send it to the pipeline. But this requires extensive and expensive custom deserialization in Java, which increases the running time of the pipeline by a lot. Currently I'm doing the deserialization like this:
JSONObject jsonObject = new JSONObject(string);
try {
data= jsonObject.getString("data");
uvw = jsonObject.getString("coords");
} catch (JSONException ex) {
ex.printStackTrace();
}
And then I'm doing a lot of data.replace(string1, string2) calls to remove some of the characters added by the serialization, and then looping through the matrix to convert every number into a Java Complex type.
My Java deserialization code looks like this:
data = data.replace("(","");
data = data.replace(")","");
data = data.replace("\"","");
data = data.replace("],[","¦");
data = data.replace("[","");
data = data.replace("]","");
uvw = uvw.replace("[","");
uvw = uvw.replace("]","");
String[] frequencyArrays = data.split("¦");
Complex[][] tempData = new Complex[48][4];
for(int i=0;i< frequencyArrays.length;i++){
String[] complexNumbersOfAFrequency = frequencyArrays[i].split(", ");
for(int j =0;j<complexNumbersOfAFrequency.length;j++){
boolean realPartNegative = false;
Complex c;
if(complexNumbersOfAFrequency[j].startsWith("-")){
realPartNegative = true;
// Get rid of the leading minus sign to be able to split the real and imaginary parts
complexNumbersOfAFrequency[j] =complexNumbersOfAFrequency[j].replaceFirst("-","");
}
if(complexNumbersOfAFrequency[j].contains("+")){
String[] realAndImaginary = complexNumbersOfAFrequency[j].split("\\+");
try {
double real = Double.parseDouble(realAndImaginary[0]);
double imag = Double.parseDouble(realAndImaginary[1].replace("j",""));
if(realPartNegative){
c = new Complex(-real,imag);
} else {
c = new Complex(real,imag);
}
}catch(IndexOutOfBoundsException e) {
//System.out.println("Wrongly formatted number, setting it to 0");
c = new Complex(0,0);
}
catch (NumberFormatException e){
System.out.println("Wrongly formatted number, setting it to 0");
c = new Complex(0,0);
}
} else {
String[] realAndImaginary = complexNumbersOfAFrequency[j].split("-");
try {
double real = Double.parseDouble(realAndImaginary[0]);
double imag = Double.parseDouble(realAndImaginary[1].replace("j", "").replace("e", ""));
if (realPartNegative) {
c = new Complex(-real, -imag);
} else {
c = new Complex(real, -imag);
}
}
catch(IndexOutOfBoundsException e){
System.out.println("Not correctly formatted: ");
for(int temp = 0;temp<realAndImaginary.length;temp++){
System.out.println(realAndImaginary[temp]);
}
System.out.println("Setting it to (0,0)");
c = new Complex(0,0);
}
catch (NumberFormatException e){
c = new Complex(0,0);
}
}
tempData[i][j] = c;
}
}
Now my question is whether there is a way to either
1) Deserialize the dictionary in Java without expensive string manipulation and looping through the matrices for each record, or
2) Do a better job of serializing the data in Python so that the deserialization in Java becomes simpler.
Any hints are appreciated.
Edit: The JSON looks like this:
{"data": ["[(1 + 2j), (3 + 4j), ...]", "[(5 + 6j), ...]", ...],
"coords": [1,2,3]}
Edit: For the coordinates I can do the deserialization in Java pretty easily:
uvw = uvw.replace("[","");
uvw = uvw.replace("]","");
String[] coords = uvw.split(",");
And then convert the strings in coords with Double.parseDouble(). However, for the data string this is far more complicated, because the string is full of characters that have to be removed to get at the actual numbers and put them in the right place in the Complex[][] I want to convert it to.
Upvotes: 0
Views: 519
Reputation: 44414
You are over-using JsonObject.getString, by using it to retrieve non-string data.
Let’s start with the coords property, since it’s a simpler case. [1,2,3] is not a string. It’s an array of numbers. Therefore, you should retrieve it as an array:
JsonArray coords = jsonObject.getJsonArray("coords");
int count = coords.size();
double[] uvw = new double[count];
for (int i = 0; i < count; i++) {
uvw[i] = coords.getJsonNumber(i).doubleValue();
}
The other property, data, is also an array, but with string elements:
JsonArray data = jsonObject.getJsonArray("data");
int count = data.size();
for (int i = 0; i < count; i++) {
String complexValuesStr = data.getString(i);
// ...
}
As for parsing out the complex numbers, I wouldn’t make all those String.replace calls. Instead, you can look for each complex value with a regular expression matcher:
Pattern complexNumberPattern = Pattern.compile(
"\\(\\s*" + // opening parenthesis
"(-?[0-9.]+)" + // group 1: match real part
"\\s*([-+])\\s*" + // group 2: match sign
"([0-9.]+)j" + // group 3: match imaginary part
"\\s*\\)"); // closing parenthesis
Matcher matcher = complexNumberPattern.matcher("");
JsonArray data = jsonObject.getJsonArray("data");
int count = data.size();
List<List<Complex>> allFrequencyValues = new ArrayList<>(count);
for (int i = 0; i < count; i++) {
String complexValuesStr = data.getString(i);
List<Complex> singleFrequencyValues = new ArrayList<>();
matcher.reset(complexValuesStr);
while (matcher.find()) {
double real = Double.parseDouble(matcher.group(1));
boolean positive = matcher.group(2).equals("+");
double imaginary = Double.parseDouble(matcher.group(3));
Complex value = new Complex(real, positive ? imaginary : -imaginary);
singleFrequencyValues.add(value);
}
allFrequencyValues.add(singleFrequencyValues);
}
You should not catch IndexOutOfBoundsException or NumberFormatException. Those indicate the input was invalid. You should not treat invalid input like it’s zero; it means the sender made an error, and you should make sure to let them know it. An exception is a good way to do that.
I have made the assumption that both terms are always present in each complex expression. For instance, 2i would appear as 0 + 2j, not just 2j. And a real number like 5 would appear as 5 + 0j. If that is not a safe assumption, the parsing gets more complicated.
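Whether that assumption holds depends on how the sender formats the values. Here is a quick check, assuming the strings come from Python's built-in str() applied to complex values, as with the str(i) call in the question (the sample values are made up for illustration):

```python
# Inspect what Python's built-in str() produces for a few complex values.
samples = [
    complex(1, 2),      # both parts non-zero
    complex(5, 0),      # purely real
    complex(0, 2),      # purely imaginary
    complex(1.5, -2.5), # negative imaginary part
    complex(1e-05, 2),  # small magnitude
]

formatted = [str(z) for z in samples]
for text in formatted:
    print(text)
# prints:
# (1+2j)
# (5+0j)
# 2j
# (1.5-2.5j)
# (1e-05+2j)
```

So a purely real value does keep its 0j term, but a value with a zero real part is printed as a bare 2j without parentheses, and small magnitudes use exponent notation. Neither of those would be matched by the pattern above, which requires parentheses and only accepts digits and dots in each part.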
Since you are concerned with performance, I would try the above; if the use of a regular expression makes the program too slow, you can always look for the parentheses and terms yourself, by stepping through the string. It will be more work but may provide a speed increase.
Upvotes: 1
Reputation: 88747
If I understand you correctly, your matrix would consist of arrays of complex numbers which in turn would contain a real number and an imaginary one.
If so, your data could look like this:
[[{"r":1,"j":2},{"r":3,"j":4}, ...],[{"r":5,"j":6}, ...]]
That means that you have a JSON array which contains arrays that contain objects. Those objects have 2 properties: r defining the value of the real part and j the value of the imaginary part.
Parsing that in Java should be straightforward: with a mapper such as Jackson or Gson you'd just parse it into something like ComplexNumber[][], where ComplexNumber could look like this (simplified):
public class ComplexNumber {
public double r;
public double j;
}
Of course there may already be existing classes for complex numbers, so you might want to use those. Additionally, you might have to deserialize that manually (either because the target classes don't make it easy for the mappers or because you can't or don't want to use a mapper), but in that case it would just be a matter of iterating over the JSONArray elements and extracting r and j from the JSONObjects.
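On the Python side, producing that layout could look roughly like the sketch below. It assumes each row is a plain list of Python complex values (which is what numpy's tolist() gives you); the helper names complex_to_obj and encode_matrix and the sample matrix are made up for illustration and are not part of any library.

```python
import json


def complex_to_obj(z):
    """Map one complex value to an object with r (real) and j (imaginary)."""
    return {"r": z.real, "j": z.imag}


def encode_matrix(rows):
    """Serialize a list of rows of complex values as nested JSON arrays."""
    return json.dumps([[complex_to_obj(z) for z in row] for row in rows])


# Hypothetical 2x2 sample; in the question this would be
# data_array[entry]["DATA"].tolist()
matrix = [[1 + 2j, 3 + 4j], [5 - 6j, 7 + 0j]]
payload = encode_matrix(matrix)
print(payload)
```

Because the numbers stay real JSON numbers, the sign and any exponent are handled by the JSON parser on the Java side and no string clean-up is needed.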
Upvotes: 1