Reputation: 300
This program uses sockets to transfer highly redundant 2D byte arrays (image-like). While the transfer rate is comparatively high (10 Mbps), the arrays are also highly redundant (e.g., each row may contain runs of consecutive similar values). I have tried zlib and lz4 and the results were promising, but I am still looking for a better compression method, and please remember that it should be relatively fast, as lz4 is. Any suggestions?
Upvotes: 6
Views: 695
Reputation: 1260
You could create your own: if the data in the rows is similar, you can create a value/index map, substantially reducing the size. Something like this:
Original file:
row 1: 1212, 34,45,1212,45,34,56,45,56
row 2: 34,45,1212,78,54,87,....
You could create a sorted list of unique values, then use each value's index in its place:
34,45,54,56,78,87,1212
row 1: 6,0,1,6,1,0,3,1,3
This can potentially save you over 30% or more on data transfer, but it depends on how redundant the data is.
UPDATE
Here is a simple implementation:
std::set<int> uniqueValues;
DataTable my2dData; // assuming a 2D vector implementation
std::string indexMap;
std::string fileCompressed = "";

// return the position of value within the sorted set, or -1 if absent
int Find(int value){
    int i = 0;
    for(int v : uniqueValues){
        if(v == value) return i;
        ++i;
    }
    return -1;
}

// create the list of unique values
for(size_t i = 0; i < my2dData.size(); ++i){
    for(size_t j = 0; j < my2dData[i].size(); ++j){
        uniqueValues.insert(my2dData[i][j]);
    }
}

// create the indexes, one comma-separated row per line
for(size_t i = 0; i < my2dData.size(); ++i){
    std::string tmpRow = "";
    for(size_t j = 0; j < my2dData[i].size(); ++j){
        if(tmpRow.empty()){
            tmpRow = std::to_string(Find(my2dData[i][j]));
        }
        else{
            tmpRow += "," + std::to_string(Find(my2dData[i][j]));
        }
    }
    tmpRow += "\r\n";
    indexMap += tmpRow;
}

// create the payload to transfer
for(int v : uniqueValues){
    if(fileCompressed.empty()){
        fileCompressed = "i: " + std::to_string(v);
    }
    else{
        fileCompressed += "," + std::to_string(v);
    }
}
fileCompressed += "\r\nd: " + indexMap;
Now on the receiving end you just do the opposite: if the line starts with "i" you read the value table, and if it starts with "d" you read the index data.
Upvotes: 2
Reputation: 112502
You should look at the PNG filter algorithms for preprocessing image data before compressing. They range from simple to more sophisticated methods for predicting values in a 2D array based on previous values. To the extent that the predictions are good, the filtering can make for dramatic improvements in the subsequent compression step.
You should simply try these filters on your data, and then feed it to lz4.
Upvotes: 4