Reputation: 300
I need to save data to local storage or send it over the network. The data is stored as a plain object (key-value pairs). To save bandwidth and storage space, I'm planning to convert the long, verbose key names to corresponding digits (using a map) and restore them when receiving the data. For example:
var statesMap = {
    SEARCH: 0,
    SORT: 1,
    FILTER: 2,
    DISPLAY_FAV: 3,
    PANEL_POS: 4,
    MENU_LIST_POS: 5,
    MAIN_LIST_POS: 6,
    INFO_LIST_POS: 7,
    CHANNEL: 8
};
var config = {
    APP: ['SEARCH', 'SORT', 'DISPLAY_FAV', 'PANEL_POS', 'MENU_LIST_POS', 'MAIN_LIST_POS', 'CHANNEL']
};
app.state = {};

// restore state from {"0":"ui","1":"NAME","3":false,"4":0,"5":3,"6":0}
config[appName.toUpperCase()].forEach(function ( state ) {
    var value = storage[statesMap[state]]; // 'storage' stores my data
    // convert "compressed" properties to full equivalents
    app.state[state] = value != null ? value : '';
});
// result {"SEARCH":"ui","SORT":"NAME","DISPLAY_FAV":false,"PANEL_POS":0,"MENU_LIST_POS":3,"MAIN_LIST_POS":0,"CHANNEL":""}
// store data
var tmp = {};
// filter fields with empty values, don't store them
Object.keys(app.state).filter(function ( key ) {
    return !!String(app.state[key]);
}).forEach(function ( state ) {
    tmp[statesMap[state]] = app.state[state];
});
storage = tmp;
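For completeness, the restore step could also be driven by a reverse lookup built automatically from statesMap, so decoding doesn't depend on the config list. A minimal sketch of that idea (invertedMap is just an illustrative name):

// build the digit -> name lookup once from statesMap
var invertedMap = {};
Object.keys(statesMap).forEach(function ( name ) {
    invertedMap[statesMap[name]] = name;
});
// restore every stored key, whatever subset was saved
// note: keys that were filtered out when saving simply stay unset here
Object.keys(storage).forEach(function ( digit ) {
    app.state[invertedMap[digit]] = storage[digit];
});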
Are there sufficient benefits to this approach? Are there better optimizations? Does this optimization interfere with the gzip compression algorithm?
Thanks a lot.
Upvotes: 1
Views: 355
Reputation: 4241
Here is my take on your question.
If you do use your idea, don't use gzip, or only use it when the data is longer than x bytes. The best option, though, is not to use your method at all.
Here is why: your method does shave bytes off the raw output, but you have to weigh that against the flexibility it costs you.
One example: suppose that, in the future, you want to add debugging information or fields that you won't always return. Here's a very basic example:
{"error":false,[... data goes here ...]}
{"error":true,"type":"..."}
{"error":500,"desc":"Service unavaliable"}
With your code, the 2nd field can either be "desc", "type", or anything else.
You could say: "We can add the desc and type fields and always send them!". And now you are adding more data, defeating the reason to use your method.
Backing up my claims
What kind of answer would mine be without some data to back up bold statements like "impossible to compress"?
Let's consider your example data:
{"SEARCH":"ui","SORT":"NAME","DISPLAY_FAV":false,"PANEL_POS":0,"MENU_LIST_POS":3,"MAIN_LIST_POS":0,"CHANNEL":""}
Which you (non-optimally) reduced to this:
{"0":"ui","1":"NAME","3":false,"4":0,"5":3,"6":0}
All the compression figures below come from this basic PHP code:
$output = '<content>';
echo 'gzip:', strlen(gzencode($output)), ' original:', strlen($output);
This prints the gzipped size and the original size, side by side. I will keep the code as simple as possible, to avoid alienating non-PHP developers.
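If you prefer JavaScript, roughly the same measurement can be done in Node.js with the built-in zlib module. This is only a sketch; the byte counts can differ slightly from PHP's gzencode because of gzip header fields:

// rough Node.js equivalent of the PHP snippet above
var zlib = require('zlib');

var output = '<content>';
console.log('gzip:', zlib.gzipSync(output).length, 'original:', Buffer.byteLength(output));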
Here's the results I've gotten from different executions:
{"SEARCH":"ui","SORT":"NAME","DISPLAY_FAV":false,"PANEL_POS":0,"MENU_LIST_POS":3,"MAIN_LIST_POS":0,"CHANNEL":""}
gzip:112 original:112
(+0 bytes){"0":"ui","1":"NAME","3":false,"4":0,"5":3,"6":0}
gzip:65 original:49
(+16 bytes over the original)["ui","NAME",false,0,3,0]
(proper JSON output, based on your object)gzip:45 original:25
(+20 bytes over the original)You can try the code on http://sandbox.onlinephpfunctions.com/code/d8d0799147e3256ede2b730cb1ded7cf66c1eb67
These outputs were taken from the question and its comments, not made up by me, except the last one, which is derived from the 2nd output. The 2nd output shows an array being represented as an object (sequential numeric keys).
Okay, I said many words, showed some code and whatnot. My conclusion?
For your case, either use gzip or use your method, but not both. I would stick with plain and simple gzip and call it a day: you won't be needlessly increasing your output size. If you really want your method instead, disable gzip and save some CPU cycles on your server.
Upvotes: 0
Reputation: 64905
The optimization you are referring to might be called "token replacement" or something similar, and it is a reasonable approach to domain-specific compression.
This type of transformation doesn't prevent matching+entropy based algorithms like gzip from working, and so you aren't likely to get a larger final size after applying this transformation. That said, the replacement you are doing is exactly the type of thing that gzip is good at doing, so doing it yourself before invoking gzip may be a bit redundant.
To know for sure, you can simply test! What are the typical results of your token replacement + gzip, versus gzip alone?
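Even a throwaway harness answers that question. Here is a Node.js sketch; gzipSize is my helper, and encode() is a placeholder for whatever token-replacement step you use:

var zlib = require('zlib');

// size of a value after JSON encoding + gzip
function gzipSize( value ) {
    return zlib.gzipSync(JSON.stringify(value)).length;
}

// 'encode' stands in for your statesMap-based key replacement
console.log('gzip alone:         ', gzipSize(app.state));
console.log('replacement + gzip: ', gzipSize(encode(app.state)));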
Even without testing, the trade-off of token replacement before gzip is clear: the replacement removes exactly the kind of redundancy that gzip's matching stage would mostly remove anyway, and in exchange you have to maintain the token map on both the sending and receiving sides.
Basically, I would recommend against it, unless your testing shows that it provides a significant performance boost. Usually it won't, since gzip is already removing most of the redundancy, but it depends on the specifics of your situation.
Upvotes: 3