user6429297

Comparing different compression methods for JSON data in Python 3

I want to compress my JSON data using different compressors. I used this to compress the JSON:

import gzip
import json

with gzip.GzipFile('2.json', 'r') as isfile:
    for line in isfile:
        obj = json.loads(line)

This raises an error:

raise OSError('Not a gzipped file (%r)' % magic)

OSError: Not a gzipped file (b'[\n')

I also tried compressing the data directly:

zlib_data = zlib.compress(data)

which also raises an error:

return lz4.block.compress(*args, **kwargs)

TypeError: a bytes-like object is required, not 'list'

Basically, I want to compress a JSON file with each of these methods and measure the time each method takes to compress it.

Upvotes: 3

Views: 5128

Answers (1)

JessieB

Reputation: 177

On Python 2.7

The problem appears to be the type of your data.

The data to compress should be of type 'str'.

import gzip
import json
import lz4.block
import time

with gzip.GzipFile('data.gz', 'w') as fid_gz:
    with open('data.json', 'r') as fid_json:
        # parse the JSON file into a Python object (dict or list)
        json_dict = json.load(fid_json)
        # serialize the object back to a JSON string
        # (str() would give a Python repr, which is not valid JSON)
        json_str = json.dumps(json_dict)
    # write the string to the gzip file
    fid_gz.write(json_str)

# check that the file was written correctly
with gzip.GzipFile('data.gz', 'r') as fid_gz:
    print(fid_gz.read())

The same applies when compressing directly with zlib (reached here as gzip.zlib, since the gzip module imports zlib):

gzip.zlib.compress(json_str, 9)

and with lz4:

lz4.block.compress(json_str)

Timing a call would look like this:

# record the start time
st = time.time()
# ... run the compression call here ...
# print the elapsed time in seconds
print(time.time() - st)

On Python 3.5

The difference between Python 2.7 and Python 3 is the type of the data to compress.

The data to compress should be of type 'bytes', created with bytes().
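To make the type requirement concrete, here is a minimal sketch using only the standard library. The `records` list and the `to_json_bytes` helper are made up for illustration; the list stands in for whatever json.load returns from your file. Passing the parsed list straight to zlib.compress fails with a TypeError like the one in the question, while the JSON-encoded bytes compress fine.

```python
import json
import zlib

# Hypothetical sample standing in for the parsed contents of a JSON file.
records = [{"id": 1, "name": "a"}, {"id": 2, "name": "b"}]


def to_json_bytes(obj):
    """Serialize a Python object to UTF-8 encoded JSON bytes (illustrative helper)."""
    return json.dumps(obj).encode("utf-8")


# A list is not a bytes-like object, so this call is rejected.
try:
    zlib.compress(records)
    raised = False
except TypeError as exc:
    raised = True
    print("TypeError:", exc)

# Encoding to bytes first makes the same call succeed.
payload = to_json_bytes(records)
compressed = zlib.compress(payload)
print(len(payload), "bytes before,", len(compressed), "bytes after")
```

Calling `.encode("utf-8")` on the string is equivalent to `bytes(json_str, 'utf8')`.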

When making a .gz file:

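with gzip.GzipFile('data.gz', 'w') as fid_gz:
    with open('data.json', 'r') as fid_json:
        # parse the JSON file into a Python object (dict or list)
        json_dict = json.load(fid_json)
        # serialize to a JSON string (str() would give a Python repr, not JSON)
        json_str = json.dumps(json_dict)
        # bytes(string, encoding)
        json_bytes = bytes(json_str, 'utf8')
    fid_gz.write(json_bytes)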

Or compress in memory with gzip.compress(data, compresslevel=9):

# 'data' takes bytes
gzip.compress(json_bytes)

Or compress with zlib.compress(bytes, level=-1, /):

gzip.zlib.compress(json_bytes,9)
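As a quick check that the compressed data can be read back, here is a minimal in-memory round trip. The sample dict is a made-up stand-in for the contents of data.json. The sketch also illustrates why the text handed to the compressor should come from json.dumps: the str() of a dict is a Python repr with single quotes, which json.loads rejects.

```python
import gzip
import json
import zlib

# Hypothetical sample standing in for the parsed contents of data.json.
data = {"items": [1, 2, 3], "label": "example"}
json_bytes = json.dumps(data).encode("utf-8")

# Compress in memory with both modules at the highest compression level.
gz_blob = gzip.compress(json_bytes, compresslevel=9)
zl_blob = zlib.compress(json_bytes, 9)

# Decompress and parse again; both should give back the original object.
restored_gz = json.loads(gzip.decompress(gz_blob).decode("utf-8"))
restored_zl = json.loads(zlib.decompress(zl_blob).decode("utf-8"))

# str() of a dict is a Python repr, not JSON, so it cannot be parsed back.
try:
    json.loads(str(data))
    repr_is_json = True
except ValueError:  # json.JSONDecodeError is a subclass of ValueError
    repr_is_json = False

print(restored_gz == data, restored_zl == data, repr_is_json)
```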

Or compress with lz4.block.compress(source, compression=0):

# 'source' takes both 'str' and 'bytes'
lz4.block.compress(json_str)
lz4.block.compress(json_bytes)

How you measure the time is up to you.
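For example, one possible way to compare compressors is the sketch below. It is only an illustration under a few assumptions: the `records` payload is generated sample data rather than your file, the `time_compressor` helper is made up for this example, and it covers only the standard-library modules gzip, zlib, bz2 and lzma. If lz4 is installed, lz4.block.compress could be added to the dictionary in the same way.

```python
import bz2
import gzip
import json
import lzma
import time
import zlib

# Hypothetical sample payload; replace with the bytes of your own JSON file.
records = [{"id": i, "value": "item-%d" % i} for i in range(2000)]
json_bytes = json.dumps(records).encode("utf-8")

# Each entry maps a label to a function that takes bytes and returns bytes.
compressors = {
    "gzip": gzip.compress,
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def time_compressor(compress, payload, repeats=3):
    """Return (best elapsed seconds, compressed size) over several runs."""
    best = float("inf")
    blob = b""
    for _ in range(repeats):
        start = time.perf_counter()
        blob = compress(payload)
        best = min(best, time.perf_counter() - start)
    return best, len(blob)


results = {
    name: time_compressor(fn, json_bytes) for name, fn in compressors.items()
}

for name, (seconds, size) in results.items():
    print("%-5s %.4f s  %d -> %d bytes" % (name, seconds, len(json_bytes), size))
```

time.perf_counter is used here instead of time.time because it is meant for measuring short intervals. Taking the best of a few runs reduces noise from other processes.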

cheers

Upvotes: 1
