Reputation: 163
I'm writing code to encode 10 MB of binary data with a Reed-Solomon code.
However, the module raises the following error about the message length:
ValueError: Message is too long (10032003 when max is 255)
I tried reading the library's source, but I couldn't work out what the code is doing.
Can you help me solve this problem?
This is the Reed-Solomon module I am using.
The following is part of the code I've written.
import time
import reedsolo as rs

def encoding(per, msg, n, nsym, gen):
    elapsed = 0  # accumulated encoding time in seconds
    count = 0    # number of completed encode calls
    rs.init_tables(0x11d)
    # Encode repeatedly until `per` seconds of encoding time have accumulated
    while elapsed < per:
        temp = time.time()
        rs.rs_encode_msg(msg, nsym, gen=gen[nsym])
        elapsed += time.time() - temp
        count += 1

def main():
    data = b"<SOME TEXT>"*5500 #This data size is 10MB
    n = 8
    nsym = 3 # I wanted RS(8,3)
    period = 10
    gen = rs._rs_generator_poly_all(n)
    encoding(period, data, 8, 3, gen)
Upvotes: 0
Views: 2028
Reputation: 28826
I would suggest a method similar to jerasure, as used for cloud storage. Treat the data as a matrix with 5 rows of data and 3 rows of ECC (RS(8,5): 8 total, 5 data, 3 parity), where each row has ncol = 10MB/5 bytes (ncol is the number of columns). Apply the RS code to each column of data independently. You might want to consider more rows, such as 16 rows of data and 4 rows of ECC (RS(20,16)).
The error detection and correction process will need to copy each column into an array, call the RS library to encode or decode it, and write the results back to the column.
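As a rough illustration of the layout described above, here is a minimal pure-Python sketch. It is only meant to show the row/column bookkeeping, not to be fast enough for 10 MB. The helper names (gf_mul, generator_poly, encode_column, encode_matrix) are hypothetical and not part of any library. It assumes GF(2^8) with the 0x11d polynomial from the question, generator roots starting at a^0, and zero padding of the last row.

```python
# Illustrative sketch only: pure Python, far too slow for 10 MB in practice.
PRIM = 0x11d  # primitive polynomial, same value passed to init_tables in the question

# Build exp/log tables for GF(2^8).
EXP = [0] * 512
LOG = [0] * 256
_x = 1
for _i in range(255):
    EXP[_i] = _x
    LOG[_x] = _i
    _x <<= 1
    if _x & 0x100:
        _x ^= PRIM
for _i in range(255, 512):
    EXP[_i] = EXP[_i - 255]  # duplicate so gf_mul needs no modulo


def gf_mul(a, b):
    """Multiply two field elements using the log/exp tables."""
    if a == 0 or b == 0:
        return 0
    return EXP[LOG[a] + LOG[b]]


def generator_poly(nsym):
    """Return g(x) = (x - a^0)...(x - a^(nsym-1)), highest degree first."""
    g = [1]
    for i in range(nsym):
        nxt = [0] * (len(g) + 1)
        for j, c in enumerate(g):
            nxt[j] ^= c                      # c * x
            nxt[j + 1] ^= gf_mul(c, EXP[i])  # c * a^i
        g = nxt
    return g


def encode_column(column, gen):
    """Return the parity symbols for one short column (systematic encoding)."""
    nsym = len(gen) - 1
    buf = list(column) + [0] * nsym
    # Polynomial long division by the monic generator; the tail is the remainder.
    for i in range(len(column)):
        coef = buf[i]
        if coef:
            for j in range(1, len(gen)):
                buf[i + j] ^= gf_mul(gen[j], coef)
    return buf[len(column):]


def encode_matrix(data, k=5, nsym=3):
    """Split data into k rows and return the k data rows plus nsym parity rows."""
    ncol = -(-len(data) // k)             # ceil(len(data) / k)
    padded = data.ljust(k * ncol, b"\0")  # assumption: zero-pad the last row
    rows = [bytearray(padded[r * ncol:(r + 1) * ncol]) for r in range(k)]
    parity = [bytearray(ncol) for _ in range(nsym)]
    gen = generator_poly(nsym)
    for c in range(ncol):
        column = [rows[r][c] for r in range(k)]
        for p, sym in enumerate(encode_column(column, gen)):
            parity[p][c] = sym
    return rows + parity
```

Calling encode_matrix(data, k=5, nsym=3) returns 8 rows of equal length. Each column is an independent 8-symbol codeword, so no message ever approaches the 255-symbol limit from the error in the question.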
You'll need to use a compiled library rather than a Python-based one for the code to run fast enough. A compiled library for x86 can be sped up with an assembly module that uses the PSHUFB instruction (on xmm registers) to process 16 bytes at a time in parallel.
The GitHub library linked in the question includes a C source file that is meant to be compiled, but I don't know what tools are required to build a compiled C library for Python.
Upvotes: 2