Reputation: 647
return
, stops before error:from enum import Enum
class Codes(Enum):
CLEAR = 0
EOI = 1
def decode(code_bytes: bytes):
"""Return decoded gif from incoming bytes, currently only works with entire bytes object"""
lzw_min, leng, *_ = code_bytes
total = 2 + leng
# Skip lzw_min and sub-block size indicator bytes
skips = {0, 1, total}
while leng != 1:
leng = code_bytes[total] + 1
total += leng
skips |= {total}
def _stream(skips):
"""Return least significant bit still available from current byte"""
for i, byte in enumerate(code_bytes):
if i in skips:
continue
for _ in range(8):
yield byte & 1 and 1
byte >>= 1
code_stream = _stream(skips)
def get_code(bits, x=0, s=0):
"""Retrieve bits and assemble in proper order"""
for n in range(s, bits):
x |= next(code_stream) << n
return x
code_table = {
**{i: [i] for i in range(2 ** lzw_min)},
2 ** lzw_min: [Codes.CLEAR],
2 ** lzw_min + 1: [Codes.EOI]
}
bits = lzw_min + 1
assert Codes.CLEAR in code_table[get_code(bits)] # First code is always CLEAR
table_index = init_table = len(code_table) - 1
last = get_code(bits)
code = get_code(bits)
yield last
while Codes.EOI not in code_table.get(code, []):
if code <= table_index:
for i in code_table[code]:
yield i
table_index += 1
code_table[table_index] = code_table[last] + [code_table[code][0]]
else:
k = code_table[last][0]
for i in code_table[last] + [k]:
yield i
table_index += 1
code_table[table_index] = code_table[last] + [k]
if table_index == 2**bits-1:
bits += 1
# Problem replaced here
if bits == 13:
return
last, code = code, get_code(bits)
if bits == 13:
last, code = code, get_code(bits)
bits = lzw_min + 1
table_index = init_table
assert Codes.CLEAR in code_table[code]
While reading the image, when table_index
reaches one under the max bitlength and increments bits past the max bitlength, this code executes. It apparently isn't functioning properly.
After reaching max bit size, a CLEAR
code is read next, as expected, no AssertionError is raised.
From what I've read, a CLEAR
code means reset bitlength to lzw_min + 1
and re-initialize the code table. I believe that's what my code does.
Originally, code_table was a list and was reset after the CLEAR
and bitlength reset, but codes much larger than the current table index kept being output from the stream. making code table a dict allows currently un-overwritten codes to be accessed after re-initializing, but this just produces noise. Apparently, some of the noise eventually registers as EOI
and parsing ends.
CLEAR
?Here are the second frame of the test gif as a bytes literal you may use for function input and the frame's palette as a list literal for building the img:
bytes: (removed from pastebin)
palette:
[(230, 226, 225), (54, 28, 25), (99, 25, 28), (117, 22, 28), (65, 39, 33), (79, 45, 38), (92, 39, 36), (81, 45, 39), (79, 48, 40), (88, 50, 43), (100, 42, 38), (104, 60, 50), (111, 66, 55), (119, 68, 57), (138, 11, 23), (134, 14, 24), (139, 14, 25), (151, 9, 23), (148, 13, 26), (156, 12, 26), (132, 18, 27), (141, 18, 29), (149, 19, 30), (166, 7, 23), (173, 7, 24), (164, 11, 26), (172, 11, 27), (184, 4, 23), (181, 7, 25), (190, 6, 25), (180, 10, 27), (191, 9, 28), (166, 16, 30), (142, 22, 33), (142, 25, 35), (151, 23, 35), (146, 27, 37), (147, 30, 41), (189, 14, 33), (169, 23, 37), (180, 20, 37), (180, 23, 39), (188, 18, 36), (188, 26, 43), (149, 35, 45), (155, 36, 42), (151, 41, 51), (153, 43, 53), (134, 63, 56), (151, 48, 57), (156, 51, 60), (166, 40, 52), (173, 42, 51), (184, 39, 54), (167, 53, 55), (192, 7, 26), (192, 10, 29), (193, 14, 33), (194, 19, 37), (196, 26, 39), (195, 22, 40), (195, 27, 44), (196, 31, 48), (199, 34, 45), (197, 36, 52), (201, 42, 53), (198, 42, 58), (200, 45, 60), (202, 51, 60), (131, 75, 63), (175, 67, 61), (159, 55, 64), (157, 59, 67), (161, 59, 68), (170, 58, 69), (185, 55, 68), (202, 52, 67), (199, 56, 70), (204, 60, 75), (208, 62, 64), (148, 81, 71), (164, 68, 76), (171, 69, 76), (183, 70, 73), (189, 86, 75), (165, 75, 83), (169, 76, 84), (184, 75, 85), (170, 90, 83), (168, 84, 91), (172, 84, 91), (169, 99, 82), (177, 104, 86), (182, 106, 88), (172, 90, 97), (181, 92, 99), (173, 102, 108), (181, 100, 107), (183, 109, 114), (186, 118, 123), (198, 67, 72), (205, 65, 78), (206, 77, 73), (206, 69, 82), (202, 74, 87), (207, 73, 86), (208, 74, 87), (208, 77, 90), (199, 85, 90), (215, 91, 82), (209, 81, 94), (199, 104, 87), (218, 106, 92), (195, 114, 93), (203, 114, 92), (212, 112, 95), (201, 90, 100), (210, 85, 97), (211, 90, 101), (212, 94, 105), (199, 99, 108), (213, 98, 109), (202, 119, 99), (212, 122, 99), (215, 121, 100), (217, 125, 102), (201, 105, 114), (214, 102, 113), (215, 106, 117), (215, 109, 119), (216, 111, 120), (199, 116, 123), (212, 115, 124), (217, 115, 124), (225, 121, 103), (221, 130, 107), (226, 133, 110), (228, 135, 112), (230, 136, 113), (232, 138, 114), (224, 140, 121), (184, 125, 129), (198, 124, 130), (215, 122, 130), (219, 123, 132), (185, 133, 137), (189, 143, 146), (190, 149, 152), (191, 160, 162), (198, 132, 137), (215, 135, 141), (221, 131, 139), (201, 141, 145), (212, 141, 146), (222, 139, 146), (203, 150, 154), (214, 150, 154), (226, 153, 135), (224, 142, 149), (224, 144, 151), (225, 148, 155), (226, 153, 159), (224, 169, 156), (200, 158, 161), (214, 158, 162), (227, 156, 162), (200, 167, 169), (210, 162, 165), (216, 166, 169), (213, 170, 173), (201, 175, 177), (216, 174, 176), (205, 184, 185), (218, 179, 180), (215, 182, 184), (221, 187, 188), (228, 161, 166), (229, 165, 170), (230, 170, 174), (231, 173, 177), (232, 174, 178), (232, 178, 181), (227, 183, 185), (233, 182, 185), (234, 186, 189), (212, 191, 192), (228, 191, 193), (235, 190, 193), (207, 203, 202), (211, 195, 196), (212, 205, 204), (218, 201, 201), (212, 208, 207), (214, 210, 210), (216, 212, 212), (219, 216, 215), (222, 218, 217), (226, 195, 196), (236, 195, 197), (228, 199, 200), (237, 199, 200), (229, 203, 203), (238, 203, 204), (238, 207, 208), (240, 207, 208), (230, 213, 213), (235, 212, 212), (226, 221, 220), (237, 220, 219), (241, 211, 212), (241, 215, 216), (242, 219, 219), (228, 224, 223), (239, 224, 223), (242, 224, 223), (230, 226, 224), (234, 229, 228), (237, 232, 231), (239, 234, 232), (244, 228, 227), (245, 232, 231), (244, 237, 235), (248, 239, 237), (246, 241, 239), (248, 241, 239), (247, 242, 240), (248, 242, 240), (250, 248, 246), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0), (0, 0, 0)]
Here is the code needed to produce the same outputs as me with these literals:
from PIL import Image
w, h = 380, 407
gifo = Image.new('RGB', (w, h))
gif = gifo.load()
pix = decode(FRAME_2) # bytes literal as input
palette = PALETTE # palette literal as input
try:
for y in range(h):
for x in range(w):
t = next(pix)
while t in {Codes.EOI, Codes.CLEAR}: # Only needed to handle buggy noise with codes mixed into it
t = next(pix)
gif[x, y] = palette[t]
except StopIteration:
...
gifo.show()
http://giflib.sourceforge.net/whatsinagif/lzw_image_data.html
https://www.daubnet.com/en/file-format-gif
https://www.w3.org/Graphics/GIF/spec-gif89a.txt
https://www.eecis.udel.edu/~amer/CISC651/lzw.and.gif.explained.html
Regarding deferred CLEAR
codes, though this gif does not contain these and I did not include logic for them in this question:
https://halicery.com/Image%20Decoders/GIF/GIF-LZW%20Deferred%20Clear%20Code.html
Upvotes: 1
Views: 420
Reputation: 647
I figured it out. Among other things, I was pulling a 13 bit CLEAR
code instead of 12 bit and I didn't cycle to the next data when initializing a table properly. This version works great.
BYTE = 8
MAX_BITLEN = 12
def decode(code_bytes):
"""Yield decoded gif-style LZW from incoming bytes."""
bytes_iter = iter(code_bytes)
lzw_min = next(bytes_iter)
get_code = _get_code_getter(bytes_iter)
code = clear = get_code(lzw_min + 1)
eoi = clear + 1
while code != eoi:
if code == clear:
bitlength = lzw_min + 1
code_table = [[i] for i in range(eoi + 1)]
last, code = get_code(bitlength), get_code(bitlength)
yield last
if code < len(code_table):
output = code_table[code]
to_add = code_table[last] + [code_table[code][0]]
else:
to_add = output = code_table[last] + [code_table[last][0]]
for i in output:
yield i
code_table += [to_add]
bitlength += len(code_table) == (bitlength < MAX_BITLEN) << bitlength
last, code = code, get_code(bitlength)
def _get_code_getter(code_iter):
def bit_stream():
"""Yield least significant bit remaining in current byte."""
length = next(code_iter)
for read, byte in enumerate(code_iter):
if read == length:
length += byte + 1
else:
for bit in (1 << b & byte and 1 for b in range(BYTE)):
yield bit
def code_getter(bitlength):
"""Retrieve/structure bits, least significant first."""
return sum(next(code_stream) << z for z in range(bitlength))
code_stream = bit_stream()
return code_getter
Upvotes: 2