Stealthguy

Reputation: 63

Extracting pixel data from given image data: need help understanding the code

I've been looking into building textures from image data. The code supplied by some tutorials shifts bits in order to get the image pixels, but I'm very new to bit shifting. I understand binary & and |, but I have no clue why this code is needed to get the pixel data.

Here is the texture data:

static char* g_pTextureData =
  "?VE`8U)K13Y:1C];2$%:5DQA>'&!@WB*:UQR9EET9%ES8UAP9UMZ>&%[A6Z$>6-Z"
  "S[6XR[*PIH2\"IXN0HY2>DWR#VL;+TL3,LYZ<QZVNUL\"_L)..NJJXN:>[V<#)R;\"Y"
  "W\\G,Y-/4P9^3SZ^KZ-K>RJN?S:VMY=/4U[NPW<+![-GCV<\"UX,C&Y-7BZ=K>V\\2^"
  "W\\F^XL[!R*.6UK:PYM7/RZB:U[R^V;^POY:)SZ^:Y]#/Y=')SZ^:Y]#/Y=')UKBH"
  "OY*%O9*%K()WPYF+Q)Z+JGUVU[RUO9.\"HW=MN9)\\R)N)MXQVM9!WRIN+QIJ+L8-W"
  "GG9FJGYNIGMMJ'YMK()PHWIKN9J+HX)GGGI?L8YQL8US@653F7EBJGYJJGUIEVU@"
  "GGY:F7E9E'18G'I9G'M:E'1;CW->C')6AVE3E'==F7ED<UQ5@6M4EWE5F7Y6DG92"
  "C'!6AVI5@613AVQ5B6U5@6=6?FA8?F96>V)5@6E6AVY<:UE6:UM1>V90@6E0@6A0"
  "<U]7=E]7<UE6<%M7=F!7<%M5:5=4:5M7;EI6:UM2;EY29%138E53:UA1:5A0<%]2"
  "9%E6<%U:;EE;;EA::UE79E548E%39%15:5957U)29E=58E-69EA6<%Y6<%Y6:5E5"
  "6U)3:UY<=F5A<U]?=F1>:UI:9EA7:5E78E158E-6;EQ89E=69%54:5M6<V189%E6"
  "4$I/7556;F->;F!?;EY>:UI=;EY=<&%=9%588E-59%55<&!;:5A96TY25DY/5D]0"
  "/SQ*03],1D-/2$523$)44$-65DI76TY86TQ674Y674Y49EA::5E;64Y2/SQ(/CQ("
  ",#%'+C!',#))-#-,-S--.3-./#9//#9./#1-0SM02D%013U,/SA*-S-(+BY%,3!&"
  "(R-%&Q]$&A]%'B)'(R1()\"5((B-(*2=(*\"9)+\"E)+2M))21%(\"%#'!Y#(\"%$(B)%"
  "\"A)\"\"A)\"\"1-\"\"A-\"\"A1\"\"A-\"\"Q-\"#!5\"#A5\"#A5\"#Q9\"$AA"
  "\"&QU$%QM#\"Q-\"\"A-\""
  "";

Supplied macro for getting bit data representation of color:

#define HEADER_PIXEL(data,pixel) { \
  pixel[0] = (((data[0] - 33) << 2) | ((data[1] - 33) >> 4)); \
  pixel[1] = ((((data[1] - 33) & 0xF) << 4) | ((data[2] - 33) >> 2)); \
  pixel[2] = ((((data[2] - 33) & 0x3) << 6) | ((data[3] - 33))); \
  data += 4; \
}

My understanding is that '?' has a decimal value of 63. So, following the macro, 63 - 33 = 30, which is then shifted left by 2 bits:

(00000000 00000000 00000000 00011110) << 2
(00000000 00000000 00000000 01111000) = 120

Next is 'V', with decimal value 86. With the macro, 86 - 33 = 53, which is then shifted right by 4 bits:

(00000000 00000000 00000000 00110101) >> 4
(00000000 00000000 00000000 00000011) = 3

Then we do a bitwise OR operation:

01111000
00000011
========
01111011 = 123
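
As a sanity check on this arithmetic, here is the first line of the macro written as a tiny C function. The name first_channel is made up for illustration. Note that 53 >> 4 keeps only the top two of the six bits (binary 11, i.e. 3), so for '?' followed by 'V' the first byte works out to 120 | 3 = 123.

```c
/* Hypothetical helper: evaluates only the pixel[0] line of HEADER_PIXEL. */
unsigned char first_channel(const char *data)
{
    int high = (data[0] - 33) << 2;  /* six bits moved up to make room for two more */
    int low  = (data[1] - 33) >> 4;  /* top two bits of the second character */
    return (unsigned char)(high | low);
}
```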

I understand the math behind this. But my question is why is the math needed? Why 33 and bit shifting? Also, why do we need 0xF and 0x3?

Is it decompressing the image data? Or is it doing something else?

Is this anything that I would need to ever know? Or is this just a very specific instance in that this is how we compress/decompress images?

Update: thanks to @v154c1 for helping me get this in the bag.

For anyone else who comes across this, here is how I rationalized it using what @v154c1 had demonstrated:

00rrrrrr << 2 = rrrrrr00
00rrgggg >> 4 = 000000rr
rrrrrr00 | 000000rr = rrrrrrrr

00rrgggg & 00001111 = 0000gggg << 4 = gggg0000
00ggggbb >> 2 = 0000gggg
gggg0000 | 0000gggg = gggggggg

00ggggbb & 00000011 = 000000bb << 6 = bb000000
00bbbbbb
bb000000 | 00bbbbbb = bbbbbbbb
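
The three recombinations above can be put into one function. This is only a sketch: decode_pixel is a hypothetical name, and the body is the same arithmetic as HEADER_PIXEL without the data += 4 advance.

```c
#include <stdint.h>

/* Hypothetical helper: turns four encoded characters into one RGB pixel. */
void decode_pixel(const char *data, uint8_t pixel[3])
{
    /* Remove the +33 offset to recover the four 6-bit groups. */
    unsigned a = (unsigned)(data[0] - 33);  /* rrrrrr */
    unsigned b = (unsigned)(data[1] - 33);  /* rrgggg */
    unsigned c = (unsigned)(data[2] - 33);  /* ggggbb */
    unsigned d = (unsigned)(data[3] - 33);  /* bbbbbb */

    pixel[0] = (uint8_t)((a << 2) | (b >> 4));          /* rrrrrr00 | 000000rr */
    pixel[1] = (uint8_t)(((b & 0xF) << 4) | (c >> 2));  /* gggg0000 | 0000gggg */
    pixel[2] = (uint8_t)(((c & 0x3) << 6) | d);         /* bb000000 | 00bbbbbb */
}
```

For the first four characters of the texture data, ?VE`, this gives the pixel (123, 89, 63).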

Upvotes: 3

Views: 640

Answers (2)

You've already received an answer to your "main question", so I'll answer the others. :-)

Is it decompressing the image data? Or is it doing something else?

What you are looking at is an encoding. Compression is one kind of encoding, but this is not compression.

Suppose I told you that you needed to send me a number between 4 and 9, but that you could only do it with the symbol set A, B, C. If you were just to make something up off the top of your head, you might pick:

 4 => A
 5 => B
 6 => C
 7 => AA
 8 => BB
 9 => CC

There could be other encodings of course. Some might be simpler to read and understand, perhaps being "wasteful":

 4 => AAAA
 5 => AAAAA
 6 => AAAAAA
 7 => AAAAAAA
 8 => AAAAAAAA
 9 => AAAAAAAAA

Either way, there are still only 6 distinct values "in the abstract", whether you look at the left column or the right. The same amount of information is preserved. But you need only look at your screen to see that, measured as a count of characters instead of states, it has gotten longer.

So that is what is happening in this case with the image data. You are taking three bytes of pixel data and storing them in four byte-sized characters from a limited set. Because each color channel and each character occupies a byte, it could be thought of as "getting bigger": three bytes become four.

(But again, in a way, it actually isn't. :-P)

Is this anything that I would need to ever know? Or is this just a very specific instance in that this is how we compress/decompress images?

I'd say what you ever "need" to know depends entirely on what you ever "want" to accomplish. :-)

But if you want to be a programmer, understanding the abstract point about encodings is important. People can spend entire careers in software without touching C (and some might say you could very well be better off for it). Yet the concepts about encoding I mention in this answer will arise no matter what you program in.

Upvotes: 0

v154c1

Reputation: 1698

The answer linked by Roger Rowland (Explanation of Header Pixel in GIMP created C Header File of an XPM image) actually explains it pretty nicely. They store RGB values (24 bits) in 4 printable characters.

The magic value 33 is the first printable character they use (! in ASCII).

So the process done by GIMP is:

At first, you have 1 pixel with three 8-bit values for R, G and B. You can imagine it like this:

rrrrrrrr gggggggg bbbbbbbb

But you can't simply dump this into a header file. So you split it into groups of 6 bits (values 0 - 63):

rrrrrr rrgggg ggggbb bbbbbb

Then you add 33 to every group (so the values are 33 - 96) and store them in the header file as 4 characters.
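
To make the encoding direction concrete, here is a minimal sketch of that split-and-offset step in C. It is not GIMP's actual source, just the inverse of the decoding macro, and encode_pixel is a made-up name. A real exporter would also have to escape " and \ when writing the characters into a C string literal, which is why sequences like \" show up in the texture data.

```c
#include <stdint.h>

/* Hypothetical helper: packs one RGB pixel into four printable characters. */
void encode_pixel(uint8_t r, uint8_t g, uint8_t b, char out[4])
{
    out[0] = (char)(33 + (r >> 2));                       /* rrrrrr */
    out[1] = (char)(33 + (((r & 0x3) << 4) | (g >> 4)));  /* rrgggg */
    out[2] = (char)(33 + (((g & 0xF) << 2) | (b >> 6)));  /* ggggbb */
    out[3] = (char)(33 + (b & 0x3F));                     /* bbbbbb */
}
```

For example, the pixel (123, 89, 63) encodes to the four characters ?VE`, which is how the texture data in the question begins.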

In order to decode it back to pixel data, you simply subtract 33 to get the original 6-bit values and then combine the bits into three 8-bit values again.

The shifts and masks (&) are simply there to combine the bits together.

For example, take the first one:

pixel[0] = (((data[0] - 33) << 2) | ((data[1] - 33) >> 4));

data[0] and data[1] are the first and second characters (with that 33 added). So you subtract it again (data[0] - 33) and get:

data[0] - 33 = rrrrrr
data[1] - 33 = rrgggg

then the shifts push the values into the right places:

rrrrrr << 2  = rrrrrr00
rrgggg >> 4  =       rr

When you OR them together, you have the original value rrrrrrrr.

The values 33 to 96 map to these characters:

!, ", #, $, %, &, ', (, ), *, +, ,, -, ., /, 0, 1, 2, 3, 4, 5, 6, 7, 8,
9, :, ;, <, =, >, ?, @, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P,
Q, R, S, T, U, V, W, X, Y, Z, [, \, ], ^, _, `
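
If you want to check this list yourself, a small sketch like the following generates the same 64 characters. The name build_alphabet is hypothetical, and the code assumes an ASCII-compatible character set.

```c
/* Hypothetical helper: fills out with the 64 characters for the values
   33..96 and NUL-terminates it. out must have room for 65 bytes. */
void build_alphabet(char out[65])
{
    for (int i = 0; i < 64; i++) {
        out[i] = (char)(33 + i);  /* 6-bit value i is stored as character i + 33 */
    }
    out[64] = '\0';
}
```

The first entry is ! (value 0) and the last is ` (value 63).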

Upvotes: 6
