Antoine

Reputation: 627

numpy.unique changes NULL character to empty character

EDIT: I rephrased the question in terms of bytes rather than characters. I am doing a frequency analysis on ciphertext.


When I use numpy.unique() on a list of bytes, the NULL byte b'\x00' comes back as the empty bytes object b''.

The following minimal example

import numpy as np

byte_list = [b'\x00', b'1']
freq = {byte: count for (byte, count) in zip(*np.unique(byte_list, return_counts=True))}
freq

returns

{b'': 1, b'1': 1}

while I expect

{b'\x00': 1, b'1': 1}

Why is that?

Python version 3.7.4.
Numpy version 1.17.2.

Upvotes: 0

Views: 134

Answers (1)

CodeRed

Reputation: 903

If this is your expected output:

{b'0': 2, b'1': 1, b'\\': 1, b'x': 1}

then you forgot that the backslash (\) escapes the next character in your bs variable. To get that output, bs needs to be:

bs = b'\\x001'
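
For the bytes version of the question, the likely cause is that NumPy converts the list into a fixed-width bytes array (dtype S1), and that dtype treats trailing NUL bytes as padding and strips them when an element is read back. The sketch below shows the effect and two possible workarounds. The names arr, data, freq and freq_np are only illustrative, and the uint8 variant assumes every entry in byte_list is exactly one byte long.

```python
from collections import Counter

import numpy as np

byte_list = [b'\x00', b'1']

# NumPy stores the list as fixed-width bytes (dtype 'S1'); trailing
# NUL bytes count as padding and are dropped when an item is read.
arr = np.array(byte_list)
print(arr.dtype)
print(arr[0])

# Workaround 1: count in pure Python, which never touches the 'S' dtype.
freq = Counter(byte_list)
print(dict(freq))

# Workaround 2 (assumes single-byte entries): count integer byte values,
# then turn each value back into a one-byte bytes object.
data = np.frombuffer(b''.join(byte_list), dtype=np.uint8)
values, counts = np.unique(data, return_counts=True)
freq_np = {bytes([int(v)]): int(c) for v, c in zip(values, counts)}
print(freq_np)
```

Counter is probably the simplest choice for a short ciphertext. The uint8 route keeps everything in NumPy and should scale better to long inputs.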

Upvotes: 1
