clh2007

Reputation: 107

Generate random strings in pandas

I would like to create a list of one million keys drawn from 200 unique values:

import random
import pandas as pd

N = 1000000
uniques_keys = [pd.core.common.rands(3) for i in range(200)]
keys = [random.choice(uniques_keys) for i in range(N)]

However, I get the following error:

In [250]: import pandas as pd

In [251]: pd.core.common.rands(3)
Traceback (most recent call last):

  File "<ipython-input-251-31d12e0a07e7>", line 1, in <module>
    pd.core.common.rands(3)

AttributeError: module 'pandas.core.common' has no attribute 'rands'

I use pandas version 0.18.0.

Upvotes: 9

Views: 9801

Answers (2)

MaxU - stand with Ukraine

Reputation: 210812

You can use:

In [14]: pd.util.testing.rands_array?
Signature: pd.util.testing.rands_array(nchars, size, dtype='O')
Docstring: Generate an array of byte strings.

Demo:

In [15]: N = 1000000

In [16]: s_arr = pd.util.testing.rands_array(10, N)

In [17]: s_arr
Out[17]: array(['L6d2GwhHdT', '5oki5T8VYm', 'XKUblAUFyL', ..., 'BE5AdCa62a', 'X3zDFKj6iy', 'iwASB9xZV3'], dtype=object)

In [18]: len(s_arr)
Out[18]: 1000000

UPDATE (2020-04-21):

In newer pandas versions you might see the following deprecation warning:

FutureWarning: pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.

In this case, import the function as follows (note that pandas._testing is a private module, so this import may stop working in later releases):

from pandas._testing import rands_array

Upvotes: 15

IanS

Reputation: 16241

There are several solutions:

First solution:

The function rands appears to be in pandas.util.testing now:

pd.util.testing.rands(3)

Second solution:

Go straight for the underlying numpy implementation (as found in the pandas source code):

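import string
import numpy as np

# Alphabet of ASCII letters and digits, one character per array element
RANDS_CHARS = np.array(list(string.ascii_letters + string.digits),
                       dtype=(np.str_, 1))

nchars = 3
# Draw nchars characters (with replacement) and join them into one string
''.join(np.random.choice(RANDS_CHARS, nchars))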
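Applied to the original question (one million keys drawn from 200 unique values), the same idea could look roughly like the sketch below. It does not use pandas at all. The names N_UNIQUE, NCHARS, alphabet and unique_keys, and the seed, are illustrative choices rather than anything taken from pandas.

```python
import string

import numpy as np

N = 1_000_000     # total number of keys, as in the question
N_UNIQUE = 200    # number of distinct key values, as in the question
NCHARS = 3        # key length, as in the question

# Alphabet of ASCII letters and digits (same characters as RANDS_CHARS above).
alphabet = np.array(list(string.ascii_letters + string.digits))

rng = np.random.default_rng(0)  # the seed is arbitrary; drop it for fresh keys

# Keep drawing random keys until N_UNIQUE distinct ones exist.
# A set is used because two random 3-character draws can collide.
unique_keys = set()
while len(unique_keys) < N_UNIQUE:
    unique_keys.add("".join(rng.choice(alphabet, NCHARS)))
unique_keys = sorted(unique_keys)

# Sample N keys (with replacement) from the distinct values in one call.
keys = rng.choice(unique_keys, size=N)
```

Here keys is a NumPy array of strings. Sampling with a single vectorized rng.choice call avoids the Python-level loop over random.choice in the question.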

Third solution:

Call numpy.random.bytes (check that it fulfils your requirements).
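As a caveat on that requirement check: numpy.random.bytes returns raw bytes, which need not be printable or alphanumeric. One possible way to turn them into a text key is to hex-encode them, as in this minimal sketch (the variable names are illustrative):

```python
import numpy as np

raw = np.random.bytes(3)  # a bytes object holding 3 random bytes
key = raw.hex()           # hex-encode: two hex characters per byte
```

The resulting key uses only the characters 0-9 and a-f, so it draws from a smaller alphabet than the letters-and-digits approach above.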

Fourth solution:

See this question for other suggestions.

Upvotes: 4
