How to feed random numbers as indices to pandas data frame?

Question

I'm trying to draw the same random sample from two pandas DataFrames: if random rows 2, 5, 8 are selected from frame A, then the same rows 2, 5, 8 must be selected from frame B. I first generate a random sample of row numbers and now want to use it as row indices into each frame. How can I do that? The code should look something like this:

idx = list(random.sample(range(X_train.shape[0]),5000))

lgstc_reg[i].fit(X_train[idx,:], y_train[idx,:])

However, running the code gives an error.

Ian · Accepted Answer

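Use iloc, which selects rows by integer position. NumPy-style indexing such as X_train[idx,:] is not supported directly on a DataFrame, which is why your code raises an error: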

indexes = [2,5,8]  # in your case this is the randomly generated list
A.iloc[indexes]
B.iloc[indexes]
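Applied to the code in the question, this could look like the sketch below. The small X_train and y_train frames are made-up stand-ins for the asker's data, and the sample size is reduced to 5 so the example stays readable; only the iloc calls are the point.

```python
import random

import numpy as np
import pandas as pd

# Hypothetical stand-ins for the asker's X_train / y_train (20 rows each).
X_train = pd.DataFrame(np.arange(40).reshape(20, 2), columns=["f1", "f2"])
y_train = pd.Series(np.arange(20) % 2, name="label")

random.seed(0)  # fixed only so the example is reproducible

# random.sample already returns a list of distinct row positions.
idx = random.sample(range(X_train.shape[0]), 5)

# iloc selects by integer position, so both objects get the same rows.
X_sub = X_train.iloc[idx]
y_sub = y_train.iloc[idx]

print(X_sub.index.tolist() == y_sub.index.tolist())
```

The subsets can then be passed to the model, for example lgstc_reg[i].fit(X_sub, y_sub).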

Alternatively, you can get consistent samples by passing the same random seed to sample on both frames:

random_seed = 42
A.sample(3, random_state=random_seed)
B.sample(3, random_state=random_seed)

The sampled DataFrames will have the same index, provided A and B have the same number of rows and share the same index to begin with.
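A quick way to check the seeded approach is to compare the two sampled indexes. The sketch below uses two made-up frames of equal length; it also shows a variant that does not depend on the seed at all, which is to sample A once and reuse its index labels on B with loc.

```python
import numpy as np
import pandas as pd

# Hypothetical frames with the same length and the same default index.
A = pd.DataFrame({"a": np.arange(10)})
B = pd.DataFrame({"b": np.arange(10) * 10})

random_seed = 42
A_s = A.sample(3, random_state=random_seed)
B_s = B.sample(3, random_state=random_seed)

# Same length + same seed means the same row positions are drawn.
same = A_s.index.equals(B_s.index)
print(same)

# Seed-free variant: sample A once, then pick the matching labels from B.
B_from_labels = B.loc[A_s.index]
```

The loc variant assumes every label in A's sample also exists in B's index; if it does not, pandas raises a KeyError.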
