rk jp

Reputation: 19

Same value when random is used

I get the same rows more than once when I use this code: the row count drops after drop_duplicates(). What am I doing wrong with the random sampling?

data = data[data["VN"] >= 1000]
data_T1 = data[data["TARGET"] == 1]
data_T0 = data[data["TARGET"] == 0]
data_T0_random = data_T0.loc[np.random.choice(data_T0.index, 10000)]
data = data_T1.append(data_T0_random)
print('q:', len(data.index))
rr = data.drop_duplicates()
print('qq:', len(rr.index))

Upvotes: 0

Views: 53

Answers (2)

iRhonin

Reputation: 383

Change this line:

data_T0_random=data_T0.loc[np.random.choice(data_T0.index, 10000)]

to:

data_T0_random = data_T0.loc[random.sample(list(data_T0.index), 10000)]

More info: np.random.choice samples with replacement by default, like random.choices, so the same index label can be drawn more than once. random.sample draws without replacement.

random.choices(population, weights=None, *, cum_weights=None, k=1): Return a k sized list of elements chosen from the population with replacement. If the population is empty, raises IndexError.

random.sample(population, k): Return a k length list of unique elements chosen from the population sequence or set. Used for random sampling without replacement.
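To make the difference concrete, here is a minimal sketch using only the standard library. The list `population` is a hypothetical stand-in for `data_T0.index`, and the seed is there only so the run is repeatable.

```python
import random

random.seed(0)  # fixed seed only so the demo is repeatable

# Hypothetical stand-in for data_T0.index: 20 distinct labels.
population = list(range(20))

# With replacement: the same label may be drawn several times.
with_replacement = random.choices(population, k=15)

# Without replacement: every drawn label is distinct.
without_replacement = random.sample(population, 15)

print(len(set(with_replacement)), "unique of", len(with_replacement))
print(len(set(without_replacement)), "unique of", len(without_replacement))
```

The first draw will usually contain repeats, while the second never does. Note that random.sample raises ValueError if k is larger than the population.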

Upvotes: 0

Rakesh

Reputation: 82795

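Pass replace=False. By default, np.random.choice samples with replacement, so the same index label can be picked more than once, which is where the duplicate rows come from.

Example: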

data_T0_random=data_T0.loc[np.random.choice(data_T0.index, 10000, replace=False)]
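A small self-contained sketch of the same idea, assuming a toy frame in place of the asker's data (the 50-row `data_T0` below is hypothetical, and its index is assumed to be unique). It also shows pandas' `DataFrame.sample`, which is an alternative not mentioned in the thread and draws without replacement by default.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the asker's data_T0: 50 rows, unique index.
data_T0 = pd.DataFrame({"VN": np.arange(1000, 1050), "TARGET": 0})

rng = np.random.default_rng(0)  # seeded only to make the demo repeatable

# Draw index labels without replacement, then select those rows.
labels = rng.choice(data_T0.index, size=10, replace=False)
picked = data_T0.loc[labels]

# Alternative: pandas' own sampler; replace defaults to False.
picked_alt = data_T0.sample(n=10, random_state=0)

print(picked.index.is_unique, picked_alt.index.is_unique)
```

With replace=False the requested size cannot exceed the number of rows available; NumPy raises ValueError in that case, so check len(data_T0) before asking for 10000 rows.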

Upvotes: 1
