Reputation: 3
I am struggling with an issue regarding CSV files and Python. How can I generate a random number in a CSV file row, based on a condition in that row?
Essentially, if the value in the first column is 'A', I want a random number between 1 and 5. If the value is 'B', I want a random number between 2 and 6, and if the value is 'C', a random number between 3 and 7.
Letter | Color | Random Number |
---|---|---|
A | Green | |
C | Red | |
B | Red | |
B | Green | |
C | Blue | |
Thanks in advance
The only approach I have found creates a new DataFrame of random numbers, but I need to add random numbers to an existing DataFrame.
Upvotes: -1
Views: 2147
Reputation: 77337
You can use numpy's random module, which works on pandas Series. First, create a Series that maps each letter to the starting value of its random range. Map that onto the "Letter" column and you'll have a Series of range start values. Pass that to numpy.random.randint to generate the new column. Its upper bound is exclusive, so l_code+5 yields values from l_code through l_code+4.
>>> l_map = pd.Series([1,2,3], index=['A', 'B', 'C'], name="mapped")
>>> l_map
A 1
B 2
C 3
Name: mapped, dtype: int64
>>> l_code = df["Letter"].map(l_map)
>>> l_code
0 1
1 3
2 2
3 2
4 3
Name: Letter, dtype: int64
>>> df["Random Number"] = np.random.randint(l_code, l_code+5)
>>> df
Letter Color Random Number
0 A Green 5
1 C Red 6
2 B Red 2
3 B Green 3
4 C Blue 3
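The same idea extends to ranges of different widths by mapping both bounds instead of only the start value. Below is a minimal sketch, assuming the table from the question is built inline. The `bounds` dict, the variable names, and the seed are illustrative and not part of the answer. It uses NumPy's newer `Generator.integers`, which accepts array bounds and an `endpoint=True` flag that makes the upper bound inclusive:

```python
import numpy as np
import pandas as pd

# Sample data from the question, built inline so the sketch is self-contained.
df = pd.DataFrame({
    "Letter": ["A", "C", "B", "B", "C"],
    "Color": ["Green", "Red", "Red", "Green", "Blue"],
})

# Hypothetical lookup: letter -> (low, high), both inclusive.
bounds = {"A": (1, 5), "B": (2, 6), "C": (3, 7)}

# Per-row lower and upper bounds, looked up from the "Letter" column.
low = df["Letter"].map({k: v[0] for k, v in bounds.items()}).to_numpy()
high = df["Letter"].map({k: v[1] for k, v in bounds.items()}).to_numpy()

rng = np.random.default_rng(seed=0)  # seed is only for reproducibility
# endpoint=True makes `high` inclusive; one value is drawn per row.
df["Random Number"] = rng.integers(low, high, endpoint=True)
print(df)
```

Because the bounds are arrays, each row gets its own independent draw within its own range.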
Upvotes: 0
Reputation: 1
Here is a simple way to do it without pandas. This program fills the third column of the CSV file with a random number chosen according to the letter in the first column (1 to 5 for A, 2 to 6 for B, 3 to 7 for C):
import csv
import random

# Inclusive [low, high] range for each letter
letters_randoms = {
    'A': [1, 5],
    'B': [2, 6],
    'C': [3, 7],
}

rows = []  # result
with open('file.csv', 'r', encoding='utf-8') as file:
    reader = csv.reader(file)
    rows.append(next(reader))  # keep the header row unchanged
    for row in reader:
        letter = row[0].upper()
        # random.randint includes both endpoints
        row[2] = random.randint(*letters_randoms[letter])
        rows.append(row)

# Overwrite the CSV file with the updated rows
with open('file.csv', 'w', newline='', encoding='utf-8') as file:
    writer = csv.writer(file)
    writer.writerows(rows)
Result (file.csv):
LETTER,COLOR,Random Number
A,Green,3
c,Red,5
B,Red,2
B,Green,2
c,Blue,5
A,Purple,5
B,Green,3
A,Orange,3
c,Black,4
c,Red,5
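For experimenting without overwriting a file, the same dictionary-lookup approach can be wrapped in a function that works on CSV text in memory. This is only a sketch under a few assumptions: the function name `fill_random_numbers`, the `RANGES` constant, and the choice to leave unknown letters blank are illustrative and not part of the answer.

```python
import csv
import io
import random

# Inclusive (low, high) range per letter, as described in the question.
RANGES = {"A": (1, 5), "B": (2, 6), "C": (3, 7)}

def fill_random_numbers(csv_text, ranges=RANGES):
    """Return CSV text with the third column filled based on the first column."""
    reader = csv.reader(io.StringIO(csv_text))
    rows = [next(reader)]  # keep the header row unchanged
    for row in reader:
        row += [""] * (3 - len(row))  # pad short rows so row[2] exists
        bounds = ranges.get(row[0].strip().upper())
        # Unknown letters are left blank instead of raising KeyError.
        row[2] = random.randint(*bounds) if bounds else ""
        rows.append(row)
    out = io.StringIO(newline="")
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()

sample = "Letter,Color,Random Number\nA,Green,\nC,Red,\nB,Red,\nZ,Blue\n"
result = fill_random_numbers(sample)
print(result)
```

Padding short rows guards against lines that lack a trailing comma, such as a row that ends after the color.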
Upvotes: 0
Reputation: 37737
One way is to use numpy.random.randint with numpy.select:
import pandas as pd
import numpy as np

df = pd.read_csv("inputfile.csv", sep=",")
# change the separator according to the actual format of your csv

categories = [df["Letter"].eq("A"),
              df["Letter"].eq("B"),
              df["Letter"].eq("C")]

# random.randint(low, high=None, size=None, dtype=int)
# size=n draws one value per row; a scalar draw would be reused
# for every row that shares the same letter
n = len(df)
choices = [np.random.randint(1, 5+1, size=n),  # high is exclusive
           np.random.randint(2, 6+1, size=n),  # high is exclusive
           np.random.randint(3, 7+1, size=n)]  # high is exclusive

# numpy.select(condlist, choicelist, default=0)
df["Random Number"] = np.select(categories, choices)
print(df)
Letter Color Random Number
0 A Green 5
1 C Red 6
2 B Red 5
3 B Green 5
4 C Blue 6
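To see how `numpy.select` pairs conditions with choices, here is a small sketch that uses fixed scalars instead of random draws. The letters array and the values 10, 20, 30 are made up for illustration. Each element takes the choice of the first condition it satisfies, and elements that match no condition get `default`:

```python
import numpy as np

letters = np.array(["A", "C", "B", "B", "D"])  # "D" matches no condition

conditions = [letters == "A", letters == "B", letters == "C"]
# Fixed scalars instead of random draws so the selection is easy to follow.
choices = [10, 20, 30]

picked = np.select(conditions, choices, default=0)
print(picked.tolist())  # [10, 30, 20, 20, 0]
```

A scalar choice is broadcast to every matching row, which is why per-row random values need array-valued choices (for example, `np.random.randint(low, high, size=len(df))`).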
If needed, you can use pandas.DataFrame.to_csv to generate a new .csv file:
df.to_csv("output_file.csv", sep=",", index=False)
Upvotes: 1