Reputation: 176
I have a .csv file that contains four columns and about 700000 rows. My problem is I can't access a specific cell, but only the whole row. My code:
import pandas as pd
data = pd.read_csv("example.csv")
entries = data["entry"].astype(str)
payments = data["payment_type"].astype(str)
origins = data["origin"].astype(str)
for row in entries:
    if row[26] == "Y":
        data["payment_type"] = "sample"
    if row[27] == "Y":
        data["payment_type"] = "Check Card"
    ...
For example, in the first row of the .csv file, I want to set the cell in the "origin" column according to the "entry" column of the same row. The script does this, but as written it sets the whole column to whatever the last entry produces. I think my problem is in the "for" loop, in how I access a specific row of a column.
Any help would be appreciated.
Thank you in advance.
Upvotes: 2
Views: 1152
Reputation: 343
You could use the np.where function and define rules for when to format the rows that match. If you have multiple rules and multiple conditions, you could use np.select.
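A minimal sketch of the np.select approach, assuming the "entry" strings are at least 28 characters long and that positions 26 and 27 are the flags being tested, as in the question. The small DataFrame and the "cash" placeholder value are made up for illustration and stand in for example.csv. np.select picks the first matching condition, so the later check from the loop is listed first to keep the "last match wins" behaviour.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data standing in for example.csv
data = pd.DataFrame({
    "entry": ["N" * 26 + "YN", "N" * 26 + "NY", "N" * 26 + "YY", "N" * 28],
    "payment_type": ["cash", "cash", "cash", "cash"],
})

entries = data["entry"].astype(str)

# Boolean masks, one value per row, built from single characters of "entry"
is_sample = entries.str[26] == "Y"
is_check_card = entries.str[27] == "Y"

# Rows matching no condition keep their existing payment_type
data["payment_type"] = np.select(
    [is_check_card, is_sample],
    ["Check Card", "sample"],
    default=data["payment_type"].to_numpy(dtype=object),
)

print(data["payment_type"].tolist())
```

This works on whole columns at once, so it avoids a Python-level loop over all 700000 rows.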
Upvotes: 2
Reputation: 184
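You are replacing the whole column on every iteration. You have to assign to the specific row of the column instead, using the row's index label:
import pandas as pd
data = pd.read_csv("example.csv")
entries = data["entry"].astype(str)
payments = data["payment_type"].astype(str)
origins = data["origin"].astype(str)
# items() yields (index label, value) pairs, so idx identifies the row
for idx, row in entries.items():
    if row[26] == "Y":
        # .loc[row label, column name] sets a single cell
        data.loc[idx, "payment_type"] = "sample"
    if row[27] == "Y":
        data.loc[idx, "payment_type"] = "Check Card"
    ...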
Upvotes: 1