Reputation: 193
I'm new to pandas, and I recently got a 'ValueError' when trying to modify some columns according to a few rules, like this:
csv_input = pd.read_csv(fn, error_bad_lines=False)
if csv_input['ip.src'] == '192.168.1.100':
    csv_input['flow_dir'] = 1
    csv_input['ip.src'] = 1
    csv_input['ip.dst'] = 0
else:
    if csv_input['ip.dst'] == '192.168.1.100':
        csv_input['flow_dir'] = 0
        csv_input['ip.src'] = 0
        csv_input['ip.dst'] = 1
I searched for this error and I guess it's caused by the 'if' statement and the '==' operator, but I don't know how to fix it.
Thanks!
Upvotes: 1
Views: 672
Reputation: 2293
So Andrew L's comment is correct, but I'm going to expand on it a bit for your benefit.
When you evaluate, e.g.
csv_input['ip.dst'] == '192.168.1.100'
what you get back is a Series with the same index as csv_input, whose values are all boolean and indicate whether the value in csv_input['ip.dst'] for that row is equal to '192.168.1.100'.
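To see this concretely, here is a minimal sketch on a made-up three-row frame. The sample addresses other than '192.168.1.100' are assumptions for illustration, not values from your CSV.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV contents
csv_input = pd.DataFrame({
    'ip.src': ['192.168.1.100', '10.0.0.5', '10.0.0.7'],
    'ip.dst': ['10.0.0.5', '192.168.1.100', '10.0.0.9'],
})

# Element-wise comparison: one boolean per row, same index as csv_input
mask = csv_input['ip.dst'] == '192.168.1.100'
print(mask.tolist())  # [False, True, False]
```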
So, when you write
if csv_input['ip.dst'] == '192.168.1.100':
you're asking whether that whole Series evaluates to True or False. Hopefully that explains what the error message means by "The truth value of a Series is ambiguous": a Series holds many booleans, so it can't be boiled down to a single one unless you say how (for example with .any() or .all()).
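The error is easy to reproduce in isolation. The sketch below uses a hypothetical hand-built boolean Series rather than your data, and shows the two explicit reductions you could use if you really did want a single yes/no answer for the whole column.

```python
import pandas as pd

# Hypothetical boolean Series, like the result of an == comparison
mask = pd.Series([False, True, False])

message = None
try:
    if mask:  # pandas refuses to pick a single truth value here
        pass
except ValueError as err:
    message = str(err)
    print(message)

# Explicit reductions state which single boolean you actually mean
any_match = bool(mask.any())  # True if at least one row matches
all_match = bool(mask.all())  # True only if every row matches
```

Neither reduction is what you want for row-by-row updates, though; for that you keep the full mask and use it to select rows.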
Now, it looks like what you're trying to do is set the values in the flow_dir, ip.src & ip.dst columns, row by row, based on the values in the ip.src and ip.dst columns.
The correct way to do this would be with .loc[], something like this:
# equivalent to the first if statement
# (note == for comparison; a single = here is a syntax error)
csv_input.loc[
    csv_input['ip.src'] == '192.168.1.100',
    ['ip.src', 'ip.dst', 'flow_dir']
] = [1, 0, 1]
# equivalent to the second if statement
csv_input.loc[
    csv_input['ip.dst'] == '192.168.1.100',
    ['ip.src', 'ip.dst', 'flow_dir']
] = [0, 1, 0]
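For completeness, here is a self-contained sketch of the same idea that you can run without your CSV. The sample rows and the names local_ip, src_is_local, dst_is_local and cols are made up for illustration. It assumes the columns are object dtype so they can hold both the original strings and the 0/1 flags, and it creates flow_dir up front rather than relying on .loc to add it.

```python
import pandas as pd

# Hypothetical sample data standing in for the CSV contents;
# object dtype lets a column hold both strings and the 0/1 flags
csv_input = pd.DataFrame({
    'ip.src': ['192.168.1.100', '10.0.0.5', '10.0.0.7'],
    'ip.dst': ['10.0.0.5', '192.168.1.100', '10.0.0.9'],
}, dtype=object)
csv_input['flow_dir'] = None  # create the target column explicitly

local_ip = '192.168.1.100'  # address taken from the question

# Build both masks before assigning, because the assignments overwrite
# the very columns the masks are computed from
src_is_local = csv_input['ip.src'] == local_ip
# "& ~src_is_local" mirrors the else branch of the original code
dst_is_local = (csv_input['ip.dst'] == local_ip) & ~src_is_local

cols = ['ip.src', 'ip.dst', 'flow_dir']
csv_input.loc[src_is_local, cols] = [1, 0, 1]
csv_input.loc[dst_is_local, cols] = [0, 1, 0]

print(csv_input)
```

Rows that match neither condition (the third row here) are left untouched, which is also what the original if/else would have done for them.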
Upvotes: 0