Reputation: 3001
I have a DataFrame that looks like this:
index | in | out | time
7 | 8 | 8 | 232
11 | 3 | 0 | 0
79 | 0 | 8 | 12
I want to create a new DataFrame from this one in which every non-zero in/out value is set to 1 (they are all positive). The time and index columns should stay the same:
index | in | out | time
7 | 1 | 1 | 232
11 | 1 | 0 | 0
79 | 0 | 1 | 12
I think there should be a faster way than what I am doing now:
df2 = pd.DataFrame({"index":[], "in":[], "out":[], "time":[]})
for index, row in df.iterrows():
    if row["in"] == 0:
        in_val = 0
    else:
        in_val = 1
    if row["out"] == 0:
        out_val = 0
    else:
        out_val = 1
    time = row["time"]
    df2 = df2.append(pd.DataFrame({"index":[index], "in":[in_val], "out":[out_val], "time":[time]}), sort=False)
Can I use some lambda function or something like a list comprehension to convert the dataframe faster?
Upvotes: 2
Views: 109
Reputation: 141
You can try
df['in'] = [1 if i>0 else 0 for i in list(df['in'])]
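A self-contained sketch of this approach applied to both columns, using the sample data from the question. The frame construction and the loop over column names are added here for illustration; `!= 0` is used in place of `> 0`, which should be equivalent as long as the values are non-negative.

```python
import pandas as pd

# Sample data from the question, rebuilt so the snippet runs on its own.
df = pd.DataFrame({"index": [7, 11, 79],
                   "in": [8, 3, 0],
                   "out": [8, 0, 8],
                   "time": [232, 0, 12]})

# Replace every non-zero value with 1; "index" and "time" are left untouched.
for col in ["in", "out"]:
    df[col] = [1 if v != 0 else 0 for v in df[col]]

print(df)
```

Iterating over the column directly avoids the extra `list(...)` copy, although the list comprehension still runs in Python rather than in vectorized code.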
Upvotes: 0
Reputation: 11927
So you have a dataframe like this,
index in out time
0 7 8 8 232
1 11 3 0 0
2 79 0 8 12
Use np.where to get the desired result like this,
df['in'] = np.where(df['in'] > 0, 1, 0)
df['out'] = np.where(df['out'] > 0, 1, 0)
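For a version that can be pasted and run directly, here is a minimal sketch with the imports and the sample frame included. The frame is rebuilt from the question's data; nothing else is assumed.

```python
import numpy as np
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"index": [7, 11, 79],
                   "in": [8, 3, 0],
                   "out": [8, 0, 8],
                   "time": [232, 0, 12]})

# np.where(condition, value_if_true, value_if_false) is evaluated element-wise,
# so each positive entry becomes 1 and everything else becomes 0.
df["in"] = np.where(df["in"] > 0, 1, 0)
df["out"] = np.where(df["out"] > 0, 1, 0)

print(df)
```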
Upvotes: 0
Reputation: 75100
Alternatively, you can use astype to convert to boolean and multiply by 1:
cols=['in','out']
df[cols]=df[cols].astype(bool)*1
index in out time
0 7 1 1 232
1 11 1 0 0
2 79 0 1 12
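To see why this works, the two steps can be split apart. This is only an illustrative sketch; the intermediate names `as_bool`, `as_int` and `as_int_alt` are hypothetical and not part of the answer.

```python
import pandas as pd

# Sample data from the question.
df = pd.DataFrame({"index": [7, 11, 79],
                   "in": [8, 3, 0],
                   "out": [8, 0, 8],
                   "time": [232, 0, 12]})
cols = ["in", "out"]

# Step 1: 0 becomes False, any other number becomes True.
as_bool = df[cols].astype(bool)

# Step 2: multiplying by 1 turns True/False into 1/0.
as_int = as_bool * 1

# Casting the boolean frame to int should give the same values.
as_int_alt = as_bool.astype(int)

df[cols] = as_int
print(df)
```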
Upvotes: 1
Reputation: 3739
Use np.where():
import numpy as np
import pandas as pd

df = pd.DataFrame(data={"in": [8, 3, 0],
                        "out": [8, 0, 8],
                        "time": [232, 0, 12]})
df[['in','out']] = np.where(df[['in','out']] == 0, 0, 1)
in out time
0 1 1 232
1 1 0 0
2 0 1 12
Upvotes: 0
Reputation: 863166
Use numpy.where with a list of columns:
cols = ['in','out']
df[cols] = np.where(df[cols].eq(0), 0, 1)
Or build a boolean mask of the non-zero values and cast it to integers:
df[cols] = df[cols].ne(0).astype(int)
If there are no negative values (and the values are integers), use DataFrame.clip:
df[cols] = df[cols].clip(upper=1)
print (df)
index in out time
0 7 1 1 232
1 11 1 0 0
2 79 0 1 12
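The restriction on clip matters because clip only caps values from above. The sketch below uses hypothetical data containing a negative number (not from the question) to show where clip and the not-equal mask diverge.

```python
import pandas as pd

# Hypothetical data with one negative value, for illustration only.
sample = pd.DataFrame({"in": [8, -3, 0], "out": [8, 0, 8]})

# clip(upper=1) caps anything above 1 but leaves smaller values alone.
clipped = sample.clip(upper=1)

# ne(0).astype(int) flags every non-zero value, whatever its sign.
flagged = sample.ne(0).astype(int)

print(clipped["in"].tolist())  # [1, -3, 0]
print(flagged["in"].tolist())  # [1, 1, 0]
```

With the question's data, which is stated to be all positive, both variants should produce the same result.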
Upvotes: 4