Wendy

Reputation: 11

Replace certain numbers in a DataFrame

I am pretty new to Python programming and have a question about conditionally replacing numbers in a DataFrame. For example, I have a DataFrame with 5 days of data, one column per day: day1, day2, day3, day4 and day5. Each day has 5 data points, some of which are larger than 5. I want to set every value that is larger than 5 to 1. How can I do that? Should I loop over each column, find the matching elements and change them, or is there a faster way? Thanks.

Upvotes: 1

Views: 2138

Answers (2)

brennan

Reputation: 3493

This iterates over the data in each column and changes values greater than 5 to 1. Iterating by rows instead of columns is also an option with iterrows, but it's generally slower.

import pandas as pd


data = {'day1' : pd.Series([1, 2, 3]),
        'day2' : pd.Series([1, 4, 6]),
        'day3' : pd.Series([5, 4, 3]),
        'day4' : pd.Series([2, 4, 6]),
        'day5' : pd.Series([7, 3, 2])}

df = pd.DataFrame(data)


for col in df.columns:
    df[col] = [x if x <= 5 else 1 for x in df[col]]


Upvotes: 0

Mr.F

Reputation: 936

To do this without looping (which is usually faster) you can do:

df[df > 5] = 1
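For anyone who wants to try this end to end, here is a minimal sketch. It assumes the same sample values as the other answer; the names mask, in_place and replaced are only illustrative. It shows the boolean-mask assignment next to DataFrame.mask, which should give the same values without modifying the original frame.

```python
import pandas as pd

# Sample data (assumed; same values as in the other answer).
df = pd.DataFrame({
    "day1": [1, 2, 3],
    "day2": [1, 4, 6],
    "day3": [5, 4, 3],
    "day4": [2, 4, 6],
    "day5": [7, 3, 2],
})

# Boolean DataFrame: True wherever a value is larger than 5.
mask = df > 5

# Assignment through the mask, done on a copy so df stays untouched.
in_place = df.copy()
in_place[mask] = 1

# Non-mutating alternative: mask() returns a new DataFrame with
# the values replaced where the condition is True.
replaced = df.mask(mask, 1)

print(replaced)
```

The comparison df > 5 is evaluated for the whole frame at once, so no explicit Python loop over columns or elements is needed.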

Upvotes: 1
