shankar ram

Reputation: 177

Iterate over pandas dataframe and update values in columns for a specific condition

I have a pandas dataframe where checkdataframe.shape is (68125, 109). I want to perform an operation on all of its columns, like the one I wrote below for a single list.

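import numpy as np

def alter_column(column, batchSize=10):
    # Keep the first value of every batch of batchSize items; replace the rest with NaN.
    return_list = []
    for idx, value in enumerate(column):
        if (idx + 1) % batchSize == 1:
            return_list.append(value)
        else:
            return_list.append(np.nan)
    return return_list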

This returns a list in which only the first value of every block of 10 is kept and the rest are replaced with nan, like this output:

['175,5200',nan,nan,nan,nan,nan,nan,nan,nan,nan,'175,5200',nan,nan,nan,nan,nan,nan,nan,nan,nan,'180,0000']

I want to do this over the entire dataframe. I tried df.iteritems and df.iterrows, but I get an error. Is there a possible solution or another way to do it?

e.g.:
df['column1']=[1,1,1,1,1,1,1,1,1,1,2,2,2,2,2,2,2,2,2,2,2]
df['column2']=[3,3,3,3,3,3,3,3,3,3,4,4,4,4,4,4,4,4,4,4,4]

expected output:
column1=['1',nan,nan,nan,nan,nan,nan,nan,nan,nan,'2',nan,nan,nan,nan,nan,nan,nan,nan,nan]
column2=['3',nan,nan,nan,nan,nan,nan,nan,nan,nan,'4',nan,nan,nan,nan,nan,nan,nan,nan,nan]

My real dataset, however, has 109 columns.

Upvotes: 0

Views: 80

Answers (1)

Hugolmn

Reputation: 1560

If the index of your dataframe is the default integer range 0 .. n-1, you can apply this:

df[~df.index.isin(np.arange(0, df.shape[0], batchSize))] = np.nan

This way only every 10th row (rows 0, 10, 20, ...) keeps its values, and every other row is set to np.nan in all columns at once.
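The one-liner above assumes that batchSize is already defined and that the index labels match the row positions. As a rough sketch of the same idea that does not depend on the index labels, the mask can be built from row positions instead. The helper name keep_every_nth_row and the small float-valued sample frame are assumptions for illustration, not part of the original question or answer:

```python
import numpy as np
import pandas as pd


def keep_every_nth_row(frame, batch_size=10):
    """Return a copy in which only rows at positions 0, batch_size, ... keep their values."""
    out = frame.copy()
    # True for every row position that is NOT the first of its batch.
    blank = np.arange(len(out)) % batch_size != 0
    out.iloc[blank] = np.nan
    return out


# Sample data modelled on the question; floats so NaN fits the column dtype.
df = pd.DataFrame({
    "column1": [1.0] * 10 + [2.0] * 11,
    "column2": [3.0] * 10 + [4.0] * 11,
})

result = keep_every_nth_row(df, batch_size=10)

# Same data with a non-default index (labels 100..120): positions still drive the mask.
shifted = keep_every_nth_row(df.set_index(pd.Index(range(100, 121))), batch_size=10)
```

Because the mask selects whole rows, every column is handled in one assignment, so a frame with 109 columns needs no explicit loop over columns.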

Upvotes: 1
