Reputation: 373
My DataFrame has columns like df.columns = ['Time1', 'Pmpp1', 'Time2', ..., 'Pmpp96'].
I want to select two successive columns at a time, for example Time1 and Pmpp1 together.
My code is:
for i, j in zip(df.columns, df.columns[1:]):
    print(i, j)
My present output is:
Time1 Pmpp1
Pmpp1 Time2
Time2 Pmpp2
Expected output is:
Time1 Pmpp1
Time2 Pmpp2
Time3 Pmpp3
Upvotes: 3
Views: 143
Reputation: 373
After a series of trials, I got it. My code is given below:
for a in range(0, len(df.columns), 2):
    print(df.columns[a], df.columns[a + 1])
My output is:
DateTime A016.Pmp_ref
DateTime.1 A024.Pmp_ref
DateTime.2 A040.Pmp_ref
DateTime.3 A048.Pmp_ref
DateTime.4 A056.Pmp_ref
DateTime.5 A064.Pmp_ref
DateTime.6 A072.Pmp_ref
DateTime.7 A080.Pmp_ref
DateTime.8 A096.Pmp_ref
DateTime.9 A120.Pmp_ref
DateTime.10 A124.Pmp_ref
DateTime.11 A128.Pmp_ref
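If the goal is to work with the data in each pair rather than only print the names, positional slicing with iloc is one option. A minimal sketch, assuming the columns really do alternate time/value; the toy frame and the helper name iter_column_pairs are made up for illustration:

```python
import pandas as pd

def iter_column_pairs(frame):
    """Yield two-column sub-DataFrames taken from successive column positions.

    Hypothetical helper for illustration, not part of pandas.
    """
    for start in range(0, frame.shape[1], 2):
        # A positional slice does not raise on a short tail,
        # unlike indexing columns[start + 1] directly
        yield frame.iloc[:, start:start + 2]

# Toy data standing in for the real measurements
df = pd.DataFrame({
    "Time1": [0, 1], "Pmpp1": [10.0, 11.0],
    "Time2": [0, 1], "Pmpp2": [20.0, 21.0],
})

pairs = list(iter_column_pairs(df))
for pair in pairs:
    print(list(pair.columns))
```

With an odd number of columns the last chunk would simply hold one column, whereas df.columns[a + 1] would raise an IndexError.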
Upvotes: 0
Reputation: 164773
As an alternative to integer positional slicing, you can use str.startswith to create 2 Index objects, then use zip to iterate over them pairwise:
import pandas as pd

df = pd.DataFrame(columns=['Time1', 'Pmpp1', 'Time2', 'Pmpp2', 'Time3', 'Pmpp3'])
# Boolean masks on the column names pick out each family of columns
times = df.columns[df.columns.str.startswith('Time')]
pmpps = df.columns[df.columns.str.startswith('Pmpp')]
for i, j in zip(times, pmpps):
    print(i, j)
Time1 Pmpp1
Time2 Pmpp2
Time3 Pmpp3
Upvotes: 1
Reputation: 1818
In this kind of scenario, it might make sense to reshape your DataFrame. So instead of selecting two columns at a time, you have a DataFrame with the two columns that ultimately represent your measurements.
First, you make a list of DataFrames, where each one only has a Time and Pmpp column:
dfs = []
for i in range(1, 97):
    # .copy() avoids a SettingWithCopyWarning when 'n' is added below
    tmp = df[['Time{0}'.format(i), 'Pmpp{0}'.format(i)]].copy()
    tmp.columns = ['Time', 'Pmpp']  # Standardize column names
    tmp['n'] = i                    # Remember measurement number
    dfs.append(tmp)                 # Collect the cleaned DataFrames
Then you can concatenate them into a new DataFrame with three columns:
new_df = pd.concat(dfs, ignore_index=True, sort=False)
This should be a much more manageable shape for your data.
>>> list(new_df.columns)
['Time', 'Pmpp', 'n']
Now you can iterate through the rows of this DataFrame and get the values for your expected output:
for i, row in new_df.iterrows():
    print(i, row.n, row.Time, row.Pmpp)
It will also make it easier to use the rest of pandas to analyze your data:
new_df.Pmpp.mean()
new_df.describe()
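pandas also provides pd.wide_to_long, which can do a similar reshape in one call. This is a different technique from the loop above, sketched here on a toy frame; the 'row' column is an assumption added for the example, because wide_to_long needs a column that uniquely identifies each original row:

```python
import pandas as pd

# Toy wide frame with two measurement pairs (the real one has 96)
df = pd.DataFrame({
    "Time1": [0, 1], "Pmpp1": [10.0, 11.0],
    "Time2": [0, 1], "Pmpp2": [20.0, 21.0],
})

# 'row' is an assumed helper column identifying each original row
wide = df.rename_axis("row").reset_index()

# The stub names 'Time' and 'Pmpp' are split from their numeric suffix,
# and the suffix is stored in the new column 'n'
long_df = pd.wide_to_long(
    wide, stubnames=["Time", "Pmpp"], i="row", j="n"
).reset_index()

print(long_df)
```

The result has one row per original row and measurement number, with the same Time, Pmpp and n columns as the loop-based version.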
Upvotes: 0
Reputation: 3971
You're zipping the list with the same list starting from the second element, which is not what you want. You want to zip the elements at even positions with the elements at odd positions. For example, you could replace your code with:
for i, j in zip(df.columns[::2], df.columns[1::2]):
    print(i, j)
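To see why the two slices line up, it can help to run them on a small Index. A sketch with made-up column names:

```python
import pandas as pd

cols = pd.Index(["Time1", "Pmpp1", "Time2", "Pmpp2", "Time3", "Pmpp3"])

evens = list(cols[::2])    # positions 0, 2, 4
odds = list(cols[1::2])    # positions 1, 3, 5

# zip pairs the n-th even-position name with the n-th odd-position name
pairs = list(zip(evens, odds))
print(pairs)
```

Note that zip stops at the shorter input, so a trailing unpaired column would be dropped silently rather than raising an error.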
Upvotes: 5