Reputation: 367
I have a function, get_differences, whose output is a dictionary like the one below.
The numbers are not relevant here; this is just an example of the output the function generates:
get_differences(column = 'column_A', percent = 10)
{'Feature': 'column_A',
'Pos_obs_10%': -0.98,
'Pos_obs_target': 1,
'Pos_obs_-10%': -1.23}
To get a pandas DataFrame covering all columns, I did this:
full_output = []
for col in df.columns:
    output = get_differences(column = col, percent = 10)
    full_output.append(output)
df_output = pd.DataFrame(full_output)
After running this code, my results look like this:
Feature Pos_obs_-10% Pos_obs_target Pos_obs_10%
0 column_A -0.98 -1.96 -0.98
1 column_B -0.23 0.00 0.55
2 column_C 1.55 -2.94 4.90
3 column_D -0.23 0.98 -0.98
This is correct. However, I would like to get the results of this function in a pandas DataFrame for every column and for a range of percent values, for example 10, 50 and 100%.
My desired output is:
Feature Pos_obs_-100% Pos_obs_-50% Pos_obs_-10% Pos_obs_target Pos_obs_10% Pos_obs_50% Pos_obs_100%
0 column_A -0.98 -1.96 -0.98 -0.98 -1.96 -0.98 -0.98
1 column_B -0.23 0.00 0.55 -0.98 -1.96 -0.98 -0.98
2 column_C 1.55 -2.94 4.90 -0.98 -1.96 -0.98 -0.98
3 column_D -0.23 0.98 -0.98 -0.98 -1.96 -0.98 -0.98
The numbers here are also random, just to show an example of the output. When I tried a loop like this:
percentage = range(1,5)
full_output = []
for n in percentage:
    for col in df.columns:
        output = get_differences(column = col, percent = n)
        full_output.append(output)
df_output = pd.DataFrame(full_output)
I got a lot of NaN values in the DataFrame, and the columns were repeated, something like this:
Feature Pos_obs_-100% Pos_obs_-50% Pos_obs_-10% Pos_obs_target Pos_obs_10% Pos_obs_50% Pos_obs_100%
0 column_A 0.00 NaN -0.98 -0.98 -1.96 NaN -0.98
1 column_B -2.96 NaN 0.55 -0.98 -1.96 NaN -0.98
2 column_C 0.00 NaN 4.90 -0.98 -1.96 NaN -0.98
3 column_D -0.98 NaN -0.98 -0.98 -1.96 NaN -0.98
4 column_A -0.98 -0.12 NaN -0.98 NaN -0.98 -0.98
5 column_B -0.23 0.55 NaN -0.98 NaN -0.98 -0.98
6 column_C 1.55 4.90 NaN -0.98 NaN -0.98 -0.98
7 column_D -0.23 -0.98 NaN -0.98 NaN -0.98 -0.98
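A minimal sketch of where the NaN values probably come from, assuming each call returns keys that depend on the percent value (the two rows below are hypothetical, shaped like the get_differences output shown earlier). When pd.DataFrame is given a list of dictionaries, it uses the union of all keys as columns, so every row is NaN in the columns that belong to a different percent value:

```python
import pandas as pd

# Hypothetical rows, shaped like get_differences output for two percent values.
rows = [
    {"Feature": "column_A", "Pos_obs_10%": -0.98, "Pos_obs_target": 1.0, "Pos_obs_-10%": -1.23},
    {"Feature": "column_A", "Pos_obs_50%": -0.12, "Pos_obs_target": 1.0, "Pos_obs_-50%": 0.55},
]

# The columns are the union of all keys; a row that lacks a key gets NaN there.
df_demo = pd.DataFrame(rows)
print(df_demo)
```

The same feature shows up once per percent value, which would explain both the repeated rows and the NaN pattern.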
Upvotes: 1
Views: 816
Reputation: 863246
Create a DataFrame from the inner loop's results for each percentage, append it to another list, and finally use concat:
percentage = range(1,5)
dfs = []
for n in percentage:
    L = []
    for col in df.columns:
        output = get_differences(column = col, percent = n)
        L.append(output)
    dfs.append(pd.DataFrame(L))
df_output = pd.concat(dfs, axis=1)
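Note that with axis=1 every per-percentage DataFrame contributes its own Feature and Pos_obs_target columns, so these may appear several times in the result. Below is a self-contained sketch of one possible way to handle that. The input frame and the get_differences stand-in are hypothetical dummies (the real function is not shown in the question), and the sketch assumes Pos_obs_target does not depend on percent:

```python
import pandas as pd

# Hypothetical input frame and stand-in for get_differences (dummy values).
df = pd.DataFrame({"column_A": [1, 2], "column_B": [3, 4]})

def get_differences(column, percent):
    return {
        "Feature": column,
        f"Pos_obs_{percent}%": 0.0,
        "Pos_obs_target": 1.0,
        f"Pos_obs_-{percent}%": 0.0,
    }

percentages = [10, 50, 100]
dfs = []
for n in percentages:
    rows = [get_differences(column=col, percent=n) for col in df.columns]
    # Index by Feature so concat aligns rows by feature name, not by position.
    dfs.append(pd.DataFrame(rows).set_index("Feature"))

wide = pd.concat(dfs, axis=1)
# Pos_obs_target appears once per percentage; keep only the first copy.
wide = wide.loc[:, ~wide.columns.duplicated()].reset_index()
print(wide)
```

The columns come out in the order they were produced, so if a specific order is wanted (for example from -100% to 100%), the result can be reindexed with an explicit column list afterwards.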
Upvotes: 1