Add a suffix to a dataframe called from a dictionary

Question

I am trying to add a suffix to the columns of dataframes that I look up from a dictionary.

Here is some sample code:

import pandas as pd
import numpy as np
from collections import OrderedDict
from itertools import chain

# defining stuff
num_periods_1 = 11
num_periods_2 = 4
num_periods_3 = 5

# create sample time series
dates1 = pd.date_range('1/1/2000 00:00:00', periods=num_periods_1, freq='10min')
dates2 = pd.date_range('1/1/2000 01:30:00', periods=num_periods_2, freq='10min')
dates3 = pd.date_range('1/1/2000 02:00:00', periods=num_periods_3, freq='10min')

# column_names = ['WS Avg','WS Max','WS Min','WS Dev','WD Avg']
# column_names = ['A','B','C','D','E']
column_names_1 = ['C', 'B', 'A']
column_names_2 = ['B', 'C', 'D']
column_names_3 = ['E', 'B', 'C']

df1 = pd.DataFrame(np.random.randn(num_periods_1, len(column_names_1)), index=dates1, columns=column_names_1)
df2 = pd.DataFrame(np.random.randn(num_periods_2, len(column_names_2)), index=dates2, columns=column_names_2)
df3 = pd.DataFrame(np.random.randn(num_periods_3, len(column_names_3)), index=dates3, columns=column_names_3)

sep0 = '<~>'
suf1 = '_1'
suf2 = '_2'
suf3 = '_3'

ddict = {'df1': df1, 'df2': df2, 'df3': df3}
frames_to_concat = {'Sheets': ['df1', 'df3']}

Suffs = {'Suffixes': ['Suffix 1', 'Suffix 2', 'Suffix 3']}
Suff = {'Suffix 1': suf1, 'Suffix 2': suf2, 'Suffix 3': suf3}

## apply a suffix to each selected data frame, in order, HERE
# Suffdict = [Suff[x] for x in Suffs['Suffixes']]
# print(Suffdict)

df4 = pd.concat([ddict[x] for x in frames_to_concat['Sheets']],
                axis=1,
                join='outer')

I want to add a suffix to each dataframe's columns so that they can be distinguished once the dataframes are concatenated. I am having trouble looking up the suffixes and applying them to each dataframe. Here I have selected df1 and df3 for concatenation, and I would like suffix 1 applied to df1 and suffix 2 applied to df3.

The suffixes are assigned by position and are not tied to a particular dataframe: if df2 and df3 were selected, suffix 1 would be applied to df2 and suffix 2 to df3. The last suffix would simply go unused.

cs95 · Accepted Answer

Dictionaries only preserve insertion order from Python 3.6 onward: it is an implementation detail of CPython 3.6 and a language guarantee from Python 3.7. Code that relies on it will not behave reliably on older versions. If you need order, you should be looking at lists instead.
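In the question's setup, the selection order already comes from a list (the 'Sheets' entry), not from the dict itself. A minimal sketch, with placeholder strings standing in for the dataframes and the dict deliberately built in a different order:

```python
# Placeholder strings stand in for the DataFrames (illustration only).
ddict = {"df3": "frame three", "df1": "frame one", "df2": "frame two"}
frames_to_concat = {"Sheets": ["df1", "df3"]}

# The result follows the order of the 'Sheets' list, not the order of ddict.
selected = [ddict[name] for name in frames_to_concat["Sheets"]]
print(selected)  # ['frame one', 'frame three']
```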

You can store your dataframes as well as your suffixes in a list, and then use zip to add a suffix to each df in turn.

dfs = [df1, df2, df3]
sufs = [suf1, suf2, suf3]

df_sufs = [x.add_suffix(y) for x, y in zip(dfs, sufs)]
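To see what add_suffix does to the column labels, here is a small self-contained sketch. The two-row frames and their values are made up for illustration; only the column names follow the question.

```python
import pandas as pd

# Hypothetical two-row frames standing in for df1, df2, df3.
dfs = [
    pd.DataFrame({"C": [1, 2], "B": [3, 4], "A": [5, 6]}),
    pd.DataFrame({"B": [1, 2], "C": [3, 4], "D": [5, 6]}),
    pd.DataFrame({"E": [1, 2], "B": [3, 4], "C": [5, 6]}),
]
sufs = ["_1", "_2", "_3"]

# add_suffix returns a new frame with renamed columns;
# the original frames keep their column names.
df_sufs = [df.add_suffix(suf) for df, suf in zip(dfs, sufs)]

renamed_columns = [list(df.columns) for df in df_sufs]
print(renamed_columns)
# [['C_1', 'B_1', 'A_1'], ['B_2', 'C_2', 'D_2'], ['E_3', 'B_3', 'C_3']]
```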

Based on your code/answer, you can load your dataframes and suffixes into lists, call zip, add a suffix to each one, and call pd.concat.

dfs = [ddict[x] for x in frames_to_concat['Sheets']]
sufs = [Suff[x] for x in Suffs['Suffixes']]

df4 = pd.concat([x.add_suffix(sep0 + y) 
          for x, y in zip(dfs, sufs)], axis=1, join='outer')
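Put together as a runnable sketch, using the question's names (ddict, frames_to_concat, Suff, Suffs, sep0) but small made-up frames with overlapping time indexes. Note that zip stops at the shorter input, so with two selected frames and three suffixes the third suffix is simply never used.

```python
import numpy as np
import pandas as pd

# Small stand-in frames (values are arbitrary, for illustration only).
idx1 = pd.date_range("2000-01-01 00:00", periods=3, freq="10min")
idx2 = pd.date_range("2000-01-01 00:10", periods=3, freq="10min")
idx3 = pd.date_range("2000-01-01 00:20", periods=3, freq="10min")
values = np.arange(9).reshape(3, 3)
df1 = pd.DataFrame(values, index=idx1, columns=["C", "B", "A"])
df2 = pd.DataFrame(values, index=idx2, columns=["B", "C", "D"])
df3 = pd.DataFrame(values, index=idx3, columns=["E", "B", "C"])

sep0 = "<~>"
ddict = {"df1": df1, "df2": df2, "df3": df3}
frames_to_concat = {"Sheets": ["df1", "df3"]}
Suffs = {"Suffixes": ["Suffix 1", "Suffix 2", "Suffix 3"]}
Suff = {"Suffix 1": "_1", "Suffix 2": "_2", "Suffix 3": "_3"}

# Both lookups go through lists, so the pairing is positional.
dfs = [ddict[name] for name in frames_to_concat["Sheets"]]
sufs = [Suff[name] for name in Suffs["Suffixes"]]

# zip pairs df1 with '_1' and df3 with '_2'; '_3' is left over.
df4 = pd.concat(
    [df.add_suffix(sep0 + suf) for df, suf in zip(dfs, sufs)],
    axis=1,
    join="outer",
)

print(list(df4.columns))
# ['C<~>_1', 'B<~>_1', 'A<~>_1', 'E<~>_2', 'B<~>_2', 'C<~>_2']
```

With join='outer', rows whose timestamp appears in only one of the frames are kept, and the other frame's columns are filled with NaN for those rows.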
