Lorzen
Lorzen

Reputation: 185

Creating multiindex header in Pandas

I have a data frame in form of a time series looking like this:

Table 1: original DataFrame

and a second table with additional information to the according column(names) like this:

Table 2: additional information

Now, I want to combine the two, adding specific information from the second table into the header of the first one. With a result like this:

Table 3: final DataFrame

I have the feeling the solution to this is quite trivial, but somehow I just cannot get my head around it. Any help/suggestions/hints on how to approach this?

MWE to create to tables:

import pandas as pd
df = pd.DataFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]],columns=['a', 'b', 'c'])
df2 = pd.DataFrame([['a','b','c'],['a_desc','b_desc','c_desc'],['a_unit','b_unit','c_unit']]).T
df2.columns=['MSR','OBJDESC','UNIT']

Upvotes: 0

Views: 383

Answers (1)

Matt
Matt

Reputation: 1284

You could get a metadata dict for each of the original column names and then update the original df

# store the column metadata you want in the header here
header_metadata = {}

# loop through your second df
for i, row in df2.iterrows():
    # get the column name that this corresponds to
    column_name = row.pop('MSR')
    
    # we don't want `scale` metadta
    row.pop('SCALE')
    
    # we will want to add the data in dict(row) to our first df
    header_metadata[column_name] = dict(row)

# rename the columns of your first df
df1.columns = (
    '\n'.join((c, *header_metadata[c]))
    for c in df1.columns
)

Upvotes: 1

Related Questions