Concatenating header list to Dataframe in pandas

Question

I am having trouble combining 2 simple DataFrames. I first load one .txt file containing the data set, and then another one containing the header (column names) for that data set.

First I load the 2 files into DataFrames:

df = pd.read_csv(file_dir + file_name, sep = ',', header = None, encoding = 'latin-1', low_memory = False)
df_column_names = pd.read_csv(file_dir + file_name_cols, sep = ',', header = None, encoding = 'latin-1', low_memory = False)

Afterwards, I create a list from the header DataFrame by first transposing the table and then converting it into a list:

list_names = df_column_names.T.values.tolist()

Finally, I try to assign the list as the column names of the data DataFrame:

df.columns = list_names

But I receive the following error message:

ValueError: Length mismatch: Expected axis has 26 elements, new values have 1 elements

The dimensions of my objects are: df of size (204,26) and type DataFrame, df_column_names is size (1,26) and type DataFrame, list_names is size 26 and type list.

After reading other threads, the most similar ones were here and here. Nevertheless, after checking the indexes of my two DataFrames, both seem OK:

In [4]: print(df.index)
RangeIndex(start=0, stop=205, step=1)

In [5]: print(df_column_names.index)
RangeIndex(start=0, stop=1, step=1)

In [6]: len(list_names)
Out[6]: 26

list_names looks like this:

In [7]: list_names
Out[7]: 
[['symboling'],
 ['normalized-losses'],
 ['make'],
 ['fuel-type'],
 ['aspiration'],
 ['num-of-doors'],
 ['body-style'],
 ['drive-wheels'],
 ['engine-location'],
 ['wheel-base'],
 ['length'],
 ['width'],
 ['height'],
 ['curb-weight'],
 ['engine-type'],
 ['num-of-cylinders'],
 ['engine-size'],
 ['fuel-system'],
 ['bore'],
 ['stroke'],
 ['compression-ratio'],
 ['horsepower'],
 ['peak-rpm'],
 ['city-mpg'],
 ['highway-mpg'],
 ['price']]

Thanks in advance for your help and advice.

jpp · Accepted Answer

Your list_names is a list of lists: each of its 26 elements is itself a one-element list such as ['symboling']. Assigning to df.columns requires a flat list of 26 names.

You need to amend this line:

list_names = df_column_names.T.values.tolist()

To this:

df_column_names = df_column_names.transpose() # transpose dataframe if necessary
list_names = df_column_names[0].tolist()

You need to transpose your dataframe, as above, if your column names are in the first row rather than first column.
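As a minimal, self-contained sketch of the difference between the nested and the flat list, the example below uses two small in-memory strings in place of the real files. The names data_txt and names_txt and the three sample columns are assumptions made for illustration only, and the header names are assumed to sit in the first row, as in the question.

```python
import io

import pandas as pd

# Hypothetical stand-ins for the two .txt files (only 3 columns for brevity).
data_txt = io.StringIO("3,?,alfa-romero\n1,164,audi\n")
names_txt = io.StringIO("symboling,normalized-losses,make\n")

df = pd.read_csv(data_txt, sep=",", header=None)
df_column_names = pd.read_csv(names_txt, sep=",", header=None)

# Transposing and converting gives one inner list per name (a list of lists).
nested = df_column_names.T.values.tolist()

# Taking the first row directly gives a flat list of names.
flat = df_column_names.iloc[0].tolist()

print(nested)
print(flat)

# The flat list has one entry per column, so the assignment succeeds.
df.columns = flat
print(df.columns.tolist())
```

If the names are known before the data file is read, passing the flat list through the names parameter of pd.read_csv (together with header=None) should give the same result without a separate assignment step.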
