Reputation: 463
I am having trouble concatenating two simple DataFrames. I first load one .txt file containing the data set, and then another one containing the header of that data set.
First I load the two DataFrames:
df = pd.read_csv(file_dir + file_name, sep = ',', header = None, encoding = 'latin-1', low_memory = False)
df_column_names = pd.read_csv(file_dir + file_name_cols, sep = ',', header = None, encoding = 'latin-1', low_memory = False)
Afterwards, I create a list from the header DataFrame by first transposing the table and then converting it into a list:
list_names = df_column_names.T.values.tolist()
Finally, I try to assign the names to create the desired DataFrame:
df.columns = list_names
But I receive the following error message:
ValueError: Length mismatch: Expected axis has 26 elements, new values have 1 elements
The dimensions of my objects are: df is of size (204,26) and type DataFrame, df_column_names is of size (1,26) and type DataFrame, and list_names is of size 26 and type list.
After reading other threads, the most similar ones were here and here. Nevertheless, after checking the indexes of my two DataFrames, both seem OK:
In [4]: print(df.index)
RangeIndex(start=0, stop=205, step=1)
In [5]: print(df_column_names.index)
RangeIndex(start=0, stop=1, step=1)
In [6]: len(list_names)
Out[6]: 26
list_names looks like this:
In [7]: list_names
Out[7]:
[['symboling'],
['normalized-losses'],
['make'],
['fuel-type'],
['aspiration'],
['num-of-doors'],
['body-style'],
['drive-wheels'],
['engine-location'],
['wheel-base'],
['length'],
['width'],
['height'],
['curb-weight'],
['engine-type'],
['num-of-cylinders'],
['engine-size'],
['fuel-system'],
['bore'],
['stroke'],
['compression-ratio'],
['horsepower'],
['peak-rpm'],
['city-mpg'],
['highway-mpg'],
['price']]
Thanks in advance for your help and advice.
Upvotes: 0
Views: 912
Reputation: 164643
Your list_names is a list of lists: each column name is wrapped in its own single-element list. Assigning to df.columns requires a flat list with one name per column.
You need to amend this line:
list_names = df_column_names.T.values.tolist()
To this:
df_column_names = df_column_names.transpose() # transpose dataframe if necessary
list_names = df_column_names[0].tolist()
You need to transpose your DataFrame, as above, if your column names are in the first row rather than the first column.
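As a minimal sketch of the difference between the nested and the flat list, the example below uses two small made-up DataFrames in place of the files from the question (the values and the three column names are illustrative assumptions, not the real data set). It also shows that selecting the first row with iloc[0] should give the same flat list as the transpose-then-select form above.

```python
import pandas as pd

# Hypothetical stand-ins for the two files in the question: a small data
# table with default integer column labels and a one-row table of names.
df = pd.DataFrame([[3, "audi", 13950], [1, "bmw", 16430]])
df_column_names = pd.DataFrame([["symboling", "make", "price"]])

# The approach from the question: one single-element list per name.
nested_names = df_column_names.T.values.tolist()

# Selecting the first row gives a flat list of strings instead.
flat_names = df_column_names.iloc[0].tolist()

# Same result as transposing and then selecting column 0.
assert flat_names == df_column_names.transpose()[0].tolist()

# A flat list with one entry per column can be assigned directly.
df.columns = flat_names
print(df.columns.tolist())
```

Printing nested_names next to flat_names makes the extra level of nesting easy to spot before assigning to df.columns.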
Upvotes: 1