Reputation: 3463
I am importing a text file into pandas and would like to concatenate three of its columns to form the index.
I am open to doing this in 1 or more steps. I can either do the conversion at the same time I create the DataFrame, or I can create the DataFrame and restructure it with the newly created column. Knowing how to do this both ways would be the most helpful for me.
Ultimately, I would like the index to be the result of concatenating the values in the first three columns.
Upvotes: 8
Views: 18762
Reputation: 70552
If you're using read_csv to import your text file, there is an index_col argument that accepts a list of column names or numbers. This will end up creating a MultiIndex, which may or may not suit your application.
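As a rough illustration of the index_col approach, here is a minimal sketch. The file contents and the column names (year, month, day, value) are made up for the example, and io.StringIO stands in for the real text file.

```python
import io

import pandas as pd

# Hypothetical file contents standing in for the real text file.
raw = io.StringIO(
    "year,month,day,value\n"
    "2012,01,15,3.5\n"
    "2012,02,20,4.1\n"
)

# Passing a list to index_col turns those columns into the levels
# of a MultiIndex while the file is being read.
multi = pd.read_csv(raw, index_col=["year", "month", "day"])

print(multi.index.nlevels)  # number of index levels
print(multi.index.names)    # names of the index levels
```

Note that this keeps the three columns as separate index levels; it does not join them into a single string.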
If you want to explicitly concatenate the columns into a single index (assuming they hold strings), you can do so with the + operator. (Warning: untested code ahead.)
df['concatenated'] = df['year'] + df['month']
df = df.set_index('concatenated')  # set_index returns a new DataFrame, so keep the result
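One caveat with the + approach: read_csv infers numeric types by default, so columns such as year and month may arrive as integers, in which case + adds them instead of joining them. The sketch below shows two possible ways around that, again with made-up file contents and column names: reading the key columns as strings via dtype, or converting an already loaded frame with astype(str).

```python
import io

import pandas as pd

# Hypothetical file contents; the column names are assumptions.
raw = io.StringIO("year,month,value\n2012,01,3.5\n2012,02,4.1\n")

# Option 1: read the key columns as strings, which also keeps
# leading zeros such as "01".
df = pd.read_csv(raw, dtype={"year": str, "month": str})
df["concatenated"] = df["year"] + df["month"]
df = df.set_index("concatenated")

# Option 2: the frame already holds integers, so convert first.
# zfill(2) restores the leading zero that integer parsing dropped.
ints = pd.DataFrame({"year": [2012, 2012], "month": [1, 2]})
key = ints["year"].astype(str) + ints["month"].astype(str).str.zfill(2)
ints = ints.set_index(key.rename("concatenated"))
```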
Upvotes: 1
Reputation: 139172
If your columns consist of strings, you can just use the + operator (in Python, adding strings concatenates them, and pandas follows this convention):
In [1]: import pandas as pd
In [2]: df = pd.DataFrame({'year':['2012', '2012'], 'month':['01', '02']})
In [3]: df
Out[3]:
  month  year
0    01  2012
1    02  2012
In [4]: df['concatenated'] = df['year'] + df['month']
In [5]: df
Out[5]:
  month  year concatenated
0    01  2012       201201
1    02  2012       201202
Once this column exists, you can use set_index to make it the index:
In [6]: df = df.set_index('concatenated')
In [7]: df
Out[7]:
             month  year
concatenated
201201          01  2012
201202          02  2012
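The question asks about three columns, and the same idea extends directly. The following is a sketch under the assumption that all three columns hold strings; the day column and the "-" separator are additions for illustration, not part of the example above. Series.str.cat joins several columns in one call and takes an optional separator.

```python
import pandas as pd

df = pd.DataFrame({
    "year": ["2012", "2012"],
    "month": ["01", "02"],
    "day": ["15", "20"],  # hypothetical third column
})

# Join the three string columns row by row, with "-" between parts.
key = df["year"].str.cat([df["month"], df["day"]], sep="-")

# set_index also accepts a Series; the original columns are kept.
indexed = df.set_index(key.rename("concatenated"))
print(indexed)
```

Dropping sep gives the same result as chaining + across the three columns.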
Note that pd.concat does not concatenate strings. It concatenates Series and DataFrames, that is, it combines the rows or columns of several objects into one DataFrame (rather than merging several rows or columns into a single row or column). See http://pandas.pydata.org/pandas-docs/dev/merging.html for an extensive explanation of this.
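To make the distinction concrete, here is a small sketch contrasting the two operations; the frames a and b are invented for the example.

```python
import pandas as pd

a = pd.DataFrame({"year": ["2012"], "month": ["01"]})
b = pd.DataFrame({"year": ["2013"], "month": ["02"]})

# pd.concat glues whole objects together: here the rows of b
# are appended below the rows of a.
stacked = pd.concat([a, b], ignore_index=True)

# With axis=1 it places Series side by side as columns;
# the values themselves are not joined.
side = pd.concat([a["year"], a["month"]], axis=1)

# The + operator is what joins the string values element-wise.
joined = a["year"] + a["month"]

print(stacked.shape, side.shape, joined.iloc[0])
```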
Upvotes: 14