DJElbow

Reputation: 3463

Concatenate Columns as Index in Pandas

I am importing a text file into pandas, and would like to concatenate 3 of the columns from the file to make the index.

I am open to doing this in 1 or more steps. I can either do the conversion at the same time I create the DataFrame, or I can create the DataFrame and restructure it with the newly created column. Knowing how to do this both ways would be the most helpful for me.

Ultimately, I would like the index to be the value obtained by concatenating the values in the first 3 columns.

Upvotes: 8

Views: 18762

Answers (2)

voithos

Reputation: 70552

If you're using read_csv to import your text file, there is an index_col argument that you can pass a list of column names or numbers to. This will end up creating a MultiIndex - I'm not sure if that suits your application.
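To illustrate the index_col route, here is a minimal sketch. The file contents and the column names (site, year, month, value) are made up for the example, since the question does not show the actual file.

```python
import io
import pandas as pd

# Hypothetical file contents standing in for the text file from the question.
raw = io.StringIO(
    "site,year,month,value\n"
    "A,2012,01,10\n"
    "B,2012,02,20\n"
)

# Passing a list to index_col turns those columns into the levels of a MultiIndex.
df = pd.read_csv(raw, index_col=["site", "year", "month"])

print(df.index.nlevels)  # number of index levels
print(df)
```

Note that the index here has three separate levels; it is not a single concatenated string.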

If you want to explicitly concatenate the columns into a single index (assuming that they are strings), it seems you can do so with the + operator. (Warning, untested code ahead)

df['concatenated'] = df['year'] + df['month']
df = df.set_index('concatenated')  # set_index returns a new DataFrame, so assign the result
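If the columns are numeric rather than strings, + would add the numbers instead of joining them. A possible workaround is to convert each column with astype(str) first. This is only a sketch: the day and value columns and the zero-padding with str.zfill are assumptions added for illustration, not part of the question.

```python
import pandas as pd

# Hypothetical data with integer columns.
df = pd.DataFrame(
    {"year": [2012, 2012], "month": [1, 2], "day": [5, 17], "value": [1.5, 2.5]}
)

# Convert each part to a string before joining; zfill(2) pads month/day to two digits.
key = (
    df["year"].astype(str)
    + df["month"].astype(str).str.zfill(2)
    + df["day"].astype(str).str.zfill(2)
)

df["concatenated"] = key
df = df.set_index("concatenated")
print(df)
```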

Upvotes: 1

joris

Reputation: 139172

If your columns consist of strings, you can just use the + operator (in Python, adding strings concatenates them, and pandas follows this convention element-wise):

In [1]: import pandas as pd

In [2]: df = pd.DataFrame({'year':['2012', '2012'], 'month':['01', '02']})

In [3]: df
Out[3]:
  month  year
0    01  2012
1    02  2012

In [4]: df['concatenated'] = df['year'] + df['month']

In [5]: df
Out[5]:
  month  year concatenated
0    01  2012       201201
1    02  2012       201202

Once this column is created, you can use set_index to make it the index:

In [6]: df = df.set_index('concatenated')

In [7]: df
Out[7]:
             month  year
concatenated
201201          01  2012
201202          02  2012
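For the "three columns from a text file" case in the question, one way to keep the parts as strings is to read them with an explicit dtype and build the index right after loading. The sketch below assumes a comma-separated file with hypothetical columns year, month, day and value; passing a Series to set_index leaves the original columns in place.

```python
import io
import pandas as pd

# Hypothetical file contents; the real file layout is not given in the question.
raw = io.StringIO(
    "year,month,day,value\n"
    "2012,01,05,1.5\n"
    "2012,02,17,2.5\n"
)

key_cols = ["year", "month", "day"]

# Read the key columns as strings so that leading zeros such as "01" survive.
df = pd.read_csv(raw, dtype={col: str for col in key_cols})

# Concatenate the three string columns and use the result as the index.
key = df["year"] + df["month"] + df["day"]
df = df.set_index(key.rename("concatenated"))
print(df)
```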

Note that pd.concat is not for concatenating strings. It concatenates Series/DataFrames, i.e. it combines the rows or columns of different DataFrames or Series into one DataFrame (it does not merge several rows/columns into a single row/column). See http://pandas.pydata.org/pandas-docs/dev/merging.html for an extensive explanation of this.
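A small sketch of the difference, using two made-up one-row DataFrames: pd.concat stacks the frames, while + joins the strings within each row.

```python
import pandas as pd

# Two hypothetical DataFrames with the same columns.
a = pd.DataFrame({"year": ["2012"], "month": ["01"]})
b = pd.DataFrame({"year": ["2013"], "month": ["02"]})

# pd.concat stacks the rows of a and b into one DataFrame.
stacked = pd.concat([a, b], ignore_index=True)

# The + operator joins the string values of two columns row by row.
joined = stacked["year"] + stacked["month"]

print(stacked)
print(joined)
```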

Upvotes: 14
