Luis E.

Reputation: 851

How to create a column by applying a function to the (non-trivial) index?

I may get downvoted for this question, but so far I have been unable to wrap my head around this problem. I have a DataFrame which looks like this:

                                    Hits          Last visit  Bandwidth  IsWeird
Host
vocms241.cern.ch                    3777 2013-11-28 16:03:00      27554    False
ekpsquid.physik.uni-karlsruhe.de    4132 2013-11-28 14:54:00      99235     True
ec-slc6-x86-64-spi-4.cern.ch         949 2013-11-28 02:04:00    1004236    False
ec-slc6-x86-64-spi-3.cern.ch         949 2013-11-28 02:37:00    1004544    False
ec-slc6-x86-64-spi-2.cern.ch         949 2013-11-28 02:01:00    1004103    False

So you see, the index of the DataFrame is a string. Now, I have a function get_something that maps the hosts in the index to another string, and I want to add the results as a new column:

                                    Hits          Last visit  Bandwidth  IsWeird                NewField
Host
vocms241.cern.ch                    3777 2013-11-28 16:03:00      27554    False            STRING-0-0-1
ekpsquid.physik.uni-karlsruhe.de    4132 2013-11-28 14:54:00      99235     True  AnotherDifferentString
ec-slc6-x86-64-spi-4.cern.ch         949 2013-11-28 02:04:00    1004236    False          No_String_here
ec-slc6-x86-64-spi-3.cern.ch         949 2013-11-28 02:37:00    1004544    False                    None
ec-slc6-x86-64-spi-2.cern.ch         949 2013-11-28 02:01:00    1004103    False   I_dont-Know-what_else

My current, convoluted way of achieving this is (assume the DataFrame is df and pandas is imported as pd):

_temp = pd.DataFrame(df.reset_index()['Host'])
_temp['NewField'] = _temp.Host.apply(get_something)
_temp.set_index('Host', inplace=True)
df = pd.merge(df, _temp, left_index=True, right_index=True)

But I cannot believe it would take that much code to achieve that.

Upvotes: 2

Views: 112

Answers (2)

Luis E.

Reputation: 851

After working on some other stuff and coming back to this issue several times, I have settled on a less convoluted way of doing this:

df['NewField'] = df.index                             # Copy the contents of the index to a new column.
df['NewField'] = df['NewField'].apply(get_something)  # Apply the function.
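The two steps above can probably be collapsed into one with Index.map, which applies a function to every index label and returns a new Index. A minimal sketch, assuming a made-up two-row frame and a hypothetical stand-in for get_something (the real function is not shown in the question):

```python
import pandas as pd


def get_something(host):
    # Hypothetical stand-in for the real get_something:
    # returns everything after the first dot in the host name.
    return host.split(".", 1)[1]


# Small illustrative frame with the host names as the index.
df = pd.DataFrame(
    {"Hits": [3777, 949]},
    index=pd.Index(
        ["vocms241.cern.ch", "ec-slc6-x86-64-spi-4.cern.ch"], name="Host"
    ),
)

# Index.map calls get_something on each label. Assigning the resulting
# Index to a column is positional, just like df['NewField'] = df.index.
df["NewField"] = df.index.map(get_something)
```

This avoids the intermediate copy of the index, at the cost of being slightly less obvious to readers who have not used Index.map before.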

Upvotes: 0

roman

Reputation: 117420

Maybe like this?

df['NewField'] = df.index.to_series().apply(get_something)  # to_series() keeps the hosts as the Series index
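One thing worth checking with this kind of assignment is index alignment: when a Series is assigned to a column, pandas matches it to the frame by index label, not by position. The sketch below illustrates the difference on a made-up frame, with a hypothetical stand-in for get_something:

```python
import pandas as pd


def get_something(host):
    # Hypothetical stand-in for the real get_something.
    return host.upper()


df = pd.DataFrame(
    {"Hits": [1, 2]},
    index=pd.Index(["a.cern.ch", "b.cern.ch"], name="Host"),
)

# A Series built from the index alone gets a fresh 0..n-1 index, so
# assigning it aligns integer labels against host names and matches nothing.
misaligned = pd.Series(df.index).apply(get_something)
df["Misaligned"] = misaligned

# Index.to_series() keeps the host names as the Series index,
# so the assignment lines up row by row.
df["NewField"] = df.index.to_series().apply(get_something)
```

In this sketch the Misaligned column ends up entirely NaN, while NewField holds the mapped values.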

Upvotes: 2
