Wade Bratz

Reputation: 321

How can I delete a portion of the data in a Python pandas DataFrame imported from CSV?

I am trying to strip the word "Days" from the column Days_To_Maturity, so that instead of Days 0 the value is just 0. I have tried a few things, but I am wondering if there is an easy, built-in way to do this in Python. Thanks

In[12]:
from pandas import *
XYZ = read_csv('XYZ')
df_XYZ = DataFrame(XYZ)
df_XYZ.head()

Out[12]:
       Dates    Days_To_Maturity    Yield
0    5/1/2002    Days 0                 0.00
1    5/1/2002    Days 1                 0.06
2    5/1/2002    Days 2                 0.12
3    5/1/2002    Days 3                 0.18
4    5/1/2002    Days 4                 0.23
5 rows × 3 columns

Upvotes: 2

Views: 96

Answers (2)

dhj

Reputation: 512

I think the solution you are looking for is in the "converters" option of the read_csv function of pandas. From help(pandas.read_csv):

converters : dict, optional. Dict of functions for converting values in certain columns. Keys can either be integers or column labels.

So instead of read_csv('XYZ') you would make a custom converter:

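# Keep only the part after the space, e.g. 'Days 3' -> '3' (still a string)
myconverter = {'Days_To_Maturity': lambda x: x.split(' ')[1]}
# Note: the keyword argument is 'converters' (plural)
XYZ = read_csv('XYZ', converters=myconverter)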

This should work. Please let me know if it helps!
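
For a self-contained check, here is a minimal sketch of the converters approach. It assumes the file is comma-separated and uses an inline string (csv_text, a made-up stand-in for the XYZ file). The int() call is an addition so that the column comes back numeric rather than as strings.

```python
import io

import pandas as pd

# Hypothetical inline sample standing in for the 'XYZ' file (assumed comma-separated).
csv_text = """Dates,Days_To_Maturity,Yield
5/1/2002,Days 0,0.00
5/1/2002,Days 1,0.06
5/1/2002,Days 2,0.12
"""

# A converter receives each raw cell as a string; keep the part after the space as an int.
converters = {"Days_To_Maturity": lambda cell: int(cell.split(" ")[1])}

df = pd.read_csv(io.StringIO(csv_text), converters=converters)
print(df["Days_To_Maturity"].tolist())
```

With a real file, the io.StringIO(csv_text) argument would simply be replaced by the file path.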

Upvotes: 1

CT Zhu

Reputation: 54330

You can use the .str accessor: extract the numbers with a regex, take a slice with .str.slice, or, as in this example, replace 'Days ' with an empty string:

In [109]:

df.Days_To_Maturity.str.replace('Days ','').astype(int)
Out[109]:
0    0
1    1
2    2
3    3
4    4
Name: Days_To_Maturity, dtype: int32
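
The regex and slice options mentioned above are not shown in the example, so here is a rough sketch of both. It assumes every value has the form 'Days N' with a single space after the prefix; the small DataFrame and the names by_regex and by_slice are made up for illustration.

```python
import pandas as pd

# Hypothetical sample with the same shape as the column in the question.
df = pd.DataFrame({"Days_To_Maturity": ["Days 0", "Days 1", "Days 2"]})

# Regex: capture the run of digits; expand=False returns a Series instead of a DataFrame.
by_regex = df["Days_To_Maturity"].str.extract(r"(\d+)", expand=False).astype(int)

# Slice: drop the fixed 5-character prefix "Days " and convert what remains.
by_slice = df["Days_To_Maturity"].str.slice(5).astype(int)

print(by_regex.tolist())
print(by_slice.tolist())
```

Either result can be assigned back with df["Days_To_Maturity"] = by_regex. The regex version is more tolerant if the prefix varies, while the slice version only works when the prefix length is fixed.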

Upvotes: 2
