Reputation: 1030
I have a pandas DataFrame (roughly 7000 rows) that looks as follows:
Col1 Col2
12345 1234
678910 6789
I would like to delete the first 4 digits from Col1, so that I end up with:
Col1 Col2
5 1234
10 6789
Alternatively, I could just split the first column into two columns.
Upvotes: 0
Views: 557
Reputation: 210842
Separating first column into two new ones:
In [5]: df[['New1','New2']] = (df['Col1'].astype(str)
   ...:                        .str.extract(r'(\d{4})(\d+)', expand=True)
   ...:                        .astype(int))
In [6]: df
Out[6]:
     Col1  Col2  New1  New2
0   12345  1234  1234     5
1  678910  6789  6789    10
In [9]: df.dtypes
Out[9]:
Col1    int64
Col2    int64
New1    int32
New2    int32
dtype: object
NOTE: this solution requires pandas 0.18.0 or later, because it relies on the expand parameter of str.extract.
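If the goal is only to drop the first 4 digits of Col1 in place, as the question first asks, string slicing may be enough. The following is a minimal sketch, not part of the original answer. It rebuilds the two sample rows from the question and assumes every value in Col1 has at least five digits:

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({"Col1": [12345, 678910], "Col2": [1234, 6789]})

# Convert to string, drop the first four characters, convert back to int.
# Assumption: every value has at least five digits. A shorter value would
# leave an empty string, and the final astype(int) would raise.
df["Col1"] = df["Col1"].astype(str).str[4:].astype(int)

print(df)
```

On the sample data this leaves 5 and 10 in Col1 and does not touch Col2. Unlike the str.extract approach, it overwrites Col1 rather than keeping the leading digits in a separate column.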
Upvotes: 3