Reputation: 1224
Good day,
I have a data frame with a column like this. Assume it has 1000 rows; here is a sample:
A
12
24
36
48
I wish to split each number into its two digits, one per column. I want the output to look like this:
A B C
12 1 2
24 2 4
36 3 6
48 4 8
How can I achieve this using Pandas and Numpy? Help would be truly appreciated. Thanks in advance!
Upvotes: 1
Views: 144
Reputation: 7466
Another approach, based on treating each number as a string and splitting it into its characters:
import pandas as pd
df = pd.DataFrame([12, 24, 36, 48], columns=['A'])
values = df['A'].values
split = [list(str(el)) for el in values]
out = pd.DataFrame(split, columns=['B', 'C']).astype(int)
which gives:
out
B C
0 1 2
1 2 4
2 3 6
3 4 8
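To get the A B C layout from the question, the digit columns can be attached back to the original frame. This is a minimal sketch of that extra step, not part of the code above: the names digits and result are made up for illustration, and it assumes every value in A has exactly two digits.

```python
import pandas as pd

df = pd.DataFrame([12, 24, 36, 48], columns=['A'])

# Split each number into its characters, as in the answer above
split = [list(str(el)) for el in df['A'].values]

# Reuse the original index so the rows line up when joining
digits = pd.DataFrame(split, columns=['B', 'C'], index=df.index).astype(int)

# Attach the new columns next to A
result = df.join(digits)
print(result)
```

If any value has a different number of digits, the rows in split no longer match the two column names, so this only fits data shaped like the sample.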
Upvotes: 2
Reputation: 862511
Use floor division (//) and modulo (%):
df['B'] = df['A'] // 10
df['C'] = df['A'] % 10
print (df)
A B C
0 12 1 2
1 24 2 4
2 36 3 6
3 48 4 8
If the input data are strings, you can index by position with str[]:
print (df['A'].apply(type))
0 <class 'str'>
1 <class 'str'>
2 <class 'str'>
3 <class 'str'>
Name: A, dtype: object
df['B'] = df['A'].str[0]
df['C'] = df['A'].str[1]
#if necessary convert all columns to integers
df = df.astype(int)
print (df)
A B C
0 12 1 2
1 24 2 4
2 36 3 6
3 48 4 8
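One thing to watch with the string version: .str[1] returns NaN for a one-character string, and the integer conversion then fails. A possible workaround, sketched here under the assumption that a single-digit value should count as having a leading zero, is to pad with str.zfill first. The sample values and the name padded are made up for illustration.

```python
import pandas as pd

# Hypothetical string data that includes a single-digit value
df = pd.DataFrame({'A': ['5', '12', '48']})

# Left-pad with zeros to width 2 so positions 0 and 1 always exist
padded = df['A'].str.zfill(2)

# Pick each position and convert only the new columns to integers
df['B'] = padded.str[0].astype(int)
df['C'] = padded.str[1].astype(int)
print(df)
```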
Upvotes: 3
Reputation: 393943
For a df that size, use floordiv and mod:
In[141]:
df['B'] = df['A'].floordiv(10)
df['C'] = df['A'].mod(10)
df
Out[141]:
A B C
0 12 1 2
1 24 2 4
2 36 3 6
3 48 4 8
There are also the numpy equivalents, np.floor_divide and np.mod:
In[142]:
df['B'] = np.floor_divide(df['A'],10)
df['C'] = np.mod(df['A'],10)
df
Out[142]:
A B C
0 12 1 2
1 24 2 4
2 36 3 6
3 48 4 8
The numpy versions are faster on this sample:
%%timeit
df['B'] = df['A'].floordiv(10)
df['C'] = df['A'].mod(10)
1000 loops, best of 3: 733 µs per loop
%%timeit
df['B'] = np.floor_divide(df['A'],10)
df['C'] = np.mod(df['A'],10)
1000 loops, best of 3: 491 µs per loop
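The timings above can be reproduced outside IPython with the standard timeit module. This is a rough sketch, assuming a 1000-row column of random two-digit integers. The three function names are made up, and the np.divmod variant on the raw array is my addition rather than something measured above. Results depend on the machine and on the pandas and numpy versions, so no numbers are quoted here.

```python
import timeit

import numpy as np
import pandas as pd

# 1000 random two-digit integers, matching the size mentioned in the question
df = pd.DataFrame({'A': np.random.randint(10, 100, size=1000)})

def pandas_version():
    return df['A'].floordiv(10), df['A'].mod(10)

def numpy_version():
    return np.floor_divide(df['A'], 10), np.mod(df['A'], 10)

def divmod_version():
    # Quotient and remainder in one call on the underlying array
    return np.divmod(df['A'].to_numpy(), 10)

for fn in (pandas_version, numpy_version, divmod_version):
    seconds = timeit.timeit(fn, number=1000)
    print(f"{fn.__name__}: {seconds / 1000 * 1e6:.1f} µs per call")
```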
Upvotes: 3
Reputation: 210832
In [15]: df.A.astype(str).str.extractall(r'(.)')[0].unstack().astype(np.int8)
Out[15]:
match 0 1
0 1 2
1 2 4
2 3 6
3 4 8
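For anyone unpacking the one-liner, here is the same chain split into steps, with the result renamed to B and C and joined back to the frame. The intermediate names (as_text, matches, digits, result) are made up for illustration, and the renaming assumes every value has exactly two digits.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame([12, 24, 36, 48], columns=['A'])

# 1. Convert the numbers to strings so the regex can see each character
as_text = df['A'].astype(str)

# 2. extractall returns one row per regex match, indexed by
#    (original row, match number); column 0 holds the captured character
matches = as_text.str.extractall(r'(.)')[0]

# 3. unstack pivots the match number into columns, giving one row per input
digits = matches.unstack().astype(np.int8)

# 4. Rename the match columns and attach them to the original frame
digits.columns = ['B', 'C']
result = df.join(digits)
print(result)
```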
Upvotes: 2