Reputation: 41
I have a DataFrame in which duration is one of the columns. Its values look like this:
array(['487', '346', ..., '227', '17'])
Calling df.info() gives:
Data columns (total 22 columns):
duration 2999 non-null object
campaign 2999 non-null object
...
Now I want to convert duration to int. How can I do that?
Upvotes: 3
Views: 9868
Reputation: 33773
Use astype:
df['duration'] = df['duration'].astype(int)
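As a minimal sketch of how this behaves (the sample values below are hypothetical, modeled on the question): astype(int) is strict and raises a ValueError if any entry is not a valid integer string. If the column might contain stray values, pd.to_numeric with errors='coerce' is one possible alternative, not something the question requires.

```python
import pandas as pd

# Hypothetical sample data mirroring the question's string-valued column.
df = pd.DataFrame({'duration': ['487', '346', '227', '17']})

# Strict conversion: raises ValueError if any value is not an integer string.
df['duration'] = df['duration'].astype(int)
print(df['duration'].dtype)

# Lenient alternative for dirty data (an assumption, not part of the question):
# invalid entries become missing values; the nullable 'Int64' dtype keeps
# the remaining values as integers instead of floats.
dirty = pd.Series(['487', 'n/a', '17'])
cleaned = pd.to_numeric(dirty, errors='coerce').astype('Int64')
print(cleaned)
```

The strict form is usually preferable when the column is known to be clean, since it fails loudly instead of silently introducing missing values.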
Timings
Using the following setup to produce a large sample dataset:
import numpy as np
import pandas as pd
n = 10**5
data = list(map(str, np.random.randint(10**4, size=n)))
df = pd.DataFrame({'duration': data})
I get the following timings:
%timeit -n 100 df['duration'].astype(int)
100 loops, best of 3: 10.9 ms per loop
%timeit -n 100 df['duration'].apply(int)
100 loops, best of 3: 44.3 ms per loop
%timeit -n 100 df['duration'].apply(lambda x: int(x))
100 loops, best of 3: 60.1 ms per loop
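The %timeit lines above only work in IPython or Jupyter. A rough equivalent for a plain Python script, using the standard timeit module, might look like the sketch below. The loop and repeat counts and the names candidates and results are my own choices, and absolute numbers will differ by machine and pandas version.

```python
import timeit

import numpy as np
import pandas as pd

# Same setup as above: 10**5 random integers stored as strings.
n = 10**5
data = list(map(str, np.random.randint(10**4, size=n)))
df = pd.DataFrame({'duration': data})

# The three conversions being compared.
candidates = {
    'astype(int)': lambda: df['duration'].astype(int),
    'apply(int)': lambda: df['duration'].apply(int),
    'apply(lambda)': lambda: df['duration'].apply(lambda x: int(x)),
}

loops = 10  # fewer loops than the 100 used above, to keep the script quick
results = {}
for name, fn in candidates.items():
    # Best of 3 repeats, reported as milliseconds per loop.
    best = min(timeit.repeat(fn, number=loops, repeat=3)) / loops
    results[name] = best * 1e3
    print(f'{name}: {results[name]:.1f} ms per loop')
```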
Upvotes: 5
Reputation: 5929
Use int(str):
df['duration'] = df['duration'].apply(lambda x: int(x))  # df is your DataFrame with a 'duration' column
Upvotes: 0