Aditya Hariharan

Reputation: 145

Trying to remove the trailing .0 while printing integers from a pandas dataframe

There's a problem that's been eating at me for the past few days, and I haven't been able to find a solution on SO or anywhere else. Please bear in mind that I'm still learning Python. What I'm trying to do is remove the trailing '.0' from two columns in a pandas DataFrame.

import datetime

import pandas as pd
import sqlalchemy
from sqlalchemy import text

# url is the database connection string, defined elsewhere
engine = sqlalchemy.create_engine(url, client_encoding='utf8')

def user_history_summary(userid=198):
    connection = engine.connect()
    start_date = datetime.datetime(2016,8,6)
    end_date = start_date+ datetime.timedelta(days=14)
    last_date=datetime.datetime.now()
    # Query the first 14-day window
    result = connection.execute(text(
        "SELECT u.id as userid,CASE WHEN h.receiver_user_id = u.id AND h.sender_user_id IS NOT NULL THEN 'Received' WHEN h.sender_user_id=u.id THEN 'Given' ELSE NULL END AS Type, h.sentiment as sentiment, h.context as context,'{0}' as time_period,COUNT(*) as value"
        " FROM \"User\" u, \"HoorahTransaction\" h"
        " WHERE (u.id= h.receiver_user_id OR u.id=h.sender_user_id) AND sentiment in ('+','-') AND h.created>'{0}' AND h.created<'{1}'"
        " group by userid,type,sentiment,context".format(start_date,end_date)))
    answer= result.fetchall()
    totalReceived= pd.DataFrame(answer,columns=["userId","Type","Sentiment","Context","TimePeriod","Value"])
    counter=0
    # Repeat the query for each following 14-day window and append the rows
    while start_date<last_date:
        counter = counter + 1
        start_date = start_date+ datetime.timedelta(days=14)
        end_date = end_date+ datetime.timedelta(days=14)
        result = connection.execute(text(
            "SELECT u.id as userid,CASE WHEN h.receiver_user_id = u.id AND h.sender_user_id IS NOT NULL THEN 'Received' WHEN h.sender_user_id=u.id THEN 'Given' ELSE NULL END AS Type, h.sentiment as sentiment, h.context as context,'{0}' as time_period,COUNT(*) as value"
            " FROM \"User\" u, \"HoorahTransaction\" h"
            " WHERE (u.id= h.receiver_user_id OR u.id=h.sender_user_id) AND sentiment in ('+','-') AND h.created>'{0}' AND h.created<'{1}'"
            " group by userid,type,sentiment,context".format(start_date,end_date)))
        answer= result.fetchall()
        df=pd.DataFrame(answer,columns=["userId","Type","Sentiment","Context","TimePeriod","Value"])
        totalReceived= totalReceived.append(df,ignore_index=True)
    return totalReceived

totalReceived = user_history_summary()
print(totalReceived)

Below is the output dataframe that I'm seeing:

      userId      Type Sentiment   Context           TimePeriod  Value
0     204.0  Received         +      work  2016-08-06 00:00:00    1.0
1     208.0     Given         +      work  2016-08-06 00:00:00    5.0
2     220.0  Received         +      work  2016-08-06 00:00:00    3.0
3     199.0  Received         +      work  2016-08-06 00:00:00    2.0
4     218.0     Given         +      work  2016-08-06 00:00:00    2.0
5     199.0     Given         -      work  2016-08-06 00:00:00    1.0
6     210.0     Given         +      work  2016-08-06 00:00:00    3.0
7     200.0  Received         +      work  2016-08-06 00:00:00    8.0
8     207.0     Given         -      work  2016-08-06 00:00:00    1.0
9     206.0     Given         +      work  2016-08-06 00:00:00    6.0
10    198.0  Received         +      work  2016-08-06 00:00:00   34.0
11    212.0     Given         +      work  2016-08-06 00:00:00    1.0

I need to remove the trailing '.0' from the 'userId' and 'Value' columns. The database columns these values come from are both integer columns.

Upvotes: 1

Views: 2203

Answers (1)

Andrew Guy

Reputation: 9988

You can just convert the columns to the int datatype. It looks like they are currently being stored as float64.

for column in ['userId', 'Value']:
    totalReceived[column] = totalReceived[column].astype(int)
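As a rough, self-contained illustration of that conversion, here is a sketch that uses a small hand-built frame as a stand-in for `totalReceived` (the sample rows are copied from the question's output; no database is involved). The last part is an assumption beyond the question: if a column could ever contain missing values, `astype(int)` would raise, and the nullable `"Int64"` dtype available in recent pandas versions is one possible alternative.

```python
import pandas as pd

# Hypothetical stand-in for totalReceived, with float columns as in the question
totalReceived = pd.DataFrame({
    "userId": [204.0, 208.0, 220.0],
    "Type": ["Received", "Given", "Received"],
    "Value": [1.0, 5.0, 3.0],
})

# Convert the two float columns to integers so they print without '.0'
for column in ["userId", "Value"]:
    totalReceived[column] = totalReceived[column].astype(int)

print(totalReceived)

# Assumption: if a column may hold NaN, plain astype(int) fails;
# the nullable "Int64" dtype keeps the missing entry instead.
with_missing = pd.Series([1.0, None, 3.0]).astype("Int64")
print(with_missing)
```

The conversion only changes the column dtype, so it is safe here because every value is a whole number; fractional values would be truncated by `astype(int)`.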

Upvotes: 3
