miatochai

Reputation: 343

Edit dataframe column values to a certain format

I have the following dataframe (df4):

    hour  counted
0      0      0.0
1      1      0.0
2      2      0.0 
3      3      0.0
4      4      0.0
5      5      0.0
6      6      0.0
7      7      0.0
8      8      0.0
9      9      0.0
10    10    792.0
11    11    792.0
12    12      0.0
13    13      0.0
14    14    594.0
15    15    198.0
16    16    198.0
17    17      0.0
18    18      0.0
19    19      0.0
20    20      0.0
21    21      0.0
22    22      0.0
23    23      0.0

I later use this dataframe in a Dash Plotly chart. Instead of a single number for each hour, I want to display the values as "hour:00 - hour:59".

So, the dataframe should look like this:

       hour_display     counted
0      0:00 - 0:59      0.0
1      1:00 - 1:59      0.0
2      2:00 - 2:59      0.0 
3      3:00 - 3:59      0.0

...etc

OR

the hour_display can be its own column, like this:

    hour  counted   hour_display
0      0      0.0   0:00 - 0:59
1      1      0.0   1:00 - 1:59 
2      2      0.0   2:00 - 2:59

Here's what I tried:

#make an empty df for edited values
df5 = pd.DataFrame(columns=['hour_display'])

for x in df4['hour']:
    x = str(x) + ":00 - " + str(x) + ":59"
    print(x) #for test
    df5.append([{'hour_display': x}], ignore_index=True )

print(df5) #for test
df4.append(df5)

The strange thing is, when I print(x) inside the for loop, it does show the values that I need. But when I try to print(df5), the dataframe is empty. So I can't connect df4 and df5.

Upvotes: 1

Views: 728

Answers (3)

Shubham Sharma

Reputation: 71689

There is no need for a for loop over the values in the hour column. You can convert the column to strings once and concatenate the desired suffixes:

s = df4['hour'].astype(str)
df4['hour_display'] = s.add(':00') + '-' + s.add(':59')

    hour  counted hour_display
0      0      0.0    0:00-0:59
1      1      0.0    1:00-1:59
2      2      0.0    2:00-2:59
3      3      0.0    3:00-3:59
...
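For anyone who wants to run this end to end, here is a minimal sketch of the same vectorized idea. It assumes a rebuilt stand-in for df4 (the construction below is my own, not the asker's data source) and uses " - " as the separator so the labels match the format requested in the question.

```python
import pandas as pd

# Hypothetical stand-in for df4: 24 hourly rows, with two non-zero counts
df4 = pd.DataFrame({"hour": list(range(24)), "counted": [0.0] * 24})
df4.loc[[10, 11], "counted"] = 792.0

# Convert the hours to strings once, then build the label with
# element-wise string concatenation (no Python-level loop)
s = df4["hour"].astype(str)
df4["hour_display"] = s + ":00 - " + s + ":59"

print(df4.head(3))
```

The new column can then be passed to the chart in place of hour.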

Upvotes: 2

KJDII

Reputation: 861

This might work for you. Instead of creating a new df5, just add a new column with the desired values. Then just select the columns you'd like to display in Dash.

I didn't run this code and there could be some syntax issues but it should help point you in the right direction.

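def hour_display(x):
    # Build the "hour:00 - hour:59" label for a single hour value
    return str(x) + ":00 - " + str(x) + ":59"

# apply() accepts the function directly, so no lambda wrapper is needed
df4['hour_display'] = df4['hour'].apply(hour_display)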
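As for why df5 stayed empty in the question: DataFrame.append returned a new DataFrame and never modified df5 in place, so each result was discarded. The method was also removed in pandas 2.0. If you do want to keep the loop, a rough sketch (with a rebuilt stand-in for df4, which is my assumption) is to collect the labels in a plain list and assign the list as a column:

```python
import pandas as pd

# Hypothetical stand-in for df4 from the question
df4 = pd.DataFrame({"hour": list(range(24)), "counted": [0.0] * 24})

# Collect the labels in a plain list instead of appending to a DataFrame
labels = []
for x in df4["hour"]:
    labels.append(str(x) + ":00 - " + str(x) + ":59")

# Assign the list as a new column; its length matches the number of rows
df4["hour_display"] = labels

print(df4.tail(3))
```

A list's append does mutate in place, which is why this version works where the DataFrame loop did not.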

Upvotes: 2

Vichtor

Reputation: 197

This is not the best code, but it should do the trick:

Hours = []

# Iterate over the column directly; positional lookups inside iterrows()
# only work when the index happens to be 0..n-1
for hour in df4["hour"]:
    x = str(hour) + ":00 - " + str(hour) + ":59"
    print(x) #for test
    Hours.append(x)

# Build from a dict so each column keeps its own dtype
Final = pd.DataFrame({"hour_display": Hours, "counted": df4["counted"].values})

Upvotes: 1
