Costa.Gustavo

Reputation: 859

Update row after comparing values on pandas dataframe

I connect to an API that provides COVID-19 data for Brazil, organized by state and city, as follows:

# Libraries
import pandas as pd
import matplotlib.pyplot as plt
import numpy as np
import urllib.request  # Request and build_opener are used below
from http.cookiejar import CookieJar
from datetime import datetime, timedelta

cj = CookieJar()

url_Bso = "https://brasil.io/api/dataset/covid19/caso_full/data?state=MG&city=Barroso"
req_Bso = urllib.request.Request(url_Bso, None, {"User-Agent": "python-urllib"})
opener_Bso = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(cj))
response_Bso = opener_Bso.open(req_Bso)
raw_response_Bso = response_Bso.read()

json_Bso = pd.read_json(raw_response_Bso)
results_Bso = json_Bso['results']
results_Bso = results_Bso.to_dict().values()
df_Bso = pd.DataFrame(results_Bso)
df_Bso.head(5)

This API compiles the data released by the state health departments. However, the state and city health department records differ, and the state records lag behind those of the cities. I would like to update the values for Thursdays and Saturdays (the day when the epidemiological week ends). I'm trying the following:

saturday = datetime.today() + timedelta(days=-5)
yesterday = datetime.today() + timedelta(days=-1)
last_available_confirmed_day_Bso_saturday = 51
last_available_confirmed_day_Bso_yesterday = 54
df_Bso = df_Bso.loc[df_Bso['date'] == saturday, ['last_available_confirmed']] = last_available_confirmed_day_Bso_saturday
df_Bso = df_Bso.loc[df_Bso['date'] == yesterday, ['last_available_confirmed']] = last_available_confirmed_day_Bso_yesterday
df_Bso

However, I get the error:

> AttributeError: 'int' object has no attribute 'loc'

I need another dataframe that contains the updated values for these days. Can anyone help?

Upvotes: 0

Views: 80

Answers (1)

Pramote Kuacharoen

Reputation: 1541

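You have to adjust the dates. The `date` column in your data frame holds strings, so comparing it with datetime objects would not match any row. Either convert the column to datetime or, as in the `strftime` lines of this answer, format the target dates as strings in the same YYYY-MM-DD form. The weekday arithmetic finds the most recent Saturday and Thursday, counting today if it falls on that weekday.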
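The `AttributeError` itself has a separate cause: a statement of the form `df = df.loc[...] = value` is a chained assignment, and Python binds the targets from left to right. The name is rebound to the integer first, and `.loc` is then looked up on that integer. A minimal sketch that reproduces this in isolation, using a small hypothetical frame (`df`, its two rows, and their values are made up for illustration and are not API data):

```python
import pandas as pd

# Hypothetical two-row frame standing in for df_Bso (values are illustrative).
df = pd.DataFrame({
    "date": ["2020-07-11", "2020-07-10"],
    "last_available_confirmed": [40, 39],
})
original = df  # keep a second reference to the frame

try:
    # Targets are bound left to right: `df` becomes 51 first,
    # then `df.loc` is evaluated on that integer and fails.
    df = df.loc[df["date"] == "2020-07-11", ["last_available_confirmed"]] = 51
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)

# Assigning through .loc without rebinding the name updates the frame in place.
df = original
df.loc[df["date"] == "2020-07-11", "last_available_confirmed"] = 51
print(df)
```

With the rebinding removed, the date values still have to match the strings in the column, which is what the rest of this answer handles.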

today = datetime.now()

last_sat_num = (today.weekday() + 2) % 7
last_thu_num = (today.weekday() + 4) % 7

last_sat = today - timedelta(last_sat_num)
last_thu = today - timedelta(last_thu_num)
last_sat_str = last_sat.strftime('%Y-%m-%d')
last_thu_str = last_thu.strftime('%Y-%m-%d')

last_available_confirmed_day_Bso_sat = 51
last_available_confirmed_day_Bso_thu = 54

df_Bso2 = df_Bso.copy()
df_Bso2.loc[df_Bso2['date'] == last_sat_str, ['last_available_confirmed']] = last_available_confirmed_day_Bso_sat
df_Bso2.loc[df_Bso2['date'] == last_thu_str, ['last_available_confirmed']] = last_available_confirmed_day_Bso_thu

df_Bso2[['date', 'last_available_confirmed']].head(10)

Output

         date  last_available_confirmed
0  2020-07-15                        44
1  2020-07-14                        43
2  2020-07-13                        40
3  2020-07-12                        40
4  2020-07-11                        51
5  2020-07-10                        39
6  2020-07-09                        36
7  2020-07-08                        36
8  2020-07-07                        27
9  2020-07-06                        27
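The other route mentioned above, converting the `date` column to datetime instead of formatting the targets as strings, could look roughly like the sketch below. It is not the code that produced the output above. The helper `last_weekday`, the sample rows, and the fixed `today` are assumptions made for illustration, and `pd.to_datetime` is assumed to parse the column's YYYY-MM-DD strings.

```python
from datetime import date, timedelta

import pandas as pd


def last_weekday(today: date, weekday: int) -> date:
    """Most recent date on or before `today` with the given weekday (Monday=0)."""
    return today - timedelta(days=(today.weekday() - weekday) % 7)


# Hypothetical sample shaped like the API result (values are illustrative).
df = pd.DataFrame({
    "date": ["2020-07-15", "2020-07-14", "2020-07-13", "2020-07-12",
             "2020-07-11", "2020-07-10", "2020-07-09"],
    "last_available_confirmed": [44, 43, 40, 40, 40, 39, 36],
})

# Convert the string column once, so comparisons use real timestamps.
df["date"] = pd.to_datetime(df["date"])

today = date(2020, 7, 15)  # fixed for reproducibility; date.today() in practice
last_sat = pd.Timestamp(last_weekday(today, 5))  # Saturday is weekday 5
last_thu = pd.Timestamp(last_weekday(today, 3))  # Thursday is weekday 3

# Work on a copy so the original frame keeps the API values.
updated = df.copy()
updated.loc[updated["date"] == last_sat, "last_available_confirmed"] = 51
updated.loc[updated["date"] == last_thu, "last_available_confirmed"] = 54

print(updated)
```

Building the targets from a `date` rather than `datetime.now()` keeps them at midnight, so the equality test is not thrown off by a time-of-day component.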

Upvotes: 2
