theduker

Reputation: 21

Reading a CSV from a URL

I'm trying to get the first chunk below to replicate the second chunk. However, I've noticed in Jupyter that when I try to build the same table, it is displayed differently: the first chunk renders as a nicely formatted DataFrame, while the second just looks like a plain table. Is there a difference between the two methods? I have also noticed that with the first method dtypes lists the column 'cases' by name, while with the second chunk it shows only a single dtype. Thanks!

import pandas as pd

url = 'https://raw.githubusercontent.com/nytimes/covid-19-data/master/us-counties.csv'
states = pd.read_csv(url, 
                     usecols=['date', 'county', 'state', 'cases'],
                     parse_dates=['date'],
                     squeeze=True
                    ).sort_index()
states = states.loc[states['state'] == 'Alabama']
states = states.drop(columns=['state'])
states.set_index(['county', 'date'], inplace=True)
states.dtypes

cases    int64
dtype: object

url = 'https://covidtracking.com/api/v1/states/daily.csv'
states = pd.read_csv(url,
                     usecols=['date', 'state', 'positive'],
                     parse_dates=['date'],
                     index_col=['state', 'date'],
                     squeeze=True).sort_index()

states.dtypes

dtype('float64')

Upvotes: 0

Views: 83

Answers (1)

Eric Truett

Reputation: 3010

In the second dataframe, there are NaN values, which forces a conversion to float. For more details on this, see the documentation on the new nullable integer datatype.
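As a rough illustration of both effects, here is a minimal sketch that uses a small inline CSV instead of the real feed. The sample rows are made up for the example, and `.squeeze("columns")` is used as a stand-in for the `squeeze=True` argument, which newer pandas versions no longer accept. It suggests why the second chunk ends up as a Series (only one data column is left after the index is set) and how a missing value pushes an integer column to float64 unless the nullable `Int64` dtype is requested.

```python
import io
import pandas as pd

# Hypothetical inline CSV standing in for the real feed; one 'positive' value is missing.
csv_text = (
    "date,state,positive\n"
    "2020-04-01,AL,10\n"
    "2020-04-02,AL,\n"
    "2020-04-03,AL,25\n"
)

# Two of the three columns become the index, so a single data column is left.
frame = pd.read_csv(
    io.StringIO(csv_text),
    parse_dates=["date"],
    index_col=["state", "date"],
)
print(type(frame).__name__, frame.dtypes.to_dict())

# Squeezing a one-column DataFrame yields a Series, whose .dtypes is a single dtype.
# The missing value was read as NaN, so that dtype is float64 rather than int64.
series = frame.squeeze("columns")
print(type(series).__name__, series.dtypes)

# Asking for the nullable integer dtype keeps whole numbers and marks the gap as <NA>.
nullable = pd.read_csv(io.StringIO(csv_text), dtype={"positive": "Int64"})
print(nullable["positive"].dtype)
```

If a DataFrame is wanted in both cases, leaving out the squeeze step should keep the one-column result as a DataFrame, which Jupyter renders as a formatted table.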

Upvotes: 1
