Reputation: 185
Load in these CSV files from Sean Lahman's Baseball Database. For this assignment, we will use the 'Salaries.csv' and 'Teams.csv' tables. Read each of these tables into a pandas DataFrame and show the head of each table.
#Here's the code I have so far:
import requests
import io
import zipfile
import pandas as pd
url = 'http://seanlahman.com/files/database/lahman-csv_2014-02-14.zip'
r = requests.get(url, auth=('user', 'pass'))
#These are lines of code I looked up but am not sure how to use:
#with zipfile.ZipFile('/path/to/file', 'r') as z:
#    f = z.open('member.csv')
#    table = pd.io.parsers.read_table(f, ...)
#salariesData = pd.read_csv('Salaries.csv')
#teamsData = pd.read_csv('Teams.csv')
Upvotes: 0
Views: 1709
Reputation: 874
requests.get returns the downloaded file as bytes in r.content, so first wrap those bytes in an in-memory file object and open it as a zip file:
mlz = zipfile.ZipFile(io.BytesIO(r.content))
To see what's in the zipfile, type:
mlz.namelist()
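To try these two steps without downloading anything, here is a minimal sketch that builds a small zip archive in memory and opens it the same way. The member names and CSV rows are made up for illustration, and `fake_content` is a stand-in for `r.content`.

```python
import io
import zipfile

# Build a tiny zip archive in memory; its bytes stand in for r.content.
# The file names and rows below are made-up sample data.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("Salaries.csv", "yearID,teamID,salary\n1985,ATL,870000\n")
    zf.writestr("Teams.csv", "yearID,teamID,W\n1985,ATL,66\n")
fake_content = buffer.getvalue()  # plain bytes, like r.content

# ZipFile needs a seekable file-like object, so wrap the bytes in BytesIO.
mlz = zipfile.ZipFile(io.BytesIO(fake_content))
names = mlz.namelist()  # member names, in the order they were added
print(names)
```

Passing the raw bytes straight to `zipfile.ZipFile` would not work, because it expects a path or a file-like object; `io.BytesIO` provides the latter.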
Then you can open and read the CSV at a given index in that list, for example the first two entries:
df1 = pd.read_csv(mlz.open(mlz.namelist()[0]))
df2 = pd.read_csv(mlz.open(mlz.namelist()[1]))
In your specific case, this will likely be:
salariesData = pd.read_csv(mlz.open('Salaries.csv'))
teamsData = pd.read_csv(mlz.open('Teams.csv'))
(All of this ^ assumes you're using Python 3.x)
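If `mlz.open('Salaries.csv')` raises a `KeyError`, the files may sit inside a folder in the archive, in which case the member name includes that path. Below is a small sketch of a lookup by base name. Note that `find_member` is a hypothetical helper written for this answer, not part of `zipfile`, and the folder name in the demo is invented.

```python
import io
import posixpath
import zipfile


def find_member(zf, filename):
    """Return the full archive name of the first member whose base name is `filename`."""
    for name in zf.namelist():
        # Zip member names use forward slashes, hence posixpath.
        if posixpath.basename(name) == filename:
            return name
    raise KeyError(f"{filename!r} not found in archive")


# Demo on an in-memory archive with an invented folder layout.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("some-folder/Salaries.csv", "yearID,salary\n1985,870000\n")
    zf.writestr("some-folder/Teams.csv", "yearID,W\n1985,66\n")

demo = zipfile.ZipFile(io.BytesIO(buffer.getvalue()))
salaries_name = find_member(demo, "Salaries.csv")
print(salaries_name)
```

With the real archive this could be used as `salariesData = pd.read_csv(mlz.open(find_member(mlz, 'Salaries.csv')))`, assuming `import pandas as pd`, and `salariesData.head()` then shows the first rows of the table.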
Upvotes: 3