I'm trying to do some AWS pricing analysis with pandas, which means loading the EC2 pricing CSV from their pricing API into a DataFrame. Unfortunately, the file starts with 5 rows of descriptors, each only 2 columns wide, before the useful data begins. The parser infers 2 columns from those rows, so it raises an error when it reaches row 6, where the useful data starts with 51 columns.
How can I tell it to ignore the first 5 rows, and to treat the 6th row as my column headers?
Here's where I'm at:
import pandas as pd
import requests
import io
pricing_url = "https://pricing.us-east-1.amazonaws.com/offers/v1.0/aws/AmazonEC2/current/index.csv"
pricing_r = requests.get(pricing_url).content
pricing = pd.read_csv(io.StringIO(pricing_r.decode('utf-8')))
ParserError: Error tokenizing data. C error: Expected 2 fields in line 6, saw 51
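One approach that might work here is the `skiprows` parameter of `pd.read_csv`: with `skiprows=5` the first 5 lines are dropped before parsing, so the 6th line becomes the header row. Below is a minimal sketch using a small made-up sample in place of the real download. The descriptor rows and the three column names are assumptions for illustration only, not the actual contents of the AWS file.

```python
import io

import pandas as pd

# Made-up stand-in for the AWS file: 5 two-column descriptor rows,
# then the real header row, then the data rows.
sample_csv = (
    '"FormatVersion","v1.0"\n'
    '"Disclaimer","Sample text"\n'
    '"Publication Date","2020-01-01T00:00:00Z"\n'
    '"Version","20200101000000"\n'
    '"OfferCode","AmazonEC2"\n'
    '"SKU","OfferTermCode","PricePerUnit"\n'
    '"ABC123","JRTCKXETXF","0.10"\n'
    '"DEF456","JRTCKXETXF","0.20"\n'
)

# skiprows=5 discards the descriptor rows, so the 6th line is read as the header.
pricing = pd.read_csv(io.StringIO(sample_csv), skiprows=5)

print(pricing.columns.tolist())
print(len(pricing))
```

With the code from the question, the same idea would presumably be `pd.read_csv(io.StringIO(pricing_r.decode('utf-8')), skiprows=5)`, assuming the file still has exactly 5 descriptor rows.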