Reputation: 21
I am new to matplotlib and NumPy and have run into a problem extracting data. The following code raises IndexError: boolean index did not match indexed array along dimension 0; dimension is 32 but corresponding boolean dimension is 112. Please advise!
Dataset used: https://data.gov.sg/dataset/monthly-motor-vehicle-population-by-type-of-fuel-used
import numpy as np
import matplotlib.pyplot as plt
title = "motor-vehicle-population-statistics-by-type-of-fuel-used."
titlelen = len(title)
print("{:*^{titlelen}}".format(title, titlelen=titlelen+6))
print()
data = np.genfromtxt("data/motor-vehicle-population-statistics-by-type-of-fuel-used.csv",
dtype=("datetime64[Y]","U100","U110",int),
delimiter=",",
names=True)
years = np.unique(data["month"])
category = np.unique(data['category'])
type = np.unique(data['type'])
cars = data[data["category"]=="Cars"]["number"]
carspetrol = cars[data["type"]=="Petrol"]["number"]
# print(cars)
print(carspetrol)
Upvotes: 1
Views: 8404
Reputation: 4130
You have a few issues here.
First, don't use the names of Python built-ins such as type as variable names, because that hides the built-in for the rest of the script. Change this
type = np.unique(data['type'])
to this
types = np.unique(data['type'])
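As a small illustration of why shadowing matters, here is a sketch that rebinds type inside a function (the function name shadowed and the list values are made up for the example, and a plain list stands in for the NumPy array). Once the name points at data, calling it as the built-in fails:

```python
def shadowed():
    # Hypothetical stand-in for: type = np.unique(data['type'])
    type = ["Diesel", "Petrol"]  # the local name now hides the built-in type()
    try:
        type(3)  # tries to call the list, not the built-in
    except TypeError as exc:
        return str(exc)
    return None


message = shadowed()
print(message)
```

Renaming the variable to types avoids the clash entirely.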
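The error occurs because you are indexing a 32-element array (cars) with a boolean mask built from data, which has 112 values. A boolean mask must have the same length as the array it indexes, so build the mask from cars itself. Also keep all fields when filtering by category, otherwise cars no longer has a "type" field to filter on. So your code should change like this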
cars = data[data["category"]=="Cars"]
carspetrol = cars[cars["type"]=="Petrol"]["number"]
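To see the mismatch in isolation, here is a minimal sketch with a tiny structured array. The four rows are made up and only stand in for the real CSV; the field names follow the question. It shows the failing mask, the fixed version, and an alternative that combines both conditions on the original array:

```python
import numpy as np

# Hypothetical sample rows; the numbers are invented for illustration.
data = np.array(
    [
        ("2015-01", "Cars", "Petrol", 100),
        ("2015-01", "Cars", "Diesel", 20),
        ("2015-01", "Buses", "Diesel", 5),
        ("2015-02", "Cars", "Petrol", 110),
    ],
    dtype=[("month", "U7"), ("category", "U100"),
           ("type", "U110"), ("number", int)],
)

cars = data[data["category"] == "Cars"]      # 3 rows, all fields kept

# A mask built from data has 4 entries, so it cannot index the 3-row cars.
error_message = None
try:
    cars[data["type"] == "Petrol"]
except IndexError as exc:
    error_message = str(exc)

# A mask built from cars has the matching length.
carspetrol = cars[cars["type"] == "Petrol"]["number"]

# Alternative: combine both conditions into one mask over the original array.
both = (data["category"] == "Cars") & (data["type"] == "Petrol")
carspetrol_combined = data[both]["number"]

print(error_message)
print(carspetrol, carspetrol_combined)
```

The combined mask is often the simpler option, since every condition is evaluated against the same array and the lengths always agree.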
Furthermore, for basic analytics like this it is better to use a data analysis library such as pandas rather than plain NumPy.
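A rough sketch of the same filter in pandas, assuming the columns are named month, category, type and number as in the question. The rows below are invented; with the real file you would load it with pd.read_csv instead of building the frame by hand:

```python
import pandas as pd

# Hypothetical sample rows standing in for the CSV contents.
df = pd.DataFrame(
    {
        "month": ["2015-01", "2015-01", "2015-01", "2015-02"],
        "category": ["Cars", "Cars", "Buses", "Cars"],
        "type": ["Petrol", "Diesel", "Diesel", "Petrol"],
        "number": [100, 20, 5, 110],
    }
)

# Both conditions are evaluated on the same frame, so the mask always lines up.
is_petrol_car = (df["category"] == "Cars") & (df["type"] == "Petrol")
cars_petrol = df.loc[is_petrol_car, "number"]

# Totals per category and fuel type in one call.
totals = df.groupby(["category", "type"])["number"].sum()

print(cars_petrol.tolist())
print(totals)
```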
Upvotes: 2