Reputation: 87
I have data (from a space-delimited text file with two columns) that is already binned, but with a bin width of only 1. I want to increase this width to about 5. How can I do this using numpy/matplotlib in Python?
Using,
data = loadtxt('file.txt')
x = data[:, 0]
y = data[:, 1]
plt.bar(x,y)
creates too many bars and using,
plt.hist(data)
doesn't plot the histogram appropriately. I guess I don't understand how matplotlib's histogram plotting works.
See some of the data below.
264 1
265 1
266 4
267 2
268 2
269 2
270 2
271 2
272 5
273 3
274 2
275 6
276 7
277 3
278 7
279 5
280 9
281 4
282 8
283 11
284 9
285 15
286 19
287 11
288 12
289 10
290 13
291 18
292 20
293 14
294 15
Upvotes: 5
Views: 11918
Reputation: 3891
I am not an expert at matplotlib, but I find hist to be incredibly useful. The examples on the matplotlib site give a great overview of some of the features.
I don't know how to use your provided sample data without transforming it. I altered your example to dequantize those data before creating a histogram.
I calculated the bin size using this question's first answer.
import matplotlib.pyplot as plt
import numpy as np
data = np.loadtxt('file.txt')
dequantized = data[:,0].repeat(data[:,1].astype(int))
dequantized[0:7]
# Each row's first column is repeated the number of times found in the
# second column creating a single array.
# array([ 264., 265., 266., 266., 266., 266., 267.])
def bins(xmin, xmax, binwidth, padding):
    # Returns an array of evenly spaced bin edges (binwidth apart)
    # that covers the range from xmin to xmax, plus padding on each side.
    return np.arange(
        xmin - (xmin % binwidth) - padding,
        xmax + binwidth + padding,
        binwidth)
histbins = bins(min(dequantized), max(dequantized), 5, 5)
plt.figure(1)
plt.hist(dequantized, histbins)
plt.show()
The resulting histogram looks like this.
I hope this example is useful.
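As a possible alternative to expanding the data with repeat, the counts in the second column can be passed as weights, so each value contributes its count to whichever bin it falls in. The following is only a minimal sketch under some assumptions: it uses the first ten rows of the sample data from the question, and the names values, counts, edges and weighted_counts are illustrative. It also checks that the weighted result matches the dequantized approach above.

```python
import numpy as np

# First ten rows of the sample data from the question: (value, count) pairs.
data = np.array([
    [264, 1], [265, 1], [266, 4], [267, 2], [268, 2],
    [269, 2], [270, 2], [271, 2], [272, 5], [273, 3],
])
values = data[:, 0]
counts = data[:, 1]

binwidth = 5
# Bin edges aligned to multiples of binwidth, extending past the largest value.
start = values.min() - (values.min() % binwidth)
edges = np.arange(start, values.max() + binwidth + 1, binwidth)

# Each value is counted once, weighted by its pre-binned count.
weighted_counts, _ = np.histogram(values, bins=edges, weights=counts)

# Cross-check against the repeat-based ("dequantized") approach.
expanded = np.repeat(values, counts)
expanded_counts, _ = np.histogram(expanded, bins=edges)

print(edges)
print(weighted_counts)
```

The totals could then be drawn with plt.bar(edges[:-1], weighted_counts, width=binwidth, align='edge'), or the same weights argument could be passed directly to plt.hist(values, bins=edges, weights=counts).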
Upvotes: 1
Reputation: 86326
What if you use numpy.reshape to transform your data before using plt.bar, for example:
In [83]: import numpy as np
In [84]: import matplotlib.pyplot as plt
In [85]: data = np.array([[1,2,3,4,5,6], [4,3,8,9,1,2]]).T
In [86]: data
Out[86]:
array([[1, 4],
       [2, 3],
       [3, 8],
       [4, 9],
       [5, 1],
       [6, 2]])
In [87]: y = data[:,1].reshape(-1,2).sum(axis=1)
In [89]: y
Out[89]: array([ 7, 17, 3])
In [91]: x = data[:,0].reshape(-1,2).mean(axis=1)
In [92]: x
Out[92]: array([ 1.5, 3.5, 5.5])
In [96]: plt.bar(x, y)
Out[96]: <Container object of 3 artists>
In [97]: plt.show()
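One caveat with reshape is that the number of rows must be an exact multiple of the group size, and the sample in the question has 31 rows, which does not divide by 5. A rough sketch of one way around this uses np.add.reduceat, which sums consecutive slices and lets the last group be shorter. This assumes the x values are consecutive integers with no gaps, as in the sample shown; the names group, starts, x_left and y_binned are illustrative only.

```python
import numpy as np

# Sample data from the question: x runs from 264 to 294 in steps of 1.
x = np.arange(264, 295)
y = np.array([1, 1, 4, 2, 2, 2, 2, 2, 5, 3,
              2, 6, 7, 3, 7, 5, 9, 4, 8, 11,
              9, 15, 19, 11, 12, 10, 13, 18, 20, 14,
              15])

group = 5
# Index where each group of `group` rows begins: 0, 5, 10, ...
starts = np.arange(0, len(y), group)

# Sum y over each group; the final group may hold fewer than `group` rows.
y_binned = np.add.reduceat(y, starts)
# Left edge of each new bin (assumes x has no gaps).
x_left = x[starts]

print(x_left)
print(y_binned)
```

The result could be plotted with plt.bar(x_left, y_binned, width=group, align='edge'). Note that these bins start at the first x value (264) rather than at a multiple of 5, and the last bar covers only the single leftover row.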
Upvotes: 1