user2085779

Error with matplotlib when used with Unicode strings

I have a text file containing Unicode strings and their frequencies:

അംഗങ്ങള്‍ക്ക്    10813
കുടുംബശ്രീ   10805
പരിരക്ഷാപദ്ധതിക്ക്   10778
ചെയ്തു   10718
ഇന്ന്‌   10716
അന്തര്‍     659
രാജിന്റെ    586 

When I try to plot it using matplotlib, I get this error:

Traceback (most recent call last):
  File "plot.py", line 3, in <module>
    xs, ys = np.loadtxt('oun.txt', delimiter='\t').T
  File "/usr/local/lib/python2.7/dist-packages/numpy/lib/npyio.py", line 841, in loadtxt
    items = [conv(val) for (conv, val) in zip(converters, vals)]
ValueError: could not convert string to float: ' 

This is the code I have

import numpy as np
import matplotlib.pyplot as plt
xs, ys = np.loadtxt('oun.txt', delimiter='\t').T
plt.bar(xs, ys)
plt.show()

What's wrong with this code?

Upvotes: 0

Views: 239

Answers (1)

Ffisegydd

Reputation: 53718

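To read strings from a file with np.loadtxt you have to specify the dtype argument (see the loadtxt docs). By default loadtxt tries to convert every field to a float, which is what raises the ValueError on your first column.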

import matplotlib.pyplot as plt
import numpy as np

# 'oun.txt' is the file name from the question.
data = np.loadtxt('oun.txt', dtype={'names': ('strings', 'freq'),
                                      'formats': ('S32', 'i4')})

xs, ys = zip(*data)
temp = range(len(ys)) # Temp variable for use as x-axis.

plt.bar(temp, ys, align='center')
plt.xticks(temp, xs) # Re-define ticks as your strings.

plt.show()

In this case the file has two columns. I've given them the names ('strings', 'freq') and the formats ('S32', 'i4'), where S denotes a byte string and i denotes an integer. The format codes are described in the NumPy dtype docs. The number in each code is the size of the values in that column: i4 is a 4-byte (32-bit) signed integer, and S32 is a string of at most 32 bytes. Anything longer than that is silently truncated, so increase the number if your UTF-8 strings need more room.
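If you are on Python 3 with a reasonably recent NumPy, a variant along these lines may be easier for Unicode text. It is only a sketch under a few assumptions: the 'U64' width is a guess at an upper bound for your words, the file is UTF-8, and the sample rows (copied from the question) are read from an in-memory io.StringIO so the snippet runs without the real file. The last few lines just illustrate what the size numbers in the format codes mean.

```python
import io

import numpy as np

# Sample rows copied from the question; in a real script you would pass
# the file name (e.g. 'oun.txt') to np.loadtxt instead of a StringIO.
SAMPLE = (
    "കുടുംബശ്രീ\t10805\n"
    "പരിരക്ഷാപദ്ധതിക്ക്\t10778\n"
    "ചെയ്തു\t10718\n"
)

# 'U64': Unicode string of up to 64 code points (an assumed upper bound).
# 'i4' : 32-bit signed integer.
dtype = {'names': ('strings', 'freq'), 'formats': ('U64', 'i4')}
data = np.loadtxt(io.StringIO(SAMPLE), dtype=dtype, encoding='utf-8')

words = data['strings']   # array of str values, one per row
counts = data['freq']     # array of 32-bit integers

# The number in a format code is a size: bytes for 'i' and 'S' codes.
int_size = np.dtype('i4').itemsize
bytes_size = np.dtype('S32').itemsize

# A byte-string field silently truncates values longer than its size.
longest = "പരിരക്ഷാപദ്ധതിക്ക്".encode('utf-8')
truncated = np.array([longest], dtype='S32')[0]
```

The words and counts arrays can then go into plt.bar and plt.xticks exactly as in the code above. Whether the labels render correctly also depends on matplotlib using a font that contains the script's glyphs.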

Upvotes: 3
