bin

Reputation: 1935

python numpy convert data errors

I want to read data from a MySQL table and use numpy to turn the result into a numpy array. The columns in the MySQL table are varchar(128), int, bigint and float, so I thought I could read all of them as strings at first and tried numpy.fromiter:

select_sql = "select * from fb_web_active_group_members_user_mbkmeansclustering_ng_six_test"
count = cur.execute(select_sql)
if count:
    user_level_cluster_data = cur.fetchall()
    user_level_cluster_data_df = numpy.fromiter(user_level_cluster_data,dtype = numpy.str,count = -1)

but it raises this error:

File "F:/MyDocument/F/My Document/Training/Python/PyCharmProject/FaceBookCrawl/FB_group_user_stability.py", line 21, in get_pre_new_user_level_data
user_level_cluster_data_df = numpy.fromiter(user_level_cluster_data,dtype = numpy.str,count = -1)
ValueError: Must specify length when using variable-size data-type.

Could you please tell me the reason and how to resolve it? Also, if I want to read all the data from the MySQL table with their own data types (not all as strings at first), e.g. the varchar(128) columns as string, the int columns as int, the float columns as float, how should I do that?

Upvotes: 1

Views: 3855

Answers (1)

John Zwinck

Reputation: 249444

The dtype needs to describe an entire record, not a single value. Your current error occurs because NumPy strings are fixed-capacity, so you'd need to say dtype='S128', for example, to get strings with up to 128 characters of capacity. But your rows probably consist of several columns, so you might want something like this:

dtype=[('colA', 'i4'), ('colB', 'f8'), ('colC', 'S128')]
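
As a concrete sketch (the field names, order and widths below are made up for illustration and must match whatever the SELECT actually returns), such a record dtype could be built like this:

import numpy as np

# Hypothetical record dtype: one field per column returned by the query.
# Names, order and widths here are assumptions, not read from the real table.
record_dtype = np.dtype([
    ('user_id',   'i8'),    # bigint       -> 64-bit int
    ('group_id',  'i4'),    # int          -> 32-bit int
    ('score',     'f8'),    # float        -> 64-bit float
    ('user_name', 'S128'),  # varchar(128) -> fixed-capacity byte string
])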

Also note that fromiter() may not be helping you, since you're using fetchall() which I think returns a list anyway. You can simply do:

np.array(user_level_cluster_data, dtype)
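
For instance, a minimal sketch assuming the hypothetical record_dtype above and the same cursor as in the question:

user_level_cluster_data = cur.fetchall()                  # list of row tuples
arr = np.array(user_level_cluster_data, dtype=record_dtype)

# every column keeps its own type and is addressable by name
print(arr['user_name'][:5])
print(arr['score'].mean())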

Or if you want to use fromiter(), you should pass it the count parameter and use lazy fetching instead of fetchall().
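
A possible sketch of that variant, again using the assumed record_dtype (fromiter needs each row as a plain tuple, and the count comes from execute()):

count = cur.execute(select_sql)   # number of rows the query returned
# iterate the cursor lazily instead of materialising everything with fetchall()
arr = np.fromiter((tuple(row) for row in cur), dtype=record_dtype, count=count)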

Upvotes: 1
