Reputation: 1935
I want to read data from a MySQL table and convert it to a NumPy array. The table's columns include varchar(128), int, bigint and float, so I thought I could read everything as strings at first and tried numpy.fromiter:
select_sql = "select * from fb_web_active_group_members_user_mbkmeansclustering_ng_six_test"
count = cur.execute(select_sql)
if count:
    user_level_cluster_data = cur.fetchall()
    user_level_cluster_data_df = numpy.fromiter(user_level_cluster_data, dtype=numpy.str, count=-1)
But it raises an error:
File "F:/MyDocument/F/My Document/Training/Python/PyCharmProject/FaceBookCrawl/FB_group_user_stability.py", line 21, in get_pre_new_user_level_data
user_level_cluster_data_df = numpy.fromiter(user_level_cluster_data,dtype = numpy.str,count = -1)
ValueError: Must specify length when using variable-size data-type.
Could you please tell me the reason and how to resolve it? Also, what should I do if I want to read each column from the MySQL table with its own data type (rather than reading everything as strings first), for example varchar(128) as string, int as int, float as float?
Upvotes: 1
Views: 3855
Reputation: 249444
dtype needs to be the entire, full dtype for a whole record. Your current error occurs because NumPy strings are fixed-capacity, so you'd need to say dtype='S128', for example, to get strings up to 128 characters in capacity. But your actual dtype probably consists of several columns, so you might want something like this:
dtype=[('colA', 'i4'), ('colB', 'f8'), ('colC', 'S128')]
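To make the fixed-capacity point concrete, here is a small sketch (the sample string is made up). It shows that a bare str dtype has no size, which is what fromiter() complains about, while 'S128' reserves 128 bytes per element and shorter capacities silently truncate:

```python
import numpy as np

# A bare str dtype carries no length, so its item size is zero.
unsized = np.dtype(str)
print(unsized.itemsize)

# 'S128' reserves a fixed 128 bytes for every element.
fixed = np.dtype("S128")
print(fixed.itemsize)

# Values longer than the declared capacity are truncated on assignment.
short = np.array(["abcdef"], dtype="S3")
print(short[0])
```

So the capacity should be at least as large as the longest value in the column, which is why matching the varchar(128) width is a reasonable choice.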
Also note that fromiter() may not be helping you, since you're using fetchall(), which I think returns a list anyway. You can simply do:
np.array(user_level_cluster_data, dtype)
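A minimal sketch of that approach, assuming the rows come back as tuples (as DB-API cursors usually return them). The rows and the column names user_id, score and name are hypothetical stand-ins for the real fetchall() result, and 'U128' is used instead of 'S128' so that Python str values are stored as text:

```python
import numpy as np

# Hypothetical rows standing in for cur.fetchall(); one tuple per table row.
rows = ((1, 0.5, "alice"), (2, 1.25, "bob"))

# One field per column: bigint -> i8, float -> f8, varchar(128) -> U128.
record_dtype = np.dtype([("user_id", "i8"), ("score", "f8"), ("name", "U128")])

# A structured array is built from a list of tuples, so wrap the result in
# list() in case the driver returned a tuple of tuples.
records = np.array(list(rows), dtype=record_dtype)

# Each column is then available by field name with its own type.
print(records["user_id"])
print(records["name"])
```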
Or if you want to use fromiter(), you should pass it the count parameter and use lazy fetching instead of fetchall().
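A rough sketch of the lazy variant, under some loud assumptions: sqlite3 stands in for the MySQL connection so the example is self-contained, the table members and its columns are hypothetical, and the cursor is assumed to be iterable and to yield one tuple per row (which is common for DB-API drivers but worth checking for yours):

```python
import sqlite3

import numpy as np

# sqlite3 is only a stand-in for MySQL; table and column names are made up.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE members (user_id INTEGER, score REAL, name TEXT)")
cur.executemany(
    "INSERT INTO members VALUES (?, ?, ?)",
    [(1, 0.5, "alice"), (2, 1.25, "bob"), (3, 2.0, "carol")],
)

record_dtype = np.dtype([("user_id", "i8"), ("score", "f8"), ("name", "U128")])

# Ask the database for the row count first so fromiter() can preallocate.
cur.execute("SELECT COUNT(*) FROM members")
row_count = cur.fetchone()[0]

# Iterate over the cursor directly instead of calling fetchall(), so rows
# are pulled one at a time while the array is filled.
cur.execute("SELECT user_id, score, name FROM members ORDER BY user_id")
records = np.fromiter(cur, dtype=record_dtype, count=row_count)
conn.close()

print(records["name"])
```

With MySQL drivers whose execute() returns the number of rows, as in the question, that return value could be passed as count instead of running a separate COUNT(*) query.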
Upvotes: 1