Reputation: 11
I am new to python & pandas , i have an issue when i populate dynamically df2 (dataframes) it is not giving proper results on the unique count. I have product file where Product_ID , VendorID is the key (composite key)
df=pd.read_csv("Product_file.csv")
df2 = df['Product_ID'] + df['VendorID']
print(df2)
print('type....' , type(df2))
uniq = df2.unique().size
print('Uniue values ' , str(uniq)) >>> getting the result as 9 , it is expected .
however when i populate dynamically by reading from the table (sqlite3) which has PK as one of the column which is getting in to index 4
Index =4 will look like below
PK-KEY(Table-column)
Product_ID, VendorID
Store_no, Loc_no
Dist_no
result_list = row[4].strip().split(',')
#print('length of list :', len(result_list))
values = 0
unique_key = ' '
overall_key = ' '
values = range(len(result_list))
index = 0
i = 1
for index in values:
#print('index value >>> ', i)
if i == len(result_list):
df_concat = df_concat + 'df['+ "'" + result_list[index] + "'" + ']'
break
elif (i < len(result_list)):
df_concat = df_concat + ' df['+ "'" + result_list[index] + "'" + ']' + ' +'
i = i + 1
print(df_concat)
#print(type(df_concat))
df_strip = df_concat.strip(' ')
print('df_strip ->>>>>' , df_strip)
df_Series = pd.Series(df_strip)
unique_key = df_Series.unique().size
overall_key = df_Series.count()
Getting value as 1 , not sure how to resolve it .Your help is very much appreciated.
Upvotes: 0
Views: 46
Reputation: 11
I got the answer by searching thru stack over flow , PK keys stored in the result_list (list) type and i have used the .apply method to get the counts.
df['Combined'] = df[result_list].apply(lambda row: '_'.join(row.values.astype(str)), axis=1)
print('combined values ======= ', df['Combined'])
overall_count = len(df['Combined'])
unique_count = len(df['Combined'].unique())
Upvotes: 1