Reputation: 49
The code:
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
from apyori import apriori
dataset = [['egg','bread'],['milk'],['apple','milk'],['diapers'],['orange','egg','milk']]
te = TransactionEncoder()
te_ary = te.fit(dataset).transform(dataset)
final_df = pd.DataFrame(te_ary, columns=te.columns_)
print(final_df)
frq_itemsets= apriori(final_df, min_support=0.5, use_colnames=True)
association_results = list(frq_itemsets)
print(association_results)
The output:
apple bread china egg embroidery milk
0 False True False True False False
1 False False False False False True
2 True False False False False True
3 False False False False True False
4 False False True True False True
[RelationRecord(items=frozenset({'a'}), support=0.5, ordered_statistics=[OrderedStatistic(items_base=frozenset(), items_add=frozenset({'a'}), confidence=0.5, lift=1.0)]), RelationRecord(items=frozenset({'e'}), support=0.6666666666666666, ordered_statistics=[OrderedStatistic(items_base=frozenset(), items_add=frozenset({'e'}), confidence=0.6666666666666666, lift=1.0)]), RelationRecord(items=frozenset({'i'}), support=0.5, ordered_statistics=[OrderedStatistic(items_base=frozenset(), items_add=frozenset({'i'}), confidence=0.5, lift=1.0)])]
What am I doing wrong? I've searched everywhere on SO but I can't seem to find a question like this.
Thanks in advance. I hope it's not a stupid question. Can anyone help?
Upvotes: 1
Views: 562
Reputation: 11
I ran into this same problem! For me, the solution was to one-hot encode the DataFrame properly. Put simply, and depending on your data set, that means converting it to a list of lists of strings first and passing that list to TransactionEncoder.
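# cast every cell to str so all items share one type
df = df.astype(str)
# one inner list per row, i.e. per transaction
str_list = df.values.tolist()
te_ary = te.fit(str_list).transform(str_list)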
That fixed it for me!
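For reference, the one-hot step itself can be sketched in plain Python. This is a hand-rolled stand-in for what the encoder produces, not mlxtend's actual code; the helper name one_hot_encode is made up for illustration, the columns are sorted here only to keep the order deterministic, and the sample data is the dataset from the question.

```python
def one_hot_encode(transactions):
    """Return (columns, rows): one boolean per known item for each transaction."""
    # Collect every distinct item and sort for a deterministic column order.
    columns = sorted({item for t in transactions for item in t})
    # One row per transaction: True where the transaction contains the item.
    rows = [[col in set(t) for col in columns] for t in transactions]
    return columns, rows


dataset = [['egg', 'bread'], ['milk'], ['apple', 'milk'],
           ['diapers'], ['orange', 'egg', 'milk']]

columns, rows = one_hot_encode(dataset)
print(columns)
for row in rows:
    print(row)
```

The input has to be a list of lists: a bare string such as 'milk' would be split into its characters instead of being treated as one item.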
Upvotes: 1
Reputation: 29635
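I believe apriori is being misused here: the expected input and the accepted arguments depend on which package you import it from. In the question, apriori comes from apyori, which expects a list of transactions rather than a one-hot DataFrame. Iterating over a DataFrame yields its column names, so each name is treated as a transaction and its characters as items, which is why the result contains 'a', 'e' and 'i'. The difference between the two packages is shown below.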
import pandas as pd
from mlxtend.preprocessing import TransactionEncoder
dataset = [['egg', 'bread'], ['milk'], ['apple', 'milk'],
           ['diapers'], ['orange', 'egg', 'milk']]
te = TransactionEncoder()
te_ary = te.fit(dataset).transform(dataset)
final_df = pd.DataFrame(te_ary, columns=te.columns_)
print(final_df)
from mlxtend.frequent_patterns import apriori
# this version takes the one-hot DataFrame and returns a DataFrame,
# so there is no need to wrap the result in a list
df_freq = apriori(final_df, min_support=0.5, use_colnames=True)
print(df_freq)
#    support itemsets
# 0      0.6   (milk)
from apyori import apriori
# this version takes the raw list of transactions and returns a generator,
# hence the use of list to get the result
print(list(apriori(dataset, min_support=0.5)))
# [RelationRecord(items=frozenset({'milk'}), support=0.6,
# ordered_statistics=[OrderedStatistic(items_base=frozenset(),
# items_add=frozenset({'milk'}),
# confidence=0.6, lift=1.0)])]
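To see where the single letters in the question come from, here is a rough plain-Python sketch of single-item support counting. It is a simplified stand-in, not apyori's implementation; the helper name single_item_support is made up for illustration, and column_labels is copied from the table printed in the question.

```python
from collections import Counter


def single_item_support(transactions):
    """Return {item: support} for single items; each transaction is any iterable of items."""
    item_sets = [set(t) for t in transactions]
    counts = Counter(item for items in item_sets for item in items)
    return {item: n / len(item_sets) for item, n in counts.items()}


def frequent_items(transactions, min_support):
    """Keep only the items whose support reaches min_support."""
    support = single_item_support(transactions)
    return {item: s for item, s in sorted(support.items()) if s >= min_support}


# What the question effectively passed in: the DataFrame's column labels.
# Each label is a string, so its "items" are its characters.
column_labels = ["apple", "bread", "china", "egg", "embroidery", "milk"]
from_labels = frequent_items(column_labels, min_support=0.5)
print(from_labels)

# What a transaction-based apriori expects: the raw list of transactions.
dataset = [['egg', 'bread'], ['milk'], ['apple', 'milk'],
           ['diapers'], ['orange', 'egg', 'milk']]
from_transactions = frequent_items(dataset, min_support=0.5)
print(from_transactions)
```

The first call reproduces the supports from the question (0.5 for 'a', about 0.667 for 'e', 0.5 for 'i'), while the second leaves only milk at 0.6, matching the results above.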
Upvotes: 1