Reputation: 67
I am trying to calculate a conditional probability from a given list. Say I have the following list:
[[ 1, 0, 0, 0, 1, 5],
[ 0, 1, 0, 1, 0, 3],
[ 1, 0, 0, 0, 1, 5],
[ 0, 0, 1, 1, 0, 2],
[ 0, 0, 1, 0, 1, 1]]
Each 'column' represents a binary attribute; the last 'column' is the class attribute. To find the conditional probability of an attribute, I need to calculate P(X|Y).
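With a plain Python list, how can I get the frequency of the attribute given that it is a Y class, and count the total frequency for the class?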
The above is easily doable in pandas, but I am actually clueless on how to tackle it with a Python list.
Upvotes: 1
Views: 704
Reputation: 73470
You could build a data structure along these lines:
from collections import defaultdict

# d[class][column] -> [count of 0s, count of 1s]
d = defaultdict(lambda: defaultdict(lambda: [0, 0]))
for *values, key in matrix:  # matrix is the list of rows from the question
    for i, v in enumerate(values):
        d[key][i][v] += 1
And calculate the conditional probability like so:
def prob(k, i):
    false, true = d[k][i]  # counts of vals 0/1 in col i for class k
    return true / (true + false)
>>> prob(5, 3) # for class 5, column 3 is this likely to be 1
0.0
>>> prob(5, 4)
1.0
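For reference, here is a self-contained sketch of the same idea run against the list from the question. The names build_counts and cond_prob are made up for this illustration, and returning None for a class that never appears is an assumption on my part (the prob function above would raise ZeroDivisionError in that case).

```python
from collections import defaultdict

# The list from the question: five binary attributes, then the class label.
matrix = [
    [1, 0, 0, 0, 1, 5],
    [0, 1, 0, 1, 0, 3],
    [1, 0, 0, 0, 1, 5],
    [0, 0, 1, 1, 0, 2],
    [0, 0, 1, 0, 1, 1],
]


def build_counts(rows):
    """Return counts[class][column] = [number of 0s, number of 1s]."""
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for *values, key in rows:
        for i, v in enumerate(values):
            counts[key][i][v] += 1
    return counts


def cond_prob(counts, k, i):
    """Estimate P(column i == 1 | class == k), or None if there is no data."""
    if k not in counts:  # membership test does not create a new entry
        return None
    false, true = counts[k][i]
    total = true + false
    return true / total if total else None


counts = build_counts(matrix)
print(cond_prob(counts, 5, 3))  # 0.0
print(cond_prob(counts, 5, 4))  # 1.0
print(cond_prob(counts, 9, 0))  # None, class 9 never occurs
```

Passing counts explicitly instead of relying on a global d just makes the two steps easier to test separately.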
Upvotes: 2
Reputation: 18306
frequency of the attribute given it is a Y class?
class_ = 3
attr_index = 1
attr_freq_given_cls = sum(a_list[attr_index]
                          for a_list in list_of_lists
                          if a_list[-1] == class_)
Since attributes are from {0, 1}, summing yields the number of occurrences, and indexing with -1 gives the label.
count the total frequency for the class?
from collections import Counter
class_freqs = Counter(a_list[-1] for a_list in list_of_lists)
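Dividing the first count by the second gives the conditional probability estimate. A minimal sketch, assuming list_of_lists is the list from the question; the function name attr_prob_given_class is hypothetical, and returning None for an unseen class is my own choice rather than anything the question asks for.

```python
from collections import Counter

# Assumed to be the list from the question.
list_of_lists = [
    [1, 0, 0, 0, 1, 5],
    [0, 1, 0, 1, 0, 3],
    [1, 0, 0, 0, 1, 5],
    [0, 0, 1, 1, 0, 2],
    [0, 0, 1, 0, 1, 1],
]


def attr_prob_given_class(rows, attr_index, class_):
    """Estimate P(attribute == 1 | class) as a ratio of the two counts."""
    class_freqs = Counter(row[-1] for row in rows)
    if class_freqs[class_] == 0:  # Counter returns 0 for missing keys
        return None
    attr_freq_given_cls = sum(row[attr_index]
                              for row in rows
                              if row[-1] == class_)
    return attr_freq_given_cls / class_freqs[class_]


print(attr_prob_given_class(list_of_lists, 1, 3))  # 1.0
print(attr_prob_given_class(list_of_lists, 0, 2))  # 0.0
```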
good luck with that naive Bayes :)
Upvotes: 1