datanewbie96

Reputation: 67

Calculating conditional probability from list

I'm trying to calculate conditional probabilities from a list of lists. Say I have the following list:

[[ 1, 0, 0, 0, 1, 5],
 [ 0, 1, 0, 1, 0, 3],
 [ 1, 0, 0, 0, 1, 5],
 [ 0, 0, 1, 1, 0, 2],
 [ 0, 0, 1, 0, 1, 1]]

Each 'column' represents a binary attribute, and the last 'column' is the class attribute. To find the conditional probability of an attribute, I need to calculate P(X|Y).

Using a plain Python list, how can I

  1. count how often the attribute occurs among the rows that belong to class Y?
  2. count the total frequency of the class?

The above is easy to do in pandas, but I have no idea how to tackle it with a plain Python list.

Upvotes: 1

Views: 704

Answers (2)

user2390182

Reputation: 73470

You could build a data structure along these lines:

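from collections import defaultdict

# d[class][column] == [count of 0s, count of 1s]
d = defaultdict(lambda: defaultdict(lambda: [0, 0]))

# matrix is the list of lists from the question; the last item of each row is the class
for *values, key in matrix:
    for i, v in enumerate(values):
        d[key][i][v] += 1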

And calculate the conditional probability like so:

def prob(k, i):
    false, true = d[k][i]  # counts of vals 0/1 in col i for class k
    return true / (true + false)

>>> prob(5, 3)  # probability that column 3 is 1, given class 5
0.0
>>> prob(5, 4)
1.0
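For completeness, here is a self-contained sketch of the same idea that runs against the data from the question. The names `counts` and `safe_prob` are hypothetical and not part of the snippet above. The only behavioural difference is an assumption about what you want for a class that never occurs: this version returns None instead of raising ZeroDivisionError.

```python
from collections import defaultdict

# Data copied from the question; the last item of each row is the class.
matrix = [
    [1, 0, 0, 0, 1, 5],
    [0, 1, 0, 1, 0, 3],
    [1, 0, 0, 0, 1, 5],
    [0, 0, 1, 1, 0, 2],
    [0, 0, 1, 0, 1, 1],
]

# counts[class][column] == [count of 0s, count of 1s]
counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
for *values, key in matrix:
    for i, v in enumerate(values):
        counts[key][i][v] += 1


def safe_prob(k, i):
    """Hypothetical helper: P(column i == 1 | class k), or None if there is no data."""
    if k not in counts:  # membership test does not create a new entry
        return None
    false, true = counts[k][i]
    total = false + true
    return true / total if total else None


print(safe_prob(5, 3))
print(safe_prob(5, 4))
print(safe_prob(4, 0))  # class 4 never occurs in the data
```

Whether None, 0.0, or a smoothed estimate is the right answer for an unseen class depends on what you do with the result afterwards.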

Upvotes: 2

Mustafa Aydın

Reputation: 18306

frequency of the attribute given it is a Y class?

class_ = 3
attr_index = 1
attr_freq_given_cls = sum(a_list[attr_index]
                          for a_list in list_of_lists
                          if a_list[-1] == class_)

Since attributes are from {0, 1}, summing yields the number of occurrences of 1; and indexing with -1 gives the class label.

count the total frequency for the class?

from collections import Counter
class_freqs = Counter(a_list[-1] for a_list in list_of_lists)
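Putting the two counts together gives the conditional probability itself. This is a minimal sketch rather than a definitive implementation: `cond_prob` is a hypothetical helper name, `list_of_lists` is assumed to be the data from the question, and the function assumes the requested class occurs at least once (otherwise the division raises ZeroDivisionError).

```python
from collections import Counter

# Data copied from the question; the last item of each row is the class label.
list_of_lists = [
    [1, 0, 0, 0, 1, 5],
    [0, 1, 0, 1, 0, 3],
    [1, 0, 0, 0, 1, 5],
    [0, 0, 1, 1, 0, 2],
    [0, 0, 1, 0, 1, 1],
]


def cond_prob(rows, attr_index, class_):
    """Hypothetical helper: P(attribute == 1 | class), estimated from counts."""
    # How many rows belong to each class.
    class_freqs = Counter(row[-1] for row in rows)
    # How many rows of the given class have the attribute set to 1.
    attr_freq_given_cls = sum(row[attr_index] for row in rows if row[-1] == class_)
    return attr_freq_given_cls / class_freqs[class_]


print(cond_prob(list_of_lists, 1, 3))
print(cond_prob(list_of_lists, 3, 5))
```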

Good luck with that naive Bayes :)

Upvotes: 1
