Adrian

Reputation: 9793

How to find the mean of an occurrence in a list in Python

a = [[1.], [-1.], [1.]]

I have the above list. I want to find the value: (# of -1)/length of a. For the above example, the value is 1/3.

If we have

a = [[1.], [-1.], [1.], [-1.]]

then the value is 1/2.

How can I perform the above calculation in Python?

I've tried a.count(-1)/a.shape[0], but that did not seem to work for list objects.

Upvotes: 0

Views: 102

Answers (4)

thehumaneraser

Reputation: 620

Two things. First, your list contains lists, so a.count(-1) searches the outer list for the number -1 and finds no match. Instead, search for the list that contains -1, like so: a.count([-1.]).

Second, a regular list has no shape attribute; that belongs to ndarrays. Use len(a) instead.

So instead of using a.count(-1)/a.shape[0] you should use a.count([-1.])/len(a).
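
As a minimal sketch of that corrected expression, here it is wrapped in a small helper. The name proportion_of and the empty-list check are made up for illustration; they are not part of the question.

```python
def proportion_of(rows, target):
    """Return the fraction of elements in rows that equal target."""
    if not rows:
        # Hypothetical guard: avoid dividing by zero on an empty list.
        raise ValueError("rows must not be empty")
    return rows.count(target) / len(rows)


a = [[1.], [-1.], [1.]]
print(proportion_of(a, [-1.]))  # one of the three inner lists matches
print(a.count(-1))              # 0, because a bare number never equals a list
```

The same call works unchanged for the four-element list in the question, where two of the four inner lists match.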

Edit: When a is an ndarray

If a is in fact an ndarray, this is easy to do, but it looks different from the plain Python list case. In the simple case where each row holds a single real number, the solution by Andy L. works fine.

However, recall that ndarrays are basically matrices, and checking equality as Andy L. suggests (a == -1) checks element-wise equality over the entire matrix. This becomes a problem when you want to count the rows of an ndarray that match a given row of length greater than 1. (The solution I suggested above still works for counting the lists in a Python list that match a given list of arbitrary length.)

An example:

Suppose we have an array

import numpy as np

a = np.array([
    [1, 2],
    [2, 2],
    [1, 3],
    [1, 2]
])

And we want to find the proportion of rows equal to [1, 2] (in this case .5). The solution proposed by Andy L. will not quite work in this case because if we try a == [1, 2] we will get the element-wise truth array:

[[True, True],
 [False, True],
 [True, False],
 [True, True]]

Calling .mean() on this array will give us 6/8 = 0.75, not what we want. So we must add an extra step:

temp = (a == [1, 2]).all(axis=1)
proportion = np.sum(temp) / temp.shape[0]

Calling .all(axis=1) reduces the array to a 1-dimensional array in which each value is True if the corresponding row of a == [1, 2] was [True, True] and False otherwise. This gives the desired result when checking row equality for rows of arbitrary length.
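
For comparison, the plain-list approach from the first part of this answer needs no extra step, because list.count compares whole inner lists. Below is a small sketch using the same rows as a nested list; the variable names are illustrative, and the second half only mimics the element-wise comparison in pure Python to show where 0.75 comes from.

```python
rows = [[1, 2], [2, 2], [1, 3], [1, 2]]
target = [1, 2]

# Whole-row comparison: count the inner lists equal to target.
proportion = rows.count(target) / len(rows)
print(proportion)  # 0.5

# Element-wise comparison, mirroring what a == [1, 2] does on an ndarray.
elementwise = [[x == t for x, t in zip(row, target)] for row in rows]
flat = [value for row in elementwise for value in row]
print(sum(flat) / len(flat))  # 0.75
```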

Upvotes: 1

Andy L.

Reputation: 25239

Since you say a is a numpy.ndarray, simply compare it against -1 and take the mean to get the desired output:

a = np.array([[1.], [-1.], [1.]])

In [1144]: (a == -1).mean()
Out[1144]: 0.3333333333333333

In [1146]: a = np.array([[1.], [-1.], [1.], [-1.]])

In [1147]: (a == -1).mean()
Out[1147]: 0.5

Upvotes: 0

Sebastien D

Reputation: 4482

You may try:

a = [[1.], [-1.], [1.], [-1.]]
len([x[0] for x in a if x[0] == -1]) / len(a)

Output

0.5

Upvotes: 0

Helios

Reputation: 715

a.count([-1.]) / len(a)

Be careful with float equality in general, though.
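
To illustrate the float-equality caveat, here is a hedged sketch that counts matches within a tolerance using math.isclose instead of exact comparison. The helper name proportion_close, the abs_tol default, and the sample values are assumptions chosen for the example, not something from the question.

```python
import math


def proportion_close(rows, target, abs_tol=1e-9):
    """Fraction of one-element rows whose value is within abs_tol of target."""
    matches = sum(
        1 for row in rows if math.isclose(row[0], target, abs_tol=abs_tol)
    )
    return matches / len(rows)


# 0.1 + 0.2 is not exactly 0.3 in binary floating point.
a = [[0.1 + 0.2], [0.3], [1.0]]
print(a.count([0.3]) / len(a))   # exact comparison finds only one match
print(proportion_close(a, 0.3))  # tolerant comparison finds two
```

For values such as -1.0 and 1.0 that are written as literals, exact comparison is fine; the tolerance matters mainly when the numbers come out of arithmetic.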

Upvotes: 0
