Reputation:
import numpy as np
a = [1,2,3,4,5]
b = [5,2,3,6,7]
c = [5,2,3,8,9]
Get the most frequent numbers:
data = np.array([a,b,c]).flatten()
print(data)
values, counts = np.unique(data, return_counts=True)
for value, frequency in zip(values, counts):
    print(value, frequency)
How can I get the most frequent pair of consecutive numbers? The answer should be [2,3], but how do I compute it programmatically?
Upvotes: 0
Views: 223
Reputation: 46849
You could use collections.Counter and iterate over data in consecutive pairs:
import numpy as np
from collections import Counter
a = [1,2,3,4,5]
b = [5,2,3,6,7]
c = [5,2,3,8,9]
data = np.array([a,b,c]).flatten()
# use a new name so the list c is not overwritten
pair_counts = Counter(zip(data, data[1:]))
print(pair_counts.most_common(1))
# [((2, 3), 3)]
This tells you that the pair (2, 3) occurred 3 times.
A bit more detail: data[1:] is your data without its first element.
zip(data, data[1:]) is then used to generate the consecutive pairs (as tuples):
(1, 2), (2, 3), (3, 4), (4, 5), (5, 5), (5, 2), (2, 3), ...
The Counter then just counts how many times each pair appears and stores the counts in a dict-like object:
Counter({(2, 3): 3, (5, 2): 2, (1, 2): 1, (3, 4): 1, (4, 5): 1, (5, 5): 1, (3, 6): 1,
(6, 7): 1, (7, 5): 1, (3, 8): 1, (8, 9): 1})
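The same zip trick extends to runs longer than two, in case that is useful. A minimal sketch, assuming plain Python lists; the helper name most_common_run and its n parameter are made up for illustration and are not part of collections or numpy:

```python
from collections import Counter

def most_common_run(seq, n=2):
    """Return (run, count) for the most frequent run of n consecutive items.

    Hypothetical helper; raises IndexError if seq has fewer than n items.
    """
    # seq[0:], seq[1:], ..., seq[n-1:] zipped together give sliding windows of length n
    windows = zip(*(seq[i:] for i in range(n)))
    return Counter(windows).most_common(1)[0]

data = [1, 2, 3, 4, 5, 5, 2, 3, 6, 7, 5, 2, 3, 8, 9]  # a, b and c flattened
print(most_common_run(data))       # ((2, 3), 3)
print(most_common_run(data, n=3))  # ((5, 2, 3), 2)
```

With n=2 this is exactly zip(data, data[1:]) from above.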
Update: if you do not want pairs that span different lists, you can do this:
data = (a, b, c)  # a, b, c are the original lists
pair_counts = Counter()
for d in data:
    pair_counts.update(zip(d, d[1:]))
print(pair_counts)
Or directly:
pair_counts = Counter(pair for d in data for pair in zip(d, d[1:]))
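To see what the per-list version actually changes, you can subtract one Counter from the other. This is only a sketch using the three lists from the question; the variable names (lists, flat, boundary_pairs) are my own:

```python
from collections import Counter

a = [1, 2, 3, 4, 5]
b = [5, 2, 3, 6, 7]
c = [5, 2, 3, 8, 9]
lists = (a, b, c)

# Pairs counted over the joined data, including pairs that cross list boundaries
flat = a + b + c
flat_counts = Counter(zip(flat, flat[1:]))

# Pairs counted inside each list only
per_list_counts = Counter(pair for d in lists for pair in zip(d, d[1:]))

# Counter subtraction keeps only positive counts, so what remains are
# the pairs that exist only because the lists were joined end to end
boundary_pairs = flat_counts - per_list_counts
print(boundary_pairs)  # Counter({(5, 5): 1, (7, 5): 1})
```

For this data the top pair (2, 3) is the same either way; only the two boundary pairs disappear.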
Upvotes: 3
Reputation: 5696
You can use Counter, as suggested by @hiro protagonist, but since you want to treat one row at a time, you have to apply it along the rows.
import numpy as np
from collections import Counter
Apply along rows using numpy:
data = np.array([a,b,c])
np.apply_along_axis(lambda x: Counter(zip(x, x[1:])), 1, data).sum().most_common(1)
[((2, 3), 3)]
Or, if using pandas:
import pandas as pd
data = np.array([a,b,c])
df = pd.DataFrame(data)
Now, apply Counter along rows:
df.apply(lambda x: Counter(zip(x, x[1:])), axis=1).sum().most_common(1)
[((2, 3), 3)]
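If you would rather stay in NumPy and skip Counter, the np.unique call from the question can also count pairs when given axis=0. This is a different technique from the Counter approach above, shown only as a sketch; it assumes all rows have the same length so they fit in a 2-D array:

```python
import numpy as np

data = np.array([[1, 2, 3, 4, 5],
                 [5, 2, 3, 6, 7],
                 [5, 2, 3, 8, 9]])

# Pair each element with its right neighbour within the same row:
# shape (3, 4, 2), then flattened to a (12, 2) array of pairs
pairs = np.stack([data[:, :-1], data[:, 1:]], axis=-1).reshape(-1, 2)

# axis=0 makes np.unique treat each row (one pair) as a single item
unique_pairs, counts = np.unique(pairs, axis=0, return_counts=True)
best = unique_pairs[counts.argmax()]
print(best.tolist(), int(counts.max()))  # [2, 3] 3
```

Note that argmax returns only the first maximum, so ties are not reported.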
Upvotes: 0