Reputation: 18647
I have the following Python dict:
d = {'A-x': 1, 'A-y': 2,
'B-x': 3,
'C-x': 4, 'C-y': 5,
'D-x': 6, 'D-y': 7,
'E-x': 8}
The keys here follow a "Level-SubLevel" pattern.
There is no B-y or E-y key, so these can be considered "missing".
I'm trying to detect these "missing" key Levels, so my expected output would be the list:
['B', 'E']
So far I have the following working solution...
import numpy as np
from itertools import product
a = np.array([k.split('-') for k in d])
all_keys = ['-'.join(x) for x in list(product(set(a[:, 0]), set(a[:, 1])))]
missing_keys = [x.split('-')[0] for x in all_keys - d.keys()]
... but I feel there must be a better/cleaner solution, ideally using only the Python standard library.
I should also clarify that, in this particular case, the "SubLevel" portion of the key can only be one of two possible values: "x" or "y". Also, "...-x" will always exist; only "...-y" may be missing.
Any suggestions would be much appreciated.
Upvotes: 2
Views: 1450
Reputation: 195553
Since you clarified in your question that only '-y' keys might be missing, you can try this:
d = {'A-x': 1, 'A-y': 2,
'B-x': 3,
'C-x': 4, 'C-y': 5,
'D-x': 6, 'D-y': 7,
'E-x': 8}
# sorted() keeps the printed order stable, since set iteration order is not guaranteed
out = sorted(k for k in {k.split('-')[0] for k in d} if k + '-y' not in d)
print(out)
Prints:
['B', 'E']
Upvotes: 2
Reputation: 2427
I guess that the most efficient (and compact) solution would be to use groupby from itertools:
from itertools import groupby
groups = [[key, len(list(val))] for key,val in groupby(d, lambda x: x[0])]
m = max(item[1] for item in groups)
missing = [item[0] for item in groups if item[1] < m]
Result:
missing --> ['B', 'E']
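One caveat worth noting: groupby only merges adjacent items, and x[0] looks at the first character only. A minimal sketch of a variant that does not depend on insertion order or on single-character levels follows; the helper name level is hypothetical and not part of the original answer.

```python
from itertools import groupby

d = {'A-x': 1, 'A-y': 2,
     'B-x': 3,
     'C-x': 4, 'C-y': 5,
     'D-x': 6, 'D-y': 7,
     'E-x': 8}


def level(key):
    # Hypothetical helper: the part of the key before the first '-'.
    return key.split('-', 1)[0]


# Sort by level first, because groupby only groups consecutive equal keys.
groups = [(name, len(list(keys)))
          for name, keys in groupby(sorted(d, key=level), key=level)]

# A level with fewer keys than the fullest level is missing a sub-level.
most = max(count for _, count in groups)
missing = [name for name, count in groups if count < most]
print(missing)
```

The sort adds O(n log n) work, so this trades a little speed for not relying on the dict being built in grouped order.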
Upvotes: 1
Reputation: 11
Here is a solution that reports the missing y-values, as in your example. If you expect different behaviour in other cases, please clarify them.
missing = []
for k in d:
    level = k.split('-')[0]
    # Record the level when its '-y' key is absent
    if level + '-y' not in d:
        missing.append(level)
Hope this helps
Upvotes: 1
Reputation: 2086
Check out the solution below:
# Collect the level part of every key
lst = [i.split("-")[0] for i in d]
for i in set(lst):
    countChar = lst.count(i)
    # A level that appears only once has no "-y" key
    if countChar == 1:
        print(i)
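The same counting idea can also be sketched with collections.Counter, which tallies every level in one pass instead of calling lst.count once per level. This is only an illustration under the question's assumption that each level has at most two keys; the names counts and missing are mine.

```python
from collections import Counter

d = {'A-x': 1, 'A-y': 2,
     'B-x': 3,
     'C-x': 4, 'C-y': 5,
     'D-x': 6, 'D-y': 7,
     'E-x': 8}

# Count how many keys each level has.
counts = Counter(key.split('-')[0] for key in d)

# Assumption from the question: a level seen only once lacks its "-y" key.
missing = [level for level, n in counts.items() if n == 1]
print(missing)
```

Counter keeps first-seen order, so the result follows the order of the keys in d.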
Upvotes: 1
Reputation: 2528
Using only the Python standard library, I can offer this solution:
# Generate list of list of section/subsection pairs
a = [k.split('-') for k in d.keys()]
# Generate set of sections
sec = set([x[0] for x in a])
# {'A', 'D', 'C', 'B', 'E'}
# Generate set of subsections
subsec = set([x[1] for x in a])
# {'y', 'x'}
# Find missing keys by checking all combinations (saving only the section)
missing_keys = [s for s in sec for ss in subsec if [s, ss] not in a]
# ['B', 'E']
Upvotes: 2
Reputation: 54293
You don't need either numpy or itertools:
d = {'A-x': 1, 'A-y': 2,
'B-x': 3,
'C-x': 4, 'C-y': 5,
'D-x': 6, 'D-y': 7,
'E-x': 8}
first_letters = set(k.split('-')[0] for k in d)
# {'A', 'B', 'C', 'D', 'E'}
second_letters = set(k.split('-')[1] for k in d)
# {'x', 'y'}
all_keys = [f'{first_letter}-{second_letter}' for first_letter in first_letters
for second_letter in second_letters]
# ['A-y', 'A-x', 'E-y', 'E-x', 'B-y', 'B-x', 'C-y', 'C-x', 'D-y', 'D-x']
missing_keys = set(x.split('-')[0] for x in all_keys - d.keys())
# {'B', 'E'}
Note that the levels collected for missing_keys are not necessarily unique (try adding 'F-z'), so I took the liberty of converting the result to a set.
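To make the 'F-z' remark concrete, here is a small sketch that adds a hypothetical 'F-z': 9 entry (my own test value, not from the question) and compares the plain list of levels with the deduplicated result.

```python
d = {'A-x': 1, 'A-y': 2,
     'B-x': 3,
     'C-x': 4, 'C-y': 5,
     'D-x': 6, 'D-y': 7,
     'E-x': 8,
     'F-z': 9}  # hypothetical extra key introducing a third sub-level

first_letters = {k.split('-')[0] for k in d}
second_letters = {k.split('-')[1] for k in d}

# Every Level-SubLevel combination that is not a key of d.
absent = sorted(f'{first}-{second}'
                for first in first_letters
                for second in second_letters
                if f'{first}-{second}' not in d)

as_list = [k.split('-')[0] for k in absent]  # contains repeated levels
as_set = sorted(set(as_list))                # each level reported once

print(as_list)
print(as_set)
```

With three sub-levels, a level such as 'B' lacks both 'B-y' and 'B-z', which is why it shows up twice in the list version.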
Upvotes: 2
Reputation: 312370
Without using numpy, you can get your all_keys list doing something like this:
all_keys = ['-'.join(x) for x in product(
set(y.split('-')[0] for y in d.keys()),
set(z.split('-')[1] for z in d.keys())
)]
Everything else remains the same. It's not any "cleaner", but it avoids pulling in all of numpy for a relatively simple task.
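Put together with the rest of the question's code, a complete standard-library version might look like the sketch below. The names levels and sublevels are mine, and the final sorted() is only there to make the output order predictable.

```python
from itertools import product

d = {'A-x': 1, 'A-y': 2,
     'B-x': 3,
     'C-x': 4, 'C-y': 5,
     'D-x': 6, 'D-y': 7,
     'E-x': 8}

levels = {k.split('-')[0] for k in d}
sublevels = {k.split('-')[1] for k in d}

# Every possible "Level-SubLevel" key, built without numpy.
all_keys = {'-'.join(pair) for pair in product(levels, sublevels)}

# Keys that should exist but do not, reduced to their level.
missing_keys = sorted({k.split('-')[0] for k in all_keys - set(d)})
print(missing_keys)
```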
Upvotes: 2