Reputation: 1460
I wanted to extract the value of a column given another column with id's of a different dataset.
DF-1:
ID A B
1 cat 22
2 dog 33
3 mamal 44
4 rat 55
5 rabbit 66
6 puppy 77
DF-2:
name fav_animal
x 1,2,3
y 2,3
z 3,4
I wanted to see the fav animals of x in a new list say name_animal.
code:
#storing all the id's of x frist
list_id = []
name_animal = []
for i in list_ids:
name_animal.append(df1.loc[df1.id == i, 'A'].values.to_list()
Output:
list_id = [1,2,3]
name_animal = ['cat','dog','mamal']
Upvotes: 1
Views: 74
Reputation: 1824
Something like this?
for i in df2.fav_animal.tolist():
print(df1.loc[map(int, i.split(","))]["A"].tolist())
Output:
['dog', 'mamal', 'rat']
['mamal', 'rat']
['rat', 'rabbit']
Alternative:
print([df1.loc[map(int, i.split(","))]["A"].tolist() for i in df2.fav_animal.tolist()])
Output:
[['dog', 'mamal', 'rat'], ['mamal', 'rat'], ['rat', 'rabbit']]
Upvotes: 0
Reputation: 4792
I think what you're looking for is this:
df1 = pd.DataFrame({'ID':np.arange(1, 7),
'A': ['cat', 'dog', 'mamal', 'rat', 'rabbit', 'puppy'],
'B': [22, 33, 44, 55, 66, 77]})
df2 = pd.DataFrame({'name': ['x', 'y', 'z'],
'fav_animal': ['1,2,3', '2,3', '3,4']})
df2.loc[df2.name == 'x', 'fav_animal'].str.split(',')[0]
['1', '2', '3']
Returns a list of strings. So you need to convert values to integers using map function.
mask = map(int, df2.loc[df2.name == 'x', 'fav_animal'].str.split(',')[0])
df1.loc[df1.ID.isin(mask), 'A'].values.tolist()
>['cat', 'dog', 'mamal']
Upvotes: 1
Reputation: 862681
First check find fav_animal
values with boolean indexing
, next
and iter
is for return empty list if no name matched.
a = next(iter(df2.loc[df2['name'] == 'x', 'fav_animal']), [])
Then split values and convert them to integers:
list_id = list(map(int, a.split(',')))
print (list_id)
[1, 2, 3]
And last filter by isin
first DataFrame
:
name_animal = df1.loc[df1.ID.isin(list_id), 'A'].values.tolist()
print (name_animal)
['cat', 'dog', 'mamal']
Upvotes: 2
Reputation: 88236
You can use this function for example:
def get_names(df, df2, name):
indices = np.asarray(df2.loc[name].values[0].split(',')).astype(int)
return indices.tolist(), df.loc[indices,:]['A'].tolist()
So, for example if you want the fav_animals
for name x
:
list_id, name_animal = get_names(df,df2, 'x')
print(list_id)
[1, 2, 3]
print(name_animal)
['dog', 'mamal', 'rat']
Upvotes: 1