Reputation: 1
I have one dataframe called df_companies.
Òutput:
company brand
0 VW-Konzern volkswagen
1 VW-Konzern audi
2 VW-Konzern bentley
3 VW-Konzern bugatti
4 VW-Konzern lamborghini
In the next step i receive two strings within a for loop through some string formatting. After that i am trying to check, if the 'companyName' string is contained in the column 'brand' of the dataframe 'df_companies'.
If so, then add the logo_url string two the dataframe df_companies into column 'image_url'.
for image in images:
companyName = image['alt'].lower().split(' ', 1)[0]
logo_url = image['src']
df_companies['image_url'] = np.where(df_companies['brand'].str.contains(companyName), logo_url, 'other')
This works so far for the first row. For the rest of the remaining rows it only enters the string 'other' as defined above.
Output:
company brand image_url
0 VW-Konzern volkswagen https://imgr.volkswagen.png
1 VW-Konzern audi Other
2 VW-Konzern bentley Other
3 VW-Konzern bugatti Other
4 VW-Konzern lamborghini Other
What i would like to achieve is the following output:
Output:
company brand image_url
0 VW-Konzern volkswagen https://imgr.volkswagen.png
1 VW-Konzern audi https://imgr.audi.png
2 VW-Konzern bentley https://imgr.bentley.png
3 VW-Konzern bugatti https://imgr.audi.png
4 VW-Konzern lamborghini https://imgr.audi.png
The issue is that the 'companyName' string and the string in the 'brand' column of dataframe 'df_companies' only match partially, why i cannot use the common merge function.
Any ideas how to solve that issue?
Thanks in advance for any help!
Upvotes: 0
Views: 54
Reputation: 1
I could find a solution. I had to put in df_companies['image_url']
into the else condition of np.where
insteaf of 'other'
.
df_companies['image_url'] = ''
for image in images:
companyName = image['alt'].lower().split(' ', 1)[0]
logo_url = image['src']
df_companies['image_url'] = np.where(df_companies['brand'].str.contains(companyName), logo_url, df_companies['image_url'],)
Afterwards i got the expected output:
Output:
company brand image_url
0 VW-Konzern volkswagen https://imgr1.auto-motor-und-sport.de/Volkswagen-Logo-brandtreeIndexSmall-f4ae98d2-941118.jpg
1 VW-Konzern audi https://imgr1.auto-motor-und-sport.de/Audi-Logo-Neu-02-2019-brandtreeIndexSmall-44d434cb-1430808.jpg
2 VW-Konzern bentley https://imgr1.auto-motor-und-sport.de/Bentley-Logo-brandtreeIndexSmall-41a61902-702423.jpg
3 VW-Konzern bugatti https://imgr1.auto-motor-und-sport.de/Bugatti-Logo-brandtreeIndexSmall-252888c-946200.jpg
4 VW-Konzern lamborghini https://imgr1.auto-motor-und-sport.de/Lamborghini-Logo-brandtreeIndexSmall-df88a8a3-945695.jpg
Upvotes: 0
Reputation: 8033
You can use regex=True
within the str.contains
method & even a partial name should match. For example as below
df['image_url'] = np.where(df['brand'].str.contains('au' , regex=True), 'logo_url', 'other')
Output I just used text logo_url
instead of the url.
company brand image_url
VW-Konzern volkswagen other
VW-Konzern audi logo_url
VW-Konzern bentley other
VW-Konzern bugatti other
VW-Konzern lamborghini other
Upvotes: 1