Reputation: 73
I am trying to split a mix of str and unicode text in Python. The split has to be applied to the ResultSet object retrieved from a website. Using the code below, I am able to fetch the details (they are a user's profile details):
from bs4 import BeautifulSoup
import urllib2
import re
url = "http://www.mouthshut.com/vinay_beriwal"
profile_user = urllib2.urlopen(url)
profile_soup = BeautifulSoup(profile_user.read())
usr_dtls = profile_soup.find("div",id=re.compile("_divAboutMe")).find_all('p')
for dt in usr_dtls:
    usr_dtls = " ".join(dt.text.split())
    print(usr_dtls)
The output is as below:
i love yellow..
Name: Vinay Beriwal
Age: 39 years
Hometown: New Delhi, India
Country: India
Member since: Feb 11, 2016
What I need is five distinct variables (Name, Age, Hometown, Country, Member since), each holding the value that follows the ':' on the corresponding line.
Thanks
Upvotes: 0
Views: 310
Reputation: 989
You can use a dictionary to store name-value pairs. For example:
my_dict = {"Name":"Vinay","Age":21}
In my_dict, Name and Age are the keys of the dictionary. You can access their values like this:
print(my_dict["Name"])  # This will print Vinay
Also, it is better to use complete words for variable names. Applied to your code:
results = profile_soup.find("div", id=re.compile("_divAboutMe")).find_all('p')
user_data = {}  # dictionary initialization
for result in results:
    result = " ".join(result.text.split())
    try:
        # Split on the first ':' only, so a value may itself contain a colon
        var, value = result.split(':', 1)
        user_data[var.strip()] = value.strip()
    except ValueError:
        # Lines without a ':' (such as the free-text bio) are skipped
        pass
# If you print the user_data now
print(user_data)
'''
This is what it'll print (key order may vary)
{'Age': '39 years', 'Country': 'India', 'Hometown': 'New Delhi, India', 'Name': 'Vinay Beriwal', 'Member since': 'Feb 11, 2016'}
'''
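If you really want separate variables afterwards, you can pull them out of the dictionary by key. Below is a minimal sketch that runs without the website: the lines list is copied from the output in the question, and parse_profile_lines is a hypothetical helper name used only for illustration, not part of BeautifulSoup. It uses str.partition instead of split, which is one alternative way to split on the first ':' only.

```python
# Sample lines copied from the question's output; no network access needed.
lines = [
    "i love yellow..",
    "Name: Vinay Beriwal",
    "Age: 39 years",
    "Hometown: New Delhi, India",
    "Country: India",
    "Member since: Feb 11, 2016",
]


def parse_profile_lines(lines):
    """Hypothetical helper: turn 'key: value' lines into a dict."""
    data = {}
    for line in lines:
        # partition splits on the first ':' only, so values may contain colons.
        key, sep, value = line.partition(":")
        if sep:  # sep is empty when the line has no ':' (e.g. the free-text bio)
            data[key.strip()] = value.strip()
    return data


user_data = parse_profile_lines(lines)

# Five distinct variables, one per field.
name = user_data["Name"]
age = user_data["Age"]
hometown = user_data["Hometown"]
country = user_data["Country"]
member_since = user_data["Member since"]
print(name)
```

Keeping the dictionary around is usually more convenient than five loose variables, because a profile with a missing field can then be handled with user_data.get("Age") instead of raising a KeyError.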
Upvotes: 2
Reputation: 19806
You can use a dictionary to store your data:
my_dict = {}
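for dt in usr_dtls:
    item = " ".join(dt.text.split())
    # Skip lines without a "key: value" shape, such as the free-text bio
    if ':' in item:
        # Split on the first ':' only, so a value may itself contain a colon
        k, v = item.split(':', 1)
        my_dict[k.strip()] = v.strip()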
Note: You should not assign to usr_dtls inside your for loop, because that would overwrite your original usr_dtls.
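To illustrate what goes wrong, here is a small sketch using a plain list of strings as a stand-in for the ResultSet (the seen list is only there for the demonstration). The loop itself still visits every item, because Python evaluates the iterable once when the loop starts, but after the loop the name no longer refers to the list.

```python
usr_dtls = ["Name: Vinay Beriwal", "Age: 39 years"]  # stand-in for the ResultSet
seen = []

for dt in usr_dtls:
    usr_dtls = dt.upper()  # rebinding the name, as in the question's loop
    seen.append(dt)

# The loop visited both items, but usr_dtls is now the last string, not the list.
print(seen)
print(usr_dtls)
```

Using a different name inside the loop (item, as above) avoids this and keeps the original list available for later use.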
Upvotes: 0