Reputation: 797
I want to extract the outer tag's id, "AAAAA", together with the ids pp-0, pp-1, pp-2, pp-3, pp-4, etc. and their corresponding values 1000, 1001, 1002, 1003, 1004, etc., into a Python dictionary. The top-level key should be 'd', and its value should be a single string that starts with "AAAAA@X##" followed by all of the id/value pairs serialized as a dictionary.
Coding
from bs4 import BeautifulSoup
html = '''<span id="AAAAA" style="display:none">
<span id="pp-0" style="display:none">1000</span>
<span id="pp-1" style="display:none">1001</span>
<span id="pp-2" style="display:none">1002</span>
<span id="pp-3" style="display:none">1003</span>
<span id="pp-4" style="display:none">1004</span>
<span id="pp-5" style="display:none">1005</span>
<span id="pp-6" style="display:none">1006</span>
<span id="pp-7" style="display:none">1007</span>
<span id="pp-8" style="display:none">1008</span>
<span id="pp-9" style="display:none">1009</span>
<span id="pp-10" style="display:none">1010</span>
<span id="pp-11" style="display:none">1011</span>
<span id="pp-12" style="display:none">1012</span>
<span id="pp-13" style="display:none">1013</span>
<span id="pp-14" style="display:none">1014</span>
<span id="pp-17" style="display:none">1015</span>
<span id="pp-27" style="display:none">1016</span>
</span>'''
soup = BeautifulSoup(html, 'html.parser')
elements = soup.find_all('span')
Wrong Output
[<span id="AAAAA" style="display:none">
<span id="pp-0" style="display:none">1000</span>
<span id="pp-1" style="display:none">1001</span>
<span id="pp-2" style="display:none">1002</span>
<span id="pp-3" style="display:none">1003</span>
<span id="pp-4" style="display:none">1004</span>
<span id="pp-5" style="display:none">1005</span>
<span id="pp-6" style="display:none">1006</span>
<span id="pp-7" style="display:none">1007</span>
<span id="pp-8" style="display:none">1008</span>
<span id="pp-9" style="display:none">1009</span>
<span id="pp-10" style="display:none">1010</span>
<span id="pp-11" style="display:none">1011</span>
<span id="pp-12" style="display:none">1012</span>
<span id="pp-13" style="display:none">1013</span>
<span id="pp-14" style="display:none">1014</span>
<span id="pp-17" style="display:none">1015</span>
<span id="pp-27" style="display:none">1016</span>.....]
Expected Output (in coding level)
{'d': 'AAAAA@X##{"pp-0": 1000, "pp-1":1001, "pp-2":1002, "pp-3": 1003, "pp-4": 1004, "pp-5": 1005, "pp-6": 1006, "pp-7": 1007, "pp-8": 1008, "pp-9": 1009, "pp-10": 1010, "pp-11": 1011, "pp-12": 1012, "pp-13": 1013, "pp-14": 1014, "pp-17": 1015, "pp-27": 1016}'}
Upvotes: 1
Views: 315
Reputation: 3400
I created a separate dictionary for the inner data and filled it with each span's id and text from the HTML:
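soup = BeautifulSoup(html, 'html.parser')
span = soup.find_all("span")
x = {}
other_dict = {}
# The first span is the outer container; its id becomes the prefix.
x['d'] = span[0].get("id")
# The remaining spans are the pp-N entries; int() gives the numeric values shown in the expected output.
for i in span[1:]:
    other_dict[i.get("id")] = int(i.get_text())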
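Now that there are two dictionaries, convert other_dict to a string with the json module and concatenate it to the id, with the "@X##" separator from the expected output in between:
import json
data = json.dumps(other_dict)
# Join the outer id, the "@X##" separator, and the JSON text.
final = x['d'] + "@X##" + data
x['d'] = final
print(x)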
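As a rough illustration of the string format itself, the sketch below packs an id and a dictionary into the "AAAAA@X##{...}" shape and splits it back apart. The names SEPARATOR, pack and unpack are hypothetical helpers, not part of the question or of any library, and the small values dict is a hand-written stand-in for the one built from the spans.

```python
import json

SEPARATOR = "@X##"  # marker taken from the expected output in the question


def pack(tag_id, values):
    """Join the outer id and the JSON-encoded inner dict into one string."""
    return tag_id + SEPARATOR + json.dumps(values)


def unpack(packed):
    """Split a packed string back into (outer id, inner dict)."""
    tag_id, sep, payload = packed.partition(SEPARATOR)
    if not sep:
        raise ValueError("separator not found in packed string")
    return tag_id, json.loads(payload)


# Hand-written stand-in for the dict built from the <span> elements.
values = {"pp-0": 1000, "pp-1": 1001, "pp-27": 1016}

result = {"d": pack("AAAAA", values)}
print(result)

# Reading the stored string back recovers the id and the original dict.
tag_id, restored = unpack(result["d"])
print(tag_id, restored)
```

Because the inner part is plain JSON, json.loads can read it back later, which would not be guaranteed if the dictionary were converted with str() instead.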
Upvotes: 1