Reputation: 37
As mentioned in the previous questions, I am using Beautiful soup with python to retrieve weather data from a website.
Here's how the website looks like:
<channel>
<title>2 Hour Forecast</title>
<source>Meteorological Services Singapore</source>
<description>2 Hour Forecast</description>
<item>
<title>Nowcast Table</title>
<category>Singapore Weather Conditions</category>
<forecastIssue date="18-07-2016" time="03:30 PM"/>
<validTime>3.30 pm to 5.30 pm</validTime>
<weatherForecast>
<area forecast="TL" lat="1.37500000" lon="103.83900000" name="Ang Mo Kio"/>
<area forecast="SH" lat="1.32100000" lon="103.92400000" name="Bedok"/>
<area forecast="TL" lat="1.35077200" lon="103.83900000" name="Bishan"/>
<area forecast="CL" lat="1.30400000" lon="103.70100000" name="Boon Lay"/>
<area forecast="CL" lat="1.35300000" lon="103.75400000" name="Bukit Batok"/>
<area forecast="CL" lat="1.27700000" lon="103.81900000" name="Bukit Merah"/>`
<channel>
I managed to retrieve the information I need using these codes :
import requests
from bs4 import BeautifulSoup
import urllib3
#getting the ValidTime
r = requests.get('http://www.nea.gov.sg/api/WebAPI/?
dataset=2hr_nowcast&keyref=781CF461BB6606AD907750DFD1D07667C6E7C5141804F45D')
soup = BeautifulSoup(r.content, "xml")
time = soup.find('validTime').string
print "validTime: " + time
#getting the date
for currentdate in soup.find_all('item'):
element = currentdate.find('forecastIssue')
print "date: " + element['date']
#getting the time
for currentdate in soup.find_all('item'):
element = currentdate.find('forecastIssue')
print "time: " + element['time']
for area in soup.find('weatherForecast').find_all('area'):
area_attrs_li = [area.attrs for area in soup.find('weatherForecast').find_all('area')]
print area_attrs_li
Here are my results :
{'lat': u'1.34039000', 'lon': u'103.70500000', 'name': u'Jurong West',
'forecast': u'LR'}, {'lat': u'1.31200000', 'lon': u'103.86200000', 'name':
u'Kallang', 'forecast': u'LR'},
I'm not strong in Python and have been stuck at this for quite a while.
EDIT : I tried doing this :
f = open("C:\\scripts\\nea.csv" , 'wt')
try:
for area in area_attrs_li:
writer = csv.writer(f)
writer.writerow( (time, element['date'], element['time'], area_attrs_li))
finally:
f.close()
print open("C:/scripts/nea.csv", 'rt').read()
It worked however, I would like to split the area apart as the records are duplicates in the CSV :
Thank you.
Upvotes: 0
Views: 613
Reputation: 780
EDIT 1 -Topic:
You're missing escape characters:
C:\scripts>python neaweather.py
File "neaweather.py", line 30
writer.writerow( ('time', 'element['date']', 'element['time']', 'area_attrs_li') )
writer.writerow( ('time', 'element[\'date\']', 'element[\'time\']', 'area_attrs_li')
^
SyntaxError: invalid syntax
EDIT 2:
if you want to insert values:
writer.writerow( (time, element['date'], element['time'], area_attrs_li) )
EDIT 3:
to split the result to different lines:
for area in area_attrs_li:
writer.writerow( (time, element['date'], element['time'], area)
EDIT 4: The splitting is not correct at all, but it shall give a better understanding of how to parse and split data to change it for your needs. to split the area element again as you show in your image, you can parse it
for area in area_attrs_li:
# cut off the characters you don't need
area = area.replace('[','')
area = area.replace(']','')
area = area.replace('{','')
area = area.replace('}','')
# remove other characters
area = area.replace("u'","\"").replace("'","\"")
# split the string into a list
areaList = area.split(",")
# create your own csv-seperator
ownRowElement = ';'.join(areaList)
writer.writerow( (time, element['date'], element['time'], ownRowElement)
Offtopic: This works for me:
import csv
import json
x="""[
{'lat': u'1.34039000', 'lon': u'103.70500000', 'name': u'Jurong West','forecast': u'LR'}
]"""
jsontxt = json.loads(x.replace("u'","\"").replace("'","\""))
f = csv.writer(open("test.csv", "w+"))
# Write CSV Header, If you dont need that, remove this line
f.writerow(['lat', 'lon', 'name', 'forecast'])
for jsontext in jsontxt:
f.writerow([jsontext["lat"],
jsontext["lon"],
jsontext["name"],
jsontext["forecast"],
])
Upvotes: 1