Reputation: 61
I'm trying to use data from US treasury website https://home.treasury.gov/resource-center/data-chart-center/interest-rates/pages/xml?data=daily_treasury_yield_curve&field_tdr_date_value=all&page=24
df = pd.read_xml\
(path_or_buffer="https://home.treasury.gov/resource-center/data-chart-center/interest-rates/pages/xml?data=daily_treasury_yield_curve&field_tdr_date_value=all&page=24",\
xpath="//m:properties"
)
I think I need to use namespace for read_xml, but I'm not sure how. I want to get a dataframe such that for each row, it contains the timestamp, and the BC_1MONTH, BC_3MONTH, etc. data
Upvotes: 0
Views: 453
Reputation: 120479
You can manually specify the namespace:
NAMESPACES = {'m': 'http://schemas.microsoft.com/ado/2007/08/dataservices/metadata'}
df = pd.read_xml(your_url, xpath='//m:properties', namespaces=NAMESPACES)
Output:
>>> df
Id NEW_DATE BC_1MONTH BC_3MONTH BC_6MONTH BC_1YEAR BC_2YEAR BC_3YEAR BC_5YEAR BC_7YEAR BC_10YEAR BC_20YEAR BC_30YEAR BC_30YEARDISPLAY BC_2MONTH
0 7202 2018-10-09T00:00:00 2.17 2.25 2.46 2.65 2.88 2.98 3.05 3.15 3.21 3.30 3.37 3.37 NaN
1 7203 2018-10-10T00:00:00 2.18 2.27 2.45 2.67 2.88 2.97 3.05 3.15 3.22 3.33 3.39 3.39 NaN
2 7204 2018-10-11T00:00:00 2.14 2.27 2.44 2.66 2.85 2.94 3.00 3.09 3.14 3.25 3.32 3.32 NaN
3 7205 2018-10-12T00:00:00 2.14 2.28 2.44 2.66 2.85 2.93 3.00 3.09 3.15 3.25 3.32 3.32 NaN
4 7206 2018-10-15T00:00:00 2.17 2.31 2.47 2.67 2.85 2.94 3.01 3.10 3.16 3.27 3.34 3.34 NaN
.. ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
295 7497 2019-12-16T00:00:00 1.57 1.57 1.58 1.54 1.65 1.67 1.72 1.82 1.89 2.17 2.30 2.30 1.57
296 7498 2019-12-17T00:00:00 1.56 1.56 1.58 1.53 1.63 1.66 1.71 1.82 1.89 2.18 2.31 2.31 1.56
297 7499 2019-12-18T00:00:00 1.56 1.56 1.58 1.54 1.63 1.67 1.74 1.86 1.92 2.22 2.35 2.35 1.57
298 7500 2019-12-19T00:00:00 1.54 1.57 1.57 1.52 1.62 1.65 1.73 1.84 1.92 2.21 2.35 2.35 1.58
299 7501 2019-12-20T00:00:00 1.57 1.58 1.58 1.52 1.63 1.67 1.73 1.84 1.92 2.21 2.34 2.34 1.59
[300 rows x 15 columns]
Upvotes: 2