jds
jds

Reputation: 61

read xml from a url into pandas dataframe

I'm trying to use data from US treasury website https://home.treasury.gov/resource-center/data-chart-center/interest-rates/pages/xml?data=daily_treasury_yield_curve&field_tdr_date_value=all&page=24

df = pd.read_xml\
(path_or_buffer="https://home.treasury.gov/resource-center/data-chart-center/interest-rates/pages/xml?data=daily_treasury_yield_curve&field_tdr_date_value=all&page=24",\
 xpath="//m:properties"
)

I think I need to use namespace for read_xml, but I'm not sure how. I want to get a dataframe such that for each row, it contains the timestamp, and the BC_1MONTH, BC_3MONTH, etc. data

Screenshot of the data I want

Upvotes: 0

Views: 453

Answers (1)

Corralien
Corralien

Reputation: 120479

You can manually specify the namespace:

NAMESPACES = {'m': 'http://schemas.microsoft.com/ado/2007/08/dataservices/metadata'}
df = pd.read_xml(your_url, xpath='//m:properties', namespaces=NAMESPACES)

Output:

>>> df
       Id             NEW_DATE  BC_1MONTH  BC_3MONTH  BC_6MONTH  BC_1YEAR  BC_2YEAR  BC_3YEAR  BC_5YEAR  BC_7YEAR  BC_10YEAR  BC_20YEAR  BC_30YEAR  BC_30YEARDISPLAY  BC_2MONTH
0    7202  2018-10-09T00:00:00       2.17       2.25       2.46      2.65      2.88      2.98      3.05      3.15       3.21       3.30       3.37              3.37        NaN
1    7203  2018-10-10T00:00:00       2.18       2.27       2.45      2.67      2.88      2.97      3.05      3.15       3.22       3.33       3.39              3.39        NaN
2    7204  2018-10-11T00:00:00       2.14       2.27       2.44      2.66      2.85      2.94      3.00      3.09       3.14       3.25       3.32              3.32        NaN
3    7205  2018-10-12T00:00:00       2.14       2.28       2.44      2.66      2.85      2.93      3.00      3.09       3.15       3.25       3.32              3.32        NaN
4    7206  2018-10-15T00:00:00       2.17       2.31       2.47      2.67      2.85      2.94      3.01      3.10       3.16       3.27       3.34              3.34        NaN
..    ...                  ...        ...        ...        ...       ...       ...       ...       ...       ...        ...        ...        ...               ...        ...
295  7497  2019-12-16T00:00:00       1.57       1.57       1.58      1.54      1.65      1.67      1.72      1.82       1.89       2.17       2.30              2.30       1.57
296  7498  2019-12-17T00:00:00       1.56       1.56       1.58      1.53      1.63      1.66      1.71      1.82       1.89       2.18       2.31              2.31       1.56
297  7499  2019-12-18T00:00:00       1.56       1.56       1.58      1.54      1.63      1.67      1.74      1.86       1.92       2.22       2.35              2.35       1.57
298  7500  2019-12-19T00:00:00       1.54       1.57       1.57      1.52      1.62      1.65      1.73      1.84       1.92       2.21       2.35              2.35       1.58
299  7501  2019-12-20T00:00:00       1.57       1.58       1.58      1.52      1.63      1.67      1.73      1.84       1.92       2.21       2.34              2.34       1.59

[300 rows x 15 columns]

Upvotes: 2

Related Questions