Reputation: 83
I am trying to do web scraping to automate information collection instead of doing it manually.
For a given stock, a function (get_info) returns some information in a dictionary.
Example of an output dictionary
For company A
dict_A = {'enterpriseRevenue': 1.264,
'profitMargins': -0.00124,
'enterpriseToEbitda': 28.328,
'sharesOutstanding': 3907579904,
'bookValue': 8.326}
For company B
dict_B = {'enterpriseRevenue': 2.789,
'profitMargins': 2.34,
'enterpriseToEbitda': 28.328,
'sharesOutstanding': 2874818942,
'bookValue': 4.189}
From a list of stocks, I would like to create a data frame containing all the items of the dictionary returned by the get_info function.
Desired algorithm in "natural language":
Create an empty data frame called df with 6 columns (the first column for the stock name, the rest for the dictionary items)
for s in list_of_stocks:
    toto = get_info(s)  # get the information for the stock, type(toto) == dict
    add a new row to df whose values correspond to toto
Example of desired output
Stock, enterpriseRevenue, profitMargins, enterpriseToEbitda, sharesOutstanding, bookValue
A, 1.264, -0.00124, 28.328, 3907579904, 8.326
B, 2.789, 2.34, 28.328, 2874818942, 4.189
Does anyone have any idea how to build this data frame?
Upvotes: 0
Views: 189
Reputation: 122
Try to use this.
# with the data structure like this, it might be easier to handle
data = {
    "stockA": {
        'enterpriseRevenue': 1.264,
        'profitMargins': -0.00124,
        'enterpriseToEbitda': 28.328,
        'sharesOutstanding': 3907579904,
        'bookValue': 8.326
    },
    "stockB": {
        'enterpriseRevenue': 2.789,
        'profitMargins': 2.34,
        'enterpriseToEbitda': 28.328,
        'sharesOutstanding': 2874818942,
        'bookValue': 4.189
    }
}
# getting the keys within the stockXY dict, which will be the column names
data_keys = data[list(data.keys())[0]].keys()  # raises IndexError when data dictionary is empty
column_captions = ["stock"] + list(data_keys)
print(", ".join(map(str, column_captions)))
for stock, stock_data in data.items():
    message = stock + ", " + ", ".join(map(str, stock_data.values()))
    print(message)
It seems like you want to save the data to a text file. If so, you might take a look at json: https://docs.python.org/3/library/json.html
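If saving to a text file is the goal, a minimal sketch with the standard json module could look like the following. It reuses the nested dictionary layout from above and only round-trips through a string; writing to disk would use json.dump with an open file instead, and any file name would be your own choice.

```python
import json

# Same nested layout as above: one inner dict per stock.
data = {
    "stockA": {
        "enterpriseRevenue": 1.264,
        "profitMargins": -0.00124,
        "enterpriseToEbitda": 28.328,
        "sharesOutstanding": 3907579904,
        "bookValue": 8.326,
    },
    "stockB": {
        "enterpriseRevenue": 2.789,
        "profitMargins": 2.34,
        "enterpriseToEbitda": 28.328,
        "sharesOutstanding": 2874818942,
        "bookValue": 4.189,
    },
}

# Serialize to a JSON string (json.dump(data, file_obj) would write to a file).
text = json.dumps(data, indent=2)

# Parse it back to check that nothing was lost on the way.
restored = json.loads(text)
print(restored["stockB"]["bookValue"])
```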
Upvotes: 1
Reputation: 5745
Did you try:
pd.DataFrame([get_info(d) for d in list_of_stocks])
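On its own, that one-liner gives one row per stock but no column with the stock name. A sketch of one way to add it is below. The get_info here is only a hypothetical stand-in that returns the sample values from the question, since the real scraping function is not shown.

```python
import pandas as pd

# Hypothetical stand-in for the asker's get_info; the real one scrapes the data.
_SAMPLE = {
    "A": {
        "enterpriseRevenue": 1.264,
        "profitMargins": -0.00124,
        "enterpriseToEbitda": 28.328,
        "sharesOutstanding": 3907579904,
        "bookValue": 8.326,
    },
    "B": {
        "enterpriseRevenue": 2.789,
        "profitMargins": 2.34,
        "enterpriseToEbitda": 28.328,
        "sharesOutstanding": 2874818942,
        "bookValue": 4.189,
    },
}


def get_info(stock):
    return _SAMPLE[stock]


list_of_stocks = ["A", "B"]

# One row per stock; the dictionary keys become the column names.
df = pd.DataFrame([get_info(s) for s in list_of_stocks])

# Put the stock name in the first column.
df.insert(0, "Stock", list_of_stocks)
print(df)
```

If you would rather have the stock name as the index than as a column, pd.DataFrame.from_dict({s: get_info(s) for s in list_of_stocks}, orient="index") is another option.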
Upvotes: 2