Reputation: 21
I am trying to write a pandas dataframe as CSV to Bluemix Object Storage from a DSX Python notebook. I first save the dataframe to a 'local' CSV file. I then have a routine that attempts to write the file to Object Storage. I get a 413 response - object too large. The file is only about 3MB. Here's my code, based on a JSON example I found here: http://datascience.ibm.com/blog/working-with-object-storage-in-data-science-experience-python-edition/
import requests
def put_file(credentials, local_file_name):
"""This function writes file content to Object Storage V3 """
url1 = ''.join(['https://identity.open.softlayer.com', '/v3/auth/tokens'])
data = {'auth': {'identity': {'methods': ['password'],
'password': {'user': {'name': credentials['name'],'domain': {'id': credentials['domain']},
'password': credentials['password']}}}}}
headers = {'Content-Type': 'text/csv'}
with open(local_file_name, 'rb') as f:
resp1 = requests.post(url=url1, data=f, headers=headers)
return resp1
Any help or pointers is much appreciated.
Upvotes: 2
Views: 1512
Reputation: 575
This code snippet from the tutorial worked fine for me (for a 12 MB file).
from io import BytesIO
import requests
import json
import pandas as pd
def put_file(credentials, local_file_name):
"""This functions returns a StringIO object containing
the file content from Bluemix Object Storage V3."""
f = open(local_file_name,'r')
my_data = f.read()
url1 = ''.join(['https://identity.open.softlayer.com', '/v3/auth/tokens'])
data = {'auth': {'identity': {'methods': ['password'],
'password': {'user': {'name': credentials['username'],'domain': {'id': credentials['domain_id']},
'password': credentials['password']}}}}}
headers1 = {'Content-Type': 'application/csv'}
resp1 = requests.post(url=url1, data=json.dumps(data), headers=headers1)
resp1_body = resp1.json()
for e1 in resp1_body['token']['catalog']:
if(e1['type']=='object-store'):
for e2 in e1['endpoints']:
if(e2['interface']=='public'and e2['region']=='dallas'):
url2 = ''.join([e2['url'],'/', credentials['container'], '/', local_file_name])
s_subject_token = resp1.headers['x-subject-token']
headers2 = {'X-Auth-Token': s_subject_token, 'accept': 'application/json'}
resp2 = requests.put(url=url2, headers=headers2, data = my_data )
print resp2
I created a random pandas dataframe using:
df = pd.DataFrame(np.random.randint(0,100,size=(1000000, 4)), columns=list('ABCD'))
saved it to csv
df.to_csv('myPandasData_1000000.csv',index=False)
and then put it to object store
put_file(credentials_1,'myPandasData_1000000.csv')
You can get the credentials_1
object by clicking insert to code -> Insert credentials
for any object in your object store.
Upvotes: 5