I have a problem.
I want to get the content of a CSV file from a URL and then parse it into an array. This is the code I have now:
import requests
import pandas as pd
import io
url="https://www.test.com/csv.php"
dataset = requests.get(url, verify=False).content
df = pd.read_csv(io.StringIO(dataset.decode('utf-8')))
data = []
for row in df: # each row is a list
    data.append(row)
But when I execute this code, I only get the first row of the CSV, and the values are wrapped in single quotes ('):
['1', '4', '0']
The CSV file looks like this:
1,4,0
0,1,1
1,1,0
0,1,1
1,1,0
0,3,1
1,1,0
0,3,1
1,1,0
And I am hoping to get an array like this:
[[1,4,0],
[0,1,1],
[1,1,0],
[0,1,1],
[1,1,0],
[0,3,1],
[1,1,0],
[0,3,1],
[1,1,0]]
What am I doing wrong?
EDIT:
Using df.values gives me this:
[[0. 1. 1.]
[1. 1. 0.]
[0. 1. 1.]
...
[1. 1. 0.]
[0. 1. 1.]
[1. 3. 0.]]
But that does not seem to be correct, because the first row has to be [1,4,0]. Also, I need a comma (,) as the separator.
Upvotes: 0
Views: 626
Reputation: 46
When you read a .csv file, the first row is treated as the header row by default. You need to specify that it is not, so add header=None to the read_csv call, like this:
df = pd.read_csv(io.StringIO(dataset.decode('utf-8')), header=None)
Also, the following is one way to get your desired output:
data = []
for r1, r2, r3 in df.values:
    data.append([r1, r2, r3])
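To see what header=None changes without touching the network, here is a minimal sketch that parses a shortened copy of the question's CSV from an in-memory string. The csv_text variable and the other names are made up for illustration and are not part of the original code.

```python
import io
import pandas as pd

# Shortened, in-memory copy of the CSV from the question (illustrative only).
csv_text = "1,4,0\n0,1,1\n1,1,0\n"

# Default behaviour: the first line is consumed as column labels.
df_default = pd.read_csv(io.StringIO(csv_text))
default_columns = list(df_default.columns)  # labels are strings taken from line 1
default_rows = len(df_default)              # line 1 is no longer counted as data

# header=None: every line is data; columns are labelled 0, 1, 2.
df = pd.read_csv(io.StringIO(csv_text), header=None)
all_rows = len(df)

print(default_columns, default_rows, all_rows)
```

Iterating directly over a DataFrame walks its column labels, which would explain why the loop in the question only produced the quoted values of the first line.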
Upvotes: 0
Reputation: 4757
No need to loop: .values returns the data as a 2-D NumPy array (a matrix).
import io
import requests
import pandas as pd
url="https://www.test.com/csv.php"
dataset = requests.get(url, verify=False).content
df = pd.read_csv(io.StringIO(dataset.decode('utf-8')), header=None, sep=',')
data=df.values
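Since the question asks for comma-separated rows, it may help to note that a NumPy array simply prints without commas, while a nested Python list prints with them. A rough sketch, again using an illustrative in-memory csv_text in place of the real URL:

```python
import io
import pandas as pd

# Shortened, in-memory copy of the CSV from the question (illustrative only).
csv_text = "1,4,0\n0,1,1\n1,1,0\n"
df = pd.read_csv(io.StringIO(csv_text), header=None)

array = df.values      # 2-D NumPy array; its printed form has no commas
data = array.tolist()  # nested list of plain Python ints

print(data)
```

If the values come back as floats, as in the question's EDIT, and nothing is missing in the data, calling df.astype(int) before .tolist() should convert them; that is an assumption about the real file, which is not shown in full here.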
Upvotes: 0
Reputation: 11
According to the pandas documentation, to iterate over rows you should use:
df.iterrows()
as indicated in http://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.iterrows.html#pandas.DataFrame.iterrows
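For illustration, iterrows() yields (index, row) pairs where each row is a Series, so a loop along these lines could build the list from the question. This is only a sketch: csv_text is an illustrative stand-in for the downloaded content, and header=None is assumed so that the first line is kept as data.

```python
import io
import pandas as pd

# Shortened, in-memory copy of the CSV from the question (illustrative only).
csv_text = "1,4,0\n0,1,1\n1,1,0\n"
df = pd.read_csv(io.StringIO(csv_text), header=None)

data = []
for index, row in df.iterrows():
    # row is a pandas Series; tolist() turns it into a plain Python list
    data.append(row.tolist())

print(data)
```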
Upvotes: 1