Minions

Reputation: 5467

Read a text file into a matrix in Python

I have a text file which contains m rows like the following:

0.4698537878,0.1361006627,0.2400000000,0.7209302326,0.0054816275,0.0116666667,1
0.5146649986,0.0449680289,0.4696969697,0.5596330275,0.0017155500,0.0033333333,0
0.4830107706,0.0684999306,0.3437500000,0.5600000000,0.0056351257,0.0116666667,0
0.4458490073,0.1175445834,0.2307692308,0.6212121212,0.0089169801,0.0200000000,0

I tried to read the file and copy it into a matrix like in the following code:

import string

file = open("datasets/train.txt",encoding='utf8')

for line in file.readlines():
    tmp = line.strip()
    tmp = tmp.split(",")
    idx = np.vstack(tmp)
    idy = np.hstack(tmp[12])

matrix = idx

I want to read the file as is into the matrix. For my sample data, the matrix shape should be (4,6) and idy should be (4,1), holding the last column (the labels).

Instead, it stacked the last line of the file vertically, like this:

0.4458490073,
0.1175445834,
0.2307692308,
0.6212121212,
0.0089169801,
0.0200000000,
0

Any help?

Upvotes: 1

Views: 2171

Answers (2)

jpp

Reputation: 164623

Since you are already using NumPy, this functionality is built in:

import numpy as np

arr = np.genfromtxt('file.csv', delimiter=',')

You can then separate the data from the labels (the last column) as follows:

data = arr[:, :-1]    # all columns except the last
labels = arr[:, -1:]  # last column, kept 2-D
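
As a quick check against the sample data in the question, here is a minimal sketch that feeds the four rows to np.genfromtxt through an in-memory io.StringIO buffer and prints the resulting shapes. The buffer is only a stand-in for the real file, and the names sample, features, and labels are illustrative rather than taken from the original post.

```python
import io

import numpy as np

# The four sample rows from the question, held in memory instead of a file
# (an assumption for illustration; a file path works the same way).
sample = io.StringIO(
    "0.4698537878,0.1361006627,0.2400000000,0.7209302326,0.0054816275,0.0116666667,1\n"
    "0.5146649986,0.0449680289,0.4696969697,0.5596330275,0.0017155500,0.0033333333,0\n"
    "0.4830107706,0.0684999306,0.3437500000,0.5600000000,0.0056351257,0.0116666667,0\n"
    "0.4458490073,0.1175445834,0.2307692308,0.6212121212,0.0089169801,0.0200000000,0\n"
)

arr = np.genfromtxt(sample, delimiter=",")  # one row per line, floats throughout
features = arr[:, :-1]                      # every column except the last
labels = arr[:, -1:].astype(int)            # last column, kept as a 2-D column

print(features.shape, labels.shape)  # (4, 6) (4, 1)
```

Slicing with `-1:` rather than `-1` is what keeps the labels as a (4,1) column instead of a flat (4,) array.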

Upvotes: 3

datawrestler

Reputation: 1567

This should get you the right shape, (4,6) for the idx variable and (4,1) for the labels:

import numpy as np

# read all lines; the with-block closes the file afterwards
with open('train.txt', 'r') as f:
    alllines = f.readlines()
# features, shape (4,6); dtype=float converts the strings to numbers
idx = np.matrix([line.strip().split(',')[0:6] for line in alllines], dtype=float)
# labels, reshaped to (4,1)
idy = np.matrix([line.strip().split(',')[6] for line in alllines], dtype=float).reshape(-1, 1)
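
If you would rather keep an explicit loop like the one in the question, the rows have to be collected across iterations. Reassigning idx inside the loop keeps only the result for the final line, and np.vstack on a flat list of values puts each value in its own row, which matches the vertical output shown in the question. Below is a minimal sketch, assuming every non-empty line has the same number of comma-separated fields and the label is the last one. parse_rows is a hypothetical helper name, not a NumPy function.

```python
import numpy as np


def parse_rows(lines):
    """Split comma-separated lines into (features, labels) arrays.

    Hypothetical helper: features has shape (m, n - 1) and labels (m, 1),
    where each of the m non-empty lines holds n fields.
    """
    rows = []
    for line in lines:
        line = line.strip()
        if not line:          # skip blank lines, e.g. a trailing newline
            continue
        rows.append([float(value) for value in line.split(",")])
    data = np.array(rows)     # build the 2-D array once, after the loop
    return data[:, :-1], data[:, -1:].astype(int)


# The four sample rows from the question, as they would come from readlines().
lines = [
    "0.4698537878,0.1361006627,0.2400000000,0.7209302326,0.0054816275,0.0116666667,1\n",
    "0.5146649986,0.0449680289,0.4696969697,0.5596330275,0.0017155500,0.0033333333,0\n",
    "0.4830107706,0.0684999306,0.3437500000,0.5600000000,0.0056351257,0.0116666667,0\n",
    "0.4458490073,0.1175445834,0.2307692308,0.6212121212,0.0089169801,0.0200000000,0\n",
]

idx, idy = parse_rows(lines)
print(idx.shape, idy.shape)  # (4, 6) (4, 1)
```

With a real file, the same helper could be called as `parse_rows(f)` inside a `with open(...) as f:` block, since iterating over a file object yields its lines.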

Upvotes: 1
