user6936420

Reading and slicing a 2D list in Python

I have a file called test.txt that looks like this:

John,19,7.5 Mary,22,9.8 Daniel,45,7.2 Hubert,92,10.0 Guy,28,9.5

I want to extract columns 2 to 4:

grades = np.genfromtxt(r'\test\test.txt',
                       delimiter=','
                       )

x = grades[:,1]
y = grades[:,2]
z = grades[:,3]

The interpreter raises IndexError: too many indices for array, but my slicing looks correct to me.

What's the problem with that?

Upvotes: 0

Views: 701

Answers (2)

NaN

Reputation: 2322

It is better to specify a data type when reading the file, so that you get the full benefit of numpy's structured arrays. For example:

import numpy as np
in_file = 'c:/data/csv.txt'
dt = [('Name', 'U10'), ('Age', 'i8'), ('Grade','f8')]
a = np.genfromtxt(in_file, dtype=dt, delimiter=",")

This yields a structured array with a per-column data type (dtype). Each field can be accessed by name, and standard numpy methods can be applied to it.

>>> a
array([('John', 19, 7.5), ('Mary', 22, 9.8), ('Daniel', 45, 7.2),
       ('Hubert', 92, 10.0), ('Guy', 28, 9.5)], 
      dtype=[('Name', '<U10'), ('Age', '<i8'), ('Grade', '<f8')])
>>> a['Grade'].mean()
8.8000000000000007
>>> a['Age'].max()
92

You can also view the data as a recarray if you prefer accessing fields via dot notation, as in the following.

>>> b = a.view(np.recarray)
>>> b.Grade.mean()
8.8000000000000007
>>> b.Age.min()
19

If you do this type of work a lot, consider Pandas, which provides a gentler interface and access to numpy arrays with mixed data types.
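As a rough sketch of the Pandas route, assuming the file holds one comma-separated record per line and has no header row (the column names and the in-memory sample below are illustrative stand-ins, not taken from the original file):

```python
import io

import pandas as pd

# Hypothetical in-memory stand-in for test.txt, assuming one record per line.
sample = io.StringIO(
    "John,19,7.5\n"
    "Mary,22,9.8\n"
    "Daniel,45,7.2\n"
    "Hubert,92,10.0\n"
    "Guy,28,9.5\n"
)

# header=None: the data has no header row; names= supplies the column labels.
df = pd.read_csv(sample, header=None, names=["Name", "Age", "Grade"])

ages = df["Age"]      # integer column, selected by name
grades = df["Grade"]  # float column, selected by name

print(grades.mean())
print(ages.max())
```

Selecting columns by label avoids positional slicing altogether, which may be easier to read when the columns have mixed types.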

Upvotes: 0

tasosxak

Reputation: 109

import re

# x: the names, y: the integers, z: the floating-point numbers
x, y, z = [], [], []

# open() replaces the Python 2-only file() builtin
with open("test.txt") as the_file:
    for line in the_file:
        # Expect: name, integer, decimal number, separated by commas
        match = re.match(r'(\w+),(\d+),(\d+\.\d+)', line)
        if match:
            x.append(match.group(1))
            y.append(int(match.group(2)))
            z.append(float(match.group(3)))

print(x)
print(y)
print(z)

This assumes that the first number is an integer and the second is a decimal.

If that is not the case, the regular expression can be changed accordingly.
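For instance, here is a minimal sketch of a more permissive pattern, assuming either number may be written with or without a decimal part (the helper name parse_line and the sample lines are made up for illustration):

```python
import re

# Each numeric field: digits with an optional fractional part, e.g. "19" or "7.5".
NUMBER = r'\d+(?:\.\d+)?'
PATTERN = re.compile(r'(\w+),(' + NUMBER + r'),(' + NUMBER + r')')


def parse_line(line):
    """Return (name, first_number, second_number), or None if the line does not match."""
    match = PATTERN.match(line)
    if match is None:
        return None
    # Both numbers are converted to float, since either may carry a decimal part.
    return match.group(1), float(match.group(2)), float(match.group(3))


print(parse_line("John,19,7.5"))
print(parse_line("Hubert,92,10"))
print(parse_line("not a record"))
```

Returning None for non-matching lines lets the caller decide whether to skip or report them.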

Upvotes: 1
