Anastasia

Reputation: 874

NumPy loadtxt data type

I am trying to load a data set that looks like this:

Algeria,73.131000,6406.8166213983,0.1
Angola,51.093000,5519.1831786593,2
Argentina,75.901000,15741.0457726686,0.5
Armenia,74.241000,4748.9285847709,0.1

and so on. In the end I need only columns 1 and 2 (counting from 0); I don't need the country names or the last column. Essentially, I need to extract two matrices with dimensions n x 1. I know that I need to specify the data type:

data=np.loadtxt('file.txt',delimiter=',',dtype=[('f0',str),('f1',float),('f2',float),('f3',float)])

However, this produces a list of tuples,

array([('', 73.131, 6406.8166213983, 0.1),
       ('', 51.093, 5519.1831786593, 2.0),

instead of

array(['',73.131,6406.8166213983,0.1],
      ['',51.093, 5519.1831786593, 2.0],

Where is the mistake?

Upvotes: 5

Views: 34682

Answers (3)

user1301404

Reputation:

Check NumPy's documentation.

x, y = np.loadtxt('file.txt', delimiter=',', usecols=(1, 2), unpack=True)

The usecols parameter should get your job done.
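As a minimal sketch of how this applies to the data in the question: the snippet below feeds the four sample rows through io.StringIO (a stand-in for 'file.txt', used here only so the example runs on its own) and then reshapes each column to n x 1. The names sample, x_col and y_col are illustrative, not from the original post.

```python
import io
import numpy as np

# Sample rows copied from the question; io.StringIO stands in for 'file.txt'.
sample = io.StringIO(
    "Algeria,73.131000,6406.8166213983,0.1\n"
    "Angola,51.093000,5519.1831786593,2\n"
    "Argentina,75.901000,15741.0457726686,0.5\n"
    "Armenia,74.241000,4748.9285847709,0.1\n"
)

# usecols=(1, 2) skips the country name and the last column, so no
# string dtype is needed; unpack=True returns one 1-D array per column.
x, y = np.loadtxt(sample, delimiter=",", usecols=(1, 2), unpack=True)

# Turn each 1-D array of length n into an n x 1 column.
x_col = x.reshape(-1, 1)
y_col = y.reshape(-1, 1)
print(x_col.shape, y_col.shape)
```

Because the text column is never read, the default float dtype is enough and the result is an ordinary array rather than a structured one.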

Upvotes: 12

mamalos

Reputation: 97

Your "mistake" is that you set your own dtype, which makes np.loadtxt() return a structured array. If you don't want that dtype (though I see no reason why you wouldn't), you can use the usecols parameter of np.loadtxt(), together with skiprows if there are header lines to skip, to load ONLY the columns you want.

Your result will then be a NumPy array of shape (n, 2), where n is the number of rows, not the (n, 3) you expected.
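A small sketch of that shape, assuming the same four sample rows as in the question (io.StringIO replaces the file only to keep the example self-contained; the variable names are hypothetical):

```python
import io
import numpy as np

# Sample rows copied from the question; io.StringIO stands in for the file.
sample = io.StringIO(
    "Algeria,73.131000,6406.8166213983,0.1\n"
    "Angola,51.093000,5519.1831786593,2\n"
    "Argentina,75.901000,15741.0457726686,0.5\n"
    "Armenia,74.241000,4748.9285847709,0.1\n"
)

# Without unpack=True the selected columns come back as one 2-D float array.
data = np.loadtxt(sample, delimiter=",", usecols=(1, 2))
print(data.shape)

# Slicing with a range (0:1) instead of a plain index keeps two dimensions,
# so each piece is an n x 1 array.
first = data[:, 0:1]
second = data[:, 1:2]
print(first.shape, second.shape)
```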

Upvotes: 1

Lee

Reputation: 31040

If you just want the two numeric columns (columns 1 and 2), you could use genfromtxt:

import numpy as np
col1 = np.genfromtxt('yourfile.txt',usecols=(1),delimiter=',',dtype=None)
col2 = np.genfromtxt('yourfile.txt',usecols=(2),delimiter=',',dtype=None)

or both together:

np.genfromtxt('yourfile.txt',usecols=(1,2),delimiter=',',dtype=None)
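A runnable sketch of the same idea, under two assumptions that are not in the answer above: the sample rows are read from an in-memory string rather than 'yourfile.txt', and dtype=float is passed explicitly instead of dtype=None, since both selected columns are known to be numeric.

```python
import io
import numpy as np

# Sample rows copied from the question, kept as a string so a fresh
# file-like object can be created for each read.
text = (
    "Algeria,73.131000,6406.8166213983,0.1\n"
    "Angola,51.093000,5519.1831786593,2\n"
    "Argentina,75.901000,15741.0457726686,0.5\n"
    "Armenia,74.241000,4748.9285847709,0.1\n"
)

# Read a single column, then reshape it to n x 1.
col1 = np.genfromtxt(io.StringIO(text), usecols=1, delimiter=",", dtype=float)
col1 = col1.reshape(-1, 1)

# Read both numeric columns in one pass.
both = np.genfromtxt(io.StringIO(text), usecols=(1, 2), delimiter=",", dtype=float)
print(col1.shape, both.shape)
```

Reading both columns at once parses the file a single time, which is usually preferable to one call per column.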

Upvotes: 2
