Paul

Reputation: 255

How do I load specific rows from a .txt file in Python?

Say I have a .txt file with many rows and columns of data, and a list containing integer values. How would I load only the rows of the text file whose row numbers match the integers in the list?

To illustrate, say I have a list of integers:

a = [1,3,5]

How would I read only rows 1, 3 and 5 from a text file into an array?

The loadtxt routine in numpy lets you both skip rows and use particular columns. But I can't seem to find a way to do something along the lines of (ignoring incorrect syntax):

new_array = np.loadtxt('data.txt', userows=a, unpack='true')

Thank you.

Upvotes: 6

Views: 25241

Answers (5)

Shahbaz Khan

Reputation: 76

You can stick to using numpy's loadtxt method, except that you'll need to pass a generator object to the function instead of the file path.

First, define a generator that accepts a file path and row indices and yields only the lines at those (0-based) indices:

def generate_specific_rows(filePath, userows=()):
    # Yield only the lines whose 0-based index appears in userows.
    with open(filePath) as f:
        for i, line in enumerate(f):
            if i in userows:
                yield line

Now you can create a generator object and pass it to the loadtxt method:

import numpy as np

a = [1,3,5]
gen = generate_specific_rows('data.txt', userows=a)
new_array = np.loadtxt(gen, unpack=True)
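
For large files, a possible variation on the generator above is sketched here. It assumes 0-based row indices, converts them to a set for fast membership tests, and stops reading once the largest requested row has been passed. The name iter_selected_rows and the sample file contents are made up for illustration; they are not part of the original answer.

```python
import os
import tempfile


def iter_selected_rows(path, wanted):
    """Yield only the lines of `path` whose 0-based index is in `wanted`."""
    wanted = set(wanted)          # constant-time membership tests
    if not wanted:
        return
    last = max(wanted)
    with open(path) as f:
        for i, line in enumerate(f):
            if i > last:          # nothing left to find, stop reading early
                break
            if i in wanted:
                yield line


# Small demonstration on a throwaway file (contents are made up).
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("".join(f"{n} {n * 10}\n" for n in range(8)))

selected = list(iter_selected_rows(tmp.name, [1, 3, 5]))
os.remove(tmp.name)
print(selected)
```

Since np.loadtxt accepts a generator of lines, iter_selected_rows(...) should be usable in place of generate_specific_rows in the call above.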

Upvotes: 3

Jongsu Liam Kim

Reputation: 737

Use the csv module and file.xreadlines().

  • csv module: implements classes to read and write tabular data in CSV format

  • file.xreadlines() (Python 2 only): returns the same thing as iter(f), i.e. an iterator over the lines of the file. Deprecated since version 2.3: use for line in file instead.

Upvotes: 0

Evgenia Galytska

Reputation: 161

I would suggest using line.split() instead of line.strip(). line.split() returns a list of the fields on the line, which can easily be converted to a numpy.array with np.asarray.
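
A minimal sketch of that idea, assuming whitespace-separated numeric columns and 0-based row indices. The helper name rows_to_floats and the sample lines are hypothetical, and the example stops at plain Python lists so that it runs without numpy:

```python
def rows_to_floats(lines, wanted):
    """Split the selected lines on whitespace and convert each field to float."""
    wanted = set(wanted)
    return [
        [float(tok) for tok in line.split()]   # split() also drops the newline
        for i, line in enumerate(lines)
        if i in wanted
    ]


# Made-up sample data standing in for the lines of a text file.
sample = ["0.0 1.0 2.0\n", "3.0 4.0 5.0\n", "6.0 7.0 8.0\n", "9.0 10.0 11.0\n"]
rows = rows_to_floats(sample, [1, 3])
print(rows)
```

Passing rows to np.asarray would then give a 2-D array with one row per selected line. An open file object can be passed as lines, since iterating over a file yields its lines.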

Upvotes: 0

dawg

Reputation: 103834

Given this file:

1,2,3
4,5,6
7,8,9
10,11,12
13,14,15
16,17,18
19,20,21

You can use the csv module to get the desired np array:

import csv
import numpy as np

desired=[1,3,5]
with open('/tmp/test.csv', 'r') as fin:
    reader=csv.reader(fin)
    result=[[int(s) for s in row] for i,row in enumerate(reader) if i in desired]

print(np.array(result))   

Prints:

[[ 4  5  6]
 [10 11 12]
 [16 17 18]]

Upvotes: 5

Fredrik Pihl

Reputation: 45662

Just to expand on my comment

$ cat file.txt
line 0
line 1
line 2
line 3
line 4
line 5
line 6
line 7
line 8
line 9
line 10

Python:

#!/usr/bin/env python

a = [1, 4, 8]

with open('file.txt') as fd:
    for n, line in enumerate(fd):
        if n in a:
            print(line.strip())

output:

$ ./l.py 
line 1
line 4
line 8

Upvotes: 4
