chas

Reputation: 1645

read value from file to variable in python

I have a text file with the contents shown below. I want to read the value in the first column of the 8th row, i.e. 226, into a variable using a function in Python. Could someone help me do this?

## net.sf.picard.metrics.StringHeader
# net.sf.picard.analysis.CollectInsertSizeMetrics 
## net.sf.picard.metrics.StringHeader
# Started on: Mon Sep 16 22:48:21 EEST 2013

## METRICS CLASS        net.sf.picard.analysis.InsertSizeMetrics
MEDIAN_INSERT_SIZE      MEDIAN_ABSOLUTE_DEVIATION       MIN_INSERT_SIZE MAX_INSERT_SIZE       
226     41      2       121947929       235.101052      64.322693       43832988
FR      17      33      49      65      83      103     127     155     205     397 

Upvotes: 0

Views: 1244

Answers (1)

abarnert

Reputation: 365597

Your file is not quite a CSV/TSV file, so using the csv module will probably end up being as tricky as parsing it manually in this case. So let's just do that:

with open(filename) as f:
    for i, row in enumerate(f):
        if i == 7: # 8th row
            columns = row.split()
            value = columns[0] # 1st column
            break

This has the advantage that we're only reading and parsing the first 8 lines rather than the entire file.
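
Since the question asks for a function, the loop above could be wrapped roughly like this. This is only a sketch: the name `read_cell` and its parameters are made up for illustration, and it assumes whitespace-separated columns and zero-based indexes.

```python
def read_cell(path, row_index=7, col_index=0):
    """Return one whitespace-separated field from one line of a text file.

    Both indexes are zero-based, so row_index=7 means the 8th line.
    Returns None if the file has fewer lines than requested.
    """
    with open(path) as f:
        for i, line in enumerate(f):
            if i == row_index:
                # Returning here stops the loop, so later lines are never read.
                return line.split()[col_index]
    return None
```

The result is a string, so if you need a number you would convert it yourself, e.g. `int(read_cell(path))`.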


If you understand iterables, I find this version, which does exactly the same thing, much simpler:

import more_itertools  # third-party: pip install more-itertools

with open(filename) as f:
    value = more_itertools.nth(f, 7).split()[0]  # index 7 = 8th row

I used the third-party more-itertools module for simplicity. If you don't want to install it, nth is defined in the recipes in the documentation for the standard library itertools module, so you can just copy and paste it like any other recipe:

import itertools

def nth(iterable, n, default=None):
    "Returns the nth item or a default value"
    return next(itertools.islice(iterable, n, None), default)

Or you could just inline it into a single more complicated expression:

import itertools

with open(filename) as f:
    value = next(itertools.islice(f, 7, None)).split()[0]

(Personally, I find that a bit less readable; it's like saying "the first row of all the rows from #7 to the end" instead of just saying "row #7". But some people don't like to define lots of trivial functions.)


I'd probably wrap this as a function (just return … instead of value = … and break, depending on which version you use):

import more_itertools

def get_row_col(filename, row, col):
    # row and col are zero-based
    with open(filename) as f:
        return more_itertools.nth(f, row).split()[col]

value = get_row_col(filename, 7, 0)
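
One thing to watch: if the file has fewer lines than `row + 1`, `nth` returns `None` and the `.split()` call fails with an `AttributeError`. A variant that uses only the standard library and raises clearer errors might look like this. The name `get_row_col_checked` and the choice of `ValueError` are assumptions for illustration; `row` and `col` are taken to be non-negative and zero-based.

```python
import itertools


def get_row_col_checked(filename, row, col):
    """Like get_row_col, but with explicit errors for short input.

    row and col are zero-based and expected to be non-negative.
    """
    with open(filename) as f:
        # islice skips the first `row` lines; next() takes the one after them,
        # or None if the file ends before that line.
        line = next(itertools.islice(f, row, row + 1), None)
    if line is None:
        raise ValueError(f"{filename!r} has no line at index {row}")
    columns = line.split()
    if col >= len(columns):
        raise ValueError(f"line {row} of {filename!r} has no column {col}")
    return columns[col]
```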

Another way to get a single line from a file is the linecache module. Note that it reads the whole file on first access and caches the lines, and that it numbers lines starting from 1:

import linecache

def get_row_col(filename, row, col):
    # linecache numbers lines from 1, so shift the zero-based row index
    line = linecache.getline(filename, row + 1)
    return line.split()[col]

This will be a lot more efficient if you call it many times for different rows of the same file, because the file is only read once and later lookups come from the cache.
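
As a rough sketch of how repeated lookups could be handled, here is a version that also reports a missing line. `linecache.getline` returns an empty string when the file or line does not exist. The name `get_row_col_cached` is made up for this example, and `row` and `col` are assumed to be zero-based.

```python
import linecache


def get_row_col_cached(filename, row, col):
    """Zero-based row/col lookup backed by linecache.

    linecache.getline() counts lines from 1, so the zero-based row is
    shifted by one. It returns '' for a missing file or line.
    """
    line = linecache.getline(filename, row + 1)
    if not line:
        raise ValueError(f"{filename!r} has no line at index {row}")
    return line.split()[col]
```

If the file may change on disk between calls, `linecache.checkcache(filename)` drops a stale cached copy so the next lookup re-reads it.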

Upvotes: 1
