Reputation: 3
I have searched a lot, but haven't found an answer to this.
I am trying to pipe a flat file of data into Python and get it into a form I can run analysis on (for instance, a t-test).
First, I created a simple pipe delimited flat file:
1|2
3|4
4|5
1|6
2|7
3|8
8|9
and saved it as "simpledata".
Then I created a script in nano:
#!/usr/bin/env python
import sys
from scipy import stats
A = sys.stdin.read()
print A
paired_sample = stats.ttest_rel(A[:,0],A[:,1])
print "The t-statistic is %.3f and the p-value is %.3f." % paired_sample
Then I save the script as pairedttest.sh and run it as
cat simpledata | pairedttest.sh
The error I get is
TypeError: string indices must be integers, not tuple
Thanks for your help in advance
Upvotes: 0
Views: 890
Reputation: 94951
Are you trying to call this?
paired_sample = stats.ttest_rel([1,3,4,1,2,3,8], [2,4,5,6,7,8,9])
If so, you can't do it that way. A is just a string when you read it from stdin, so slicing it like A[:,0] raises the TypeError you're seeing. You need to build the two lists from the string yourself. The most obvious way is like this:
left = []
right = []
for line in A.splitlines():
    l, r = line.split("|")
    left.append(int(l))
    right.append(int(r))
print left
print right
This will output:
[1, 3, 4, 1, 2, 3, 8]
[2, 4, 5, 6, 7, 8, 9]
So you can call stats.ttest_rel(left, right)
Or to be really clever and make a (nearly impossible to read) one-liner out of it:
z = zip(*[map(int, line.split("|")) for line in A.splitlines()])
After this, z holds:
[(1, 3, 4, 1, 2, 3, 8), (2, 4, 5, 6, 7, 8, 9)]
So you can call stats.ttest_rel(*z)
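If you want something a bit more forgiving than the loop above, here is a minimal sketch of the same parsing wrapped in a function. The name parse_pairs, the sep parameter, and the inline sample string are my own choices for illustration, not anything from scipy. It assumes exactly two fields per line, skips blank lines, and converts to float rather than int so decimal values would also parse.

```python
def parse_pairs(text, sep="|"):
    """Split lines like '1|2' into two parallel lists of floats."""
    left, right = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines, e.g. stray empty lines in the file
        l, r = line.split(sep)  # raises ValueError unless there are exactly two fields
        left.append(float(l))
        right.append(float(r))
    return left, right


# Stand-in for sys.stdin.read(), using the data from the question.
sample = "1|2\n3|4\n4|5\n1|6\n2|7\n3|8\n8|9\n"
left, right = parse_pairs(sample)
print(left)
print(right)
```

In the actual script you would pass sys.stdin.read() instead of sample, then call stats.ttest_rel(left, right) as before. The print calls are written with parentheses and a single argument so they behave the same on Python 2 and Python 3.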
Upvotes: 1