Reputation: 2261
I have the following code:
input=File.open("lala.txt","r")
genes=[]
input.each_line{|li|
  keys=li.split("\t")
  length=keys.length
  puts(keys[length-2])
  puts(keys[length-2].to_f)
  if (keys[0]["-"].class==NilClass && keys[1]["-"].class==NilClass && (keys[length-2]).to_f>0.98)
    genes.push(keys[0])
    genes.push(keys[1])
  end
}
Input file:
1053_at/RFC2 203696_s_at/RFC2 0.9031699692435061
117_at/HSPA6 1553158_at/C3orf34 0.9079515773059148
117_at/HSPA6 1553513_at/VNN3 0.9237382047518812
117_at/HSPA6 1553723_at/GPR97 0.9367168572635286
117_at/HSPA6 1557852_at/--- 0.9177916032275163
117_at/HSPA6 1558525_at/--- 0.9229865774037962
117_at/HSPA6 1562481_at/--- 0.9109034368848434
117_at/HSPA6 1569385_s_at/TET2 0.9187904542249753
117_at/HSPA6 1569830_at/PTPRC 0.900051189462974
117_at/HSPA6 1569955_at/--- 0.9028606652628463
117_at/HSPA6 201393_s_at/IGF2R 0.9090699277161238
My problem is the following:
I want to check whether the number in each row is greater than 0.98.
If I just write keys[length-2]>0.98
it raises an error saying that I am comparing a String with a Float. OK, so let's convert the String to a Float with (keys[length-2]).to_f . It does convert, BUT it destroys the number: I get 0.0
output:
0.9031699692435061
0.0
0.9079515773059148
0.0
0.9237382047518812
0.0
0.9367168572635286
0.0
0.9177916032275163
0.0
0.9229865774037962
0.0
0.9109034368848434
0.0
0.9187904542249753
0.0
0.900051189462974
0.0
0.9028606652628463
0.0
0.9090699277161238
0.0
0.9002336615360215
0.0
What is wrong here? (Ruby 1.9.3 on Linux.) Thanks in advance.
Upvotes: 0
Views: 182
Reputation: 84182
Judging by all the null bytes in there, what you've got is UTF-16 text that you are interpreting as UTF-8 or ASCII. Assuming you are on Ruby 1.9, you can get Ruby to do the transcoding for you by opening the file with
File.open("lala.txt","rb:UTF-16:US-ASCII")
which reads the file as UTF-16 and converts the text to US-ASCII, the internal encoding named in the mode string.
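To see why to_f comes back as 0.0 while puts still appears to print the right number, here is a minimal sketch that fakes the situation in memory. It assumes the file is UTF-16LE without a BOM (the real file may differ), and the variable names are only for illustration. The terminal presumably just swallows the null bytes, which is why the printed string looks fine.

```ruby
# Simulate what a UTF-16LE file hands back when it is read as plain bytes.
# (UTF-16LE is an assumption; the real file may be big-endian or carry a BOM.)
raw = "0.9031699692435061".encode("UTF-16LE").force_encoding("BINARY")

# Every ASCII character is followed by a NUL byte, so parsing stops after "0".
broken = raw.to_f

# Re-tagging the bytes as UTF-16LE and transcoding to UTF-8 restores the text.
fixed = raw.dup.force_encoding("UTF-16LE").encode("UTF-8").to_f

puts raw.inspect  # shows the interleaved \x00 bytes
puts broken
puts fixed
```

Once the text is transcoded, the comparison against 0.98 works on the real value.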
Upvotes: 1
Reputation: 160611
Your code could be written more Ruby-like, and take advantage of a well-tested wheel:
require 'csv'
genes = []
CSV.foreach("lala.txt", :col_sep => "\t") do |row|
  puts row[-1]
  puts row[-1].to_f
  if (!row[0]["-"] && !row[1]["-"] && (row[-1].to_f > 0.98))
    genes << row[0]
    genes << row[1]
  end
end
puts genes
This is the output:
0.9031699692435061
0.9031699692435061
0.9079515773059148
0.9079515773059148
0.9237382047518812
0.9237382047518812
0.9367168572635286
0.9367168572635286
0.9177916032275163
0.9177916032275163
0.9229865774037962
0.9229865774037962
0.9109034368848434
0.9109034368848434
0.9187904542249753
0.9187904542249753
0.900051189462974
0.900051189462974
0.9028606652628463
0.9028606652628463
0.9090699277161238
0.9090699277161238
And genes is empty because no values in the last column are > 0.98.
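To check that the filter does pick rows up once a score clears the threshold, here is a small sketch that parses an in-memory string instead of lala.txt. Two rows are copied from the question; the third row (probe_a/GENE_A, probe_b/GENE_B, 0.99) is made up purely so that something matches.

```ruby
require 'csv'

# Two rows from the question plus one made-up row (GENE_A/GENE_B, 0.99)
# that exists only to show a match; it is not real data.
data = "117_at/HSPA6\t1553513_at/VNN3\t0.9237382047518812\n" \
       "117_at/HSPA6\t1557852_at/---\t0.9177916032275163\n" \
       "probe_a/GENE_A\tprobe_b/GENE_B\t0.99\n"

genes = []
CSV.parse(data, :col_sep => "\t") do |row|
  # String#include? states the "-" test more directly than row[0]["-"].
  next if row[0].include?("-") || row[1].include?("-")
  genes << row[0] << row[1] if row[-1].to_f > 0.98
end

puts genes
```

Only the made-up row ends up in genes; the two real rows are below 0.98, and one of them also contains a "-".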
Upvotes: 0
Reputation: 34061
I think you've got some weird whitespace issues. I think if you split on /\s+/ and just use keys.last you should be good:
input=File.open("lala.txt","r")
genes=[]
input.each_line{|li|
  keys=li.split(/\s+/)
  puts(keys.last)
  puts(keys.last.to_f)
  if (keys[0]["-"].class==NilClass && keys[1]["-"].class==NilClass && (keys.last).to_f>0.98)
    genes.push(keys[0])
    genes.push(keys[1])
  end
}
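A quick way to see the difference between the two splits is to try them on a single line. This is only a sketch: the trailing tab before the newline is an assumption about what the file might contain, not something confirmed by the question.

```ruby
# One row from the question, with a hypothetical trailing tab before the newline.
line = "117_at/HSPA6\t1553513_at/VNN3\t0.9237382047518812\t\n"

# Splitting on "\t" leaves the newline as its own last field.
tab_keys = line.split("\t")

# Splitting on runs of whitespace drops the trailing tab and newline.
keys = line.split(/\s+/)
score = keys.last.to_f

puts tab_keys.inspect
puts keys.inspect
puts score
```

With /\s+/ the score is always keys.last, so there is no need to count back from the end of the array.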
Upvotes: 0