Reputation: 847
This is workfile.txt
NC_001778
NC_005252
NC_004744
NC_003096
NC_005803
I want to read it into an array and keep only the strings, without spaces or line endings. This code does what I want on my laptop, but it's not working on the Linux desktop!
@nodes=<nodefile>;
chomp @nodes;
foreach my $el(@nodes){
chop ($el);
}
print Dumper @nodes;
#output: `bash-4.2$ perl main.pl
';AR1 = 'NC_000893
';AR2 = 'NC_001778
';AR3 = 'NC_005252
';AR4 = 'NC_004744
';AR5 = 'NC_003096
';AR6 = 'NC_005803
`
#hexdump -C workfile.txt |head -20
00000000 4e 43 5f 30 30 30 38 39 33 0d 0d 0a 4e 43 5f 30 |NC_000893...NC_0|
00000010 30 31 37 37 38 0d 0d 0a 4e 43 5f 30 30 35 32 35 |01778...NC_00525|
00000020 32 0d 0d 0a 4e 43 5f 30 30 34 37 34 34 0d 0d 0a |2...NC_004744...|
00000030 4e 43 5f 30 30 33 30 39 36 0d 0d 0a 4e 43 5f 30 |NC_003096...NC_0|
00000040 30 35 38 30 33 0d 0d 0a 4e 43 5f 30 30 36 35 33 |05803...NC_00653|
00000050 31 0d 0d 0a 4e 43 5f 30 30 34 34 31 37 0d 0d 0a |1...NC_004417...|
00000060 4e 43 5f 30 31 33 36 33 33 0d 0d 0a 4e 43 5f 30 |NC_013633...NC_0|
00000070 31 33 36 31 38 0d 0d 0a 4e 43 5f 30 30 32 37 36 |13618...NC_00276|
00000080 31 0d 0d 0a 4e 43 5f 30 31 33 36 32 38 0d 0d 0a |1...NC_013628...|
00000090 4e 43 5f 30 30 35 32 39 39 0d 0d 0a 4e 43 5f 30 |NC_005299...NC_0|
000000a0 31 33 36 30 39 0d 0d 0a 4e 43 5f 30 31 33 36 31 |13609...NC_01361|
000000b0 32 0d 0d 0a 4e 43 5f 30 30 32 36 34 36 0d 0d 0a |2...NC_002646...|
000000c0 4e 43 5f 30 30 34 35 39 35 0d 0d 0a 4e 43 5f 30 |NC_004595...NC_0|
000000d0 30 32 37 33 34 0d 0d 0a 4e 43 5f 30 30 34 35 39 |02734...NC_00459|
000000e0 38 0d 0d 0a 4e 43 5f 30 30 34 35 39 34 0d 0d 0a |8...NC_004594...|
000000f0 4e 43 5f 30 30 38 34 34 38 0d 0d 0a 4e 43 5f 30 |NC_008448...NC_0|
00000100 30 34 35 39 33 0d 0d 0a 4e 43 5f 30 30 32 36 34 |04593...NC_00264|
00000110 37 0d 0d 0a 4e 43 5f 30 30 32 36 37 34 0d 0d 0a |7...NC_002674...|
00000120 4e 43 5f 30 30 33 31 36 33 0d 0d 0a 4e 43 5f 30 |NC_003163...NC_0|
00000130 30 33 31 36 34 0d 0d 0a 4e 43 5f 30 32 30 31 35 |03164...NC_02015|
Any suggestions? Thanks in advance.
Upvotes: 3
Views: 347
Reputation: 67910
The problem is that this file has Windows-style line endings, so on Linux chomp removes only the trailing \n and leaves the carriage returns behind. Your hexdump shows that each line actually ends in 0d 0d 0a (\r\r\n), so after chomp strips the \n and chop strips one \r, a second \r still remains.
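Your output
';AR6 = 'NC_005803
indicates that the last character in the string is in fact \r. The carriage return sends the cursor back to the start of the line, so the closing '; is printed over the beginning of $VAR6. This is not an actual problem with Data::Dumper, just with how the terminal displays the string. If you want to see this character written out literally, you can set the option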
$Data::Dumper::Useqq = 1;
Which will then produce the output
$VAR6 = "NC_005803\r";
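As a minimal sketch of why one \r survives, the following reproduces the chomp and chop steps on two sample strings. The strings are typed in by hand to match the 0d 0d 0a endings in the hexdump; they are an assumption standing in for the real file.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq = 1;   # print control characters as escapes such as \r

# Sample lines ending in \r\r\n (0d 0d 0a), as in the hexdump of workfile.txt
my @nodes = ("NC_000893\r\r\n", "NC_001778\r\r\n");

chomp @nodes;               # strips only the trailing \n ($/ defaults to "\n")
chop for @nodes;            # strips exactly one character: the last \r

print Dumper(\@nodes);      # each element still ends in one \r
```

With Useqq set, the leftover carriage return shows up as a visible \r escape instead of garbling the terminal line.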
How to fix it?
A simple fix is to run the dos2unix utility on Linux to convert the file. To fix it in Perl, you can use any one of the following:
s/[\r\n]*\z// for @nodes; # remove all \r and \n from end of string
s/\s*\z// for @nodes; # remove all whitespace from end of string
s/\r//g for @nodes; # remove all \r from string
tr/\r//d for @nodes; # same
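Putting one of these substitutions into a complete script might look like the sketch below. The helper name read_ids is hypothetical, and an in-memory string stands in for workfile.txt so the example runs without the file. It uses \s+ so that \n, \r\n and \r\r\n endings are all handled the same way.

```perl
use strict;
use warnings;

# Hypothetical helper: read every line from a filehandle and strip any
# trailing whitespace, which covers \n, \r\n and \r\r\n alike.
sub read_ids {
    my ($fh) = @_;
    my @ids = <$fh>;
    s/\s+\z// for @ids;
    return grep { length } @ids;   # drop lines that end up empty
}

# In-memory data stands in for workfile.txt (an assumption for this sketch).
my $data = "NC_000893\r\r\nNC_001778\r\r\nNC_005252\r\r\n";
open my $fh, '<', \$data or die "Cannot open in-memory file: $!";
my @nodes = read_ids($fh);
close $fh;

# Brackets make any stray trailing character easy to spot.
print "[$_]\n" for @nodes;
```

To read the real file instead, open it with open my $fh, '<', 'workfile.txt' or die ...; and pass that handle to the same helper.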
Upvotes: 3