Reputation: 845
I'm a Perl newbie trying to parse a tab-delimited file into an array. The only problem I'm having is that my file uses a varying number of tabs for spacing.
Right now I'm doing @data = split("\t");
but this only removes one tab. Is there a way to remove all the tabs when parsing the file?
Upvotes: 2
Views: 6248
Reputation:
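You could do it like this:
#!/usr/bin/perl
use strict;
use warnings;
# Use a lexical filehandle and check that open succeeded
open(my $in, '<', 'tabsfile') or die "Cannot open tabsfile: $!";
my @arr;
while (<$in>) {
    # Collapse each run of whitespace into one space (\s also matches the newline)
    s/\s+/ /g;
    push @arr, $_;
}
close($in);
print @arr;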
Output:
]# cat tabsfile
lkjdlksajdlkajsd kdjlkasjd ;lkwqd;wqd;qwkd;qwkd
lkewjflkjewflewjflwjf lkewjflkejflewjf
djflkajfdljf eljfdlewfjlewfj lkfjewlfkjewlkf lkdjewflkjewlkfjlkewjfew dlkejfdlkjewflkjewlkfjjdlkajdflkjalfdjelfj
dkjklfjldskjfldsjf lkjdslkfjdslkjf:wq
]# perl tabs.pl
lkjdlksajdlkajsd kdjlkasjd ;lkwqd;wqd;qwkd;qwkd lkewjflkjewflewjflwjf lkewjflkejflewjf djflkajfdljf eljfdlewfjlewfj lkfjewlfkjewlkf lkdjewflkjewlkfjlkewjfew dlkejfdlkjewflkjewlkfjjdlkajdflkjalfdjelfj dkjklfjldskjfldsjf lkjdslkfjdslkjf:wq
]#
You can choose what to replace in the regex: \t matches only tabs, while \s matches any whitespace.
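To make the difference between \t and \s concrete, here is a minimal sketch. The sample line is hypothetical, and "|" is used as the replacement only so the result is easy to see.

```perl
use strict;
use warnings;

# Hypothetical sample line: the first field contains a space.
my $line = "first field\t\tsecond\n";

# Replace runs of tabs only; the space inside the field and the newline survive.
(my $tabs_only = $line) =~ s/\t+/|/g;

# Replace runs of any whitespace; the inner space and the newline are replaced too.
(my $any_space = $line) =~ s/\s+/|/g;

print "tabs only: $tabs_only";      # first field|second
print "any space: $any_space\n";    # first|field|second|
```

If fields can contain spaces, matching \t is probably the safer choice.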
Upvotes: 0
Reputation: 208003
Simply replace multiple tabs with single tabs throughout the string before calling split():
# A line with varying numbers of tabs
my $line = "\t\tField1\tField2\t\t\t\t\tField3";
# Replace all occurrences of one or more tabs with a single tab
$line =~ s/\t+/\t/g;
# Now split() on a single tab; the leading tab still yields an empty first field
my @data = split /\t/, $line;
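If the leading tabs in a line like the one above carry no meaning, one option is to strip them before splitting, since split() keeps an empty leading field when the string starts with a separator. A minimal sketch, assuming leading tabs can be discarded:

```perl
use strict;
use warnings;

my $line = "\t\tField1\tField2\t\t\t\t\tField3";

# Without trimming, the leading tabs produce an empty first element.
my @raw = split /\t+/, $line;

# Assumption: leading tabs are just padding, so trim them first.
(my $trimmed = $line) =~ s/^\t+//;
my @fields = split /\t+/, $trimmed;

print scalar(@raw), " fields before trimming\n";    # 4
print scalar(@fields), " fields after trimming\n";  # 3
print join(", ", @fields), "\n";                    # Field1, Field2, Field3
```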
Upvotes: 2
Reputation: 61540
You can split on a regular expression, so to split on one or more tab characters, use:
@data = split("\t+");
Example (Perl debugger):
DB<1> $text = "one\ttwo\t\tthree\t\t\tfour"
DB<2> @data = split("\t+", $text)
DB<3> print join(", ", @data)
one, two, three, four
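Applied to a whole file, the same idea might look like the sketch below. The sample data is hypothetical and is read from an in-memory string so the script runs on its own; with a real file you would pass its name to open() instead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample input standing in for the tab-delimited file.
my $input = "one\ttwo\t\tthree\n" . "a\t\t\tb\tc\n";

# Open the string as an in-memory filehandle.
open(my $fh, '<', \$input) or die "Cannot open in-memory input: $!";

my @rows;
while (my $line = <$fh>) {
    chomp $line;                       # drop the trailing newline
    my @fields = split /\t+/, $line;   # a run of tabs counts as one separator
    push @rows, \@fields;              # store each line as an array reference
}
close($fh);

# Prints "one, two, three" and then "a, b, c"
print join(", ", @$_), "\n" for @rows;
```

Each element of @rows is a reference to that line's fields, so $rows[1][2] is "c" here.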
Upvotes: 4