Animesh Sheolikar
Animesh Sheolikar

Reputation: 75

How to escape comma in csv perl

In my csv file there are thousands of records.And those records contain data with comma.How i should escape .I seen few links which mentioned about using module TEXT::CSV.But still issue persist ie recofds gets break where ever comma comes. Here is sample code i have been working:

 use Data::Dumper qw(Dumper);
 my $file = $ARGV[0] or die "Need to get CSV file on the command line\n";
 use Text::CSV;
 my $csv = Text::CSV->new({ sep_char => ',' });
 my $file = 'Pkg_Notes_small.csv';
 my %data;
  my %dupes;
  open(my $data, '<', $file) or die "Could not open '$file' $!\n";

 while (my $line = <$data>) 
 {

  my $catalog_num ='';
  my $catalog_sid ='';
  my $note_text ='';
  my $packing_notes_id ='';
  #($catalog_num,$catalog_sid,$packing_notes_id,$note_text) = split ',', $line;
  if ($csv->parse($line)) 
  {
      my @fields = $csv->fields();
      $catalog_num = $fields[0];
      $catalog_sid = $fields[1];
      $packing_notes_id = $fields[2];
      $note_text = $fields[3];
     }
   }

i will add sample data from csv file :

CATALOG_NUM  CATALOG_SID    PACKAGING_NOTES_SID    PACKAGING_NOTES_DESC


112194  ,     521     ,        77      ,        For WI Packaging Operations: For 
                               finishing  operations, the  use of a protective 

comma used above for clarity. Now converting this data to query based on my requirement for which i am getting:

for which Notetext field i am getting is :

    For WI Packaging Operations: For finishing operations  

remaining part for this line after processing is missed out.the data iam expecting is :

  For WI Packaging Operations: For 
  finishing  operations, the  use of a protective

Upvotes: 2

Views: 1711

Answers (1)

Dave Cross
Dave Cross

Reputation: 69244

It is possible to have commas in records in CSV files if the fields that contain them are quoted.

field1,field2,"field3, with embedded comma",field4

But Text::CSV knows all about that, so if you're using Text::CSV and still getting truncated records, then it seems likely that the problem is with your input data. If the embedded commas aren't quoted then obviously there is no way for the parser to differentiate between separator commas and commas that are part of the text.

field1,field2,field3, with embedded comma,field4

In the example above, how can the computer possibly know that the third comma should be ignored?

If this is your problem and you can't get the input data fixed, then there is one possible workaround. If the extra commas only ever exist in the last field of a record, then you can use the little-known third parameter to split() to limit the number of fields created from a record.

So if your record looks like this:

field1,field2,field3,field4 with, embedded, commas

You can use:

my @data = split /,/ $input, 4;

to force split() to only split the data into four fields. Anything after the third comma is all put into the fourth field.

Usually it's a very bad idea to parse CSV data using split() (because it doesn't know to ignore quoted commas), but in controlled circumstances like this, it can work well.

Update: Now I have seen some (well, one line!) of your input data, I see that my prediction was right. You have unquoted commas in your data. So using Text::CSV is not going to work.

But my split() solution works fine. Here's an example:

#!/usr/bin/perl

use strict;
use warnings;
use 5.010;

while (<DATA>) {
  my @fields = split /,/, $_, 4;
  say $fields[3];
}

__END__
112194 , 521 , 77 , For WI Packaging Operations: For finishing operations, the use of a protective

And the output I get is:

For WI Packaging Operations: For finishing operations, the use of a protective

Upvotes: 6

Related Questions