mikew

Reputation: 924

Determine if a Perl scalar originally had one backslash or two

I have data that I obtain from a network service. The data may legitimately contain a double backslash (\\) or a single backslash (\). Consider the following valid data inputs to my Perl program. I'm not sure how to determine which data originally had a single \ and which had a double \\.

$ cat data.pl
my $data ='=01=00=00=00=DF=FC=D3Y\=03';
my $data2='=01=00=00=00=DF=FC=D3Y\\=03';
print $data;

Notice that the only difference between $data and $data2 in the above code is that $data2 has an extra backslash. I'm not trying to escape backslashes; they are simply valid data in this data stream. Both forms are OK and occur in my data.

Debugging session:

$ perl -d data.pl

Loading DB routines from perl5db.pl version 1.37
Editor support available.

Enter h or 'h h' for help, or 'man perldebug' for more help.

main::(data.pl:1):      my $data='=01=00=00=00=DF=FC=D3Y\=03';
  DB<1> n
main::(data.pl:2):      my $data2='=01=00=00=00=DF=FC=D3Y\\=03';
  DB<1> x $data
0  '=01=00=00=00=DF=FC=D3Y\\=03'
  DB<2> p $data
=01=00=00=00=DF=FC=D3Y\=03
  DB<3> l
2==>    my $data2='=01=00=00=00=DF=FC=D3Y\\=03';
3:      print $data;
  DB<3> n
main::(data.pl:3):      print $data;
  DB<3> x $data2
0  '=01=00=00=00=DF=FC=D3Y\\=03'
  DB<4> p $data2
=01=00=00=00=DF=FC=D3Y\=03

So even though my inputs were different, Perl considers them both the same data, because both \\ and \ end up as a single backslash in a scalar. After the assignment statement, it seems it's over for me: I've lost whether the data had \\ or \.

It seems the PerlIO layer handles this at some level by escaping backslashes before they make it to a scalar? I'm not sure where I should properly escape \ for data coming into my program.

Data flows from an HTTP service through LWP::UserAgent and some Perl classes, and eventually ends up in my program. Is there a way to deal with this \\ vs \ in my data after it gets to a scalar?

EDIT

After further research and input from ikegami, I realize this question is silly: I was confused about how the escaping of backslashes happens in Perl. Anything that accepts input escapes backslashes so that it can be properly represented inside Perl. In my situation, I'm losing some backslashes along the data path in a way that wasn't apparent to me.

$ perl -d data.pl

Loading DB routines from perl5db.pl version 1.37
Editor support available.

Enter h or 'h h' for help, or 'man perldebug' for more help.

main::(data.pl:5):      my $data='{ "data": "=01=00=00=00=DF=FC=D3Y\\\\=03" }';
  DB<1> n
main::(data.pl:6):      my $decoded = decode_json($data);
  DB<2> x $decoded
0  HASH(0x175fcf8)
   'data' => '=01=00=00=00=DF=FC=D3Y\\=03'

In my case, I'd have to re-escape backslashes in data that goes through decode_json.
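
The decoding step in the session above can be reproduced with a short script. This is only a sketch: it assumes the core JSON::PP module, since the post does not say which module provides decode_json, and the variable names are made up for illustration.

```perl
use strict;
use warnings;
use JSON::PP qw(decode_json encode_json);   # assumption: JSON::PP stands in for the module used in the post

# In a single-quoted literal, \\\\ in the source becomes two backslashes in
# the string, so the JSON text handed to the decoder contains ...Y\\=03
my $json = '{ "data": "=01=00=00=00=DF=FC=D3Y\\\\=03" }';

# JSON decoding turns the escaped pair \\ into a single backslash.
my $decoded   = decode_json($json);
my $value     = $decoded->{data};
my $n_decoded = ($value =~ tr/\\//);        # count backslashes in the decoded value

# Encoding the value again restores the JSON escape (two backslashes in the text).
my $reencoded = encode_json({ data => $value });
my $n_encoded = ($reencoded =~ tr/\\//);    # count backslashes in the JSON text

print "decoded value: $value ($n_decoded backslash)\n";
print "re-encoded:    $reencoded ($n_encoded backslashes)\n";
```

If the value is meant to keep two backslashes after decoding, the JSON text itself would need four of them, which is what re-escaping before encoding amounts to.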

Given that the original question is silly and we are now in the realm of very specific details, I'd close this question.

Thanks.

Upvotes: 0

Views: 74

Answers (1)

ikegami

Reputation: 385897

You seem to think my $data2 = '=01=00=00=00=DF=FC=D3Y\\=03'; puts the two-backslash string =01=00=00=00=DF=FC=D3Y\\=03 in the scalar, but that's completely wrong.

The string literal (i.e. the piece of code) '=01=00=00=00=DF=FC=D3Y\\=03' evaluates to the string =01=00=00=00=DF=FC=D3Y\=03. The assignment places that string in the scalar.

Similarly, the string literal '=01=00=00=00=DF=FC=D3Y\=03' evaluates to the string =01=00=00=00=DF=FC=D3Y\=03. The assignment places that string in the scalar.

Similarly, <$fh> evaluates to the string =01=00=00=00=DF=FC=D3Y\=03 (when reading from a file containing =01=00=00=00=DF=FC=D3Y\=03). The assignment places that string in the scalar.

There's no way to tell which of these pieces of code produced the string.
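
The three cases above can be checked with a short script. This is only a sketch: the variable names are illustrative, and an in-memory filehandle stands in for a real file.

```perl
use strict;
use warnings;

# Two different single-quoted literals...
my $one = '=01=00=00=00=DF=FC=D3Y\=03';
my $two = '=01=00=00=00=DF=FC=D3Y\\=03';

# ...evaluate to the same string: each contains exactly one backslash.
my $count_one = ($one =~ tr/\\//);
my $count_two = ($two =~ tr/\\//);
printf "equal: %s, backslashes: %d and %d\n",
    ($one eq $two ? 'yes' : 'no'), $count_one, $count_two;

# Data read from a filehandle is not parsed as a string literal, so whatever
# backslashes the file contains arrive unchanged. Here the "file" holds two.
my $file_contents = "=01=00=00=00=DF=FC=D3Y\\\\=03\n";
open my $fh, '<', \$file_contents or die "open: $!";
chomp(my $line = <$fh>);
close $fh;
my $count_line = ($line =~ tr/\\//);
printf "read from handle: %s (%d backslashes)\n", $line, $count_line;
```

Once the strings exist, nothing records which literal or read produced them, but the content itself is preserved exactly, so a value that really arrived with two backslashes still has two.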


So even though my inputs were different, Perl considers them both the same data, because both \\ and \ end up as a single backslash in a scalar.

That makes no sense. There are no inputs in your example, and Perl is not doing any "considering". You simply have two equivalent pieces of code.
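
Part of the confusion may come from the debugger itself: x prints a string as a single-quoted Perl literal, escaping each backslash, while p prints the raw contents. As a rough sketch, the same effect can be seen outside the debugger with Data::Dumper, which is used here only as a stand-in for the debugger's own dumper.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Terse = 1;   # print just the value, without the "$VAR1 = " prefix

my $data = '=01=00=00=00=DF=FC=D3Y\=03';   # one backslash in the string

# Raw contents, comparable to the debugger's p command.
print "raw:    $data\n";

# Literal-style dump, comparable to the debugger's x command: the single
# backslash is shown doubled so the output is itself a valid Perl literal.
my $dumped = Dumper($data);
print "dumped: $dumped";

my $in_value = ($data   =~ tr/\\//);   # backslashes actually stored
my $in_dump  = ($dumped =~ tr/\\//);   # backslashes shown in the dump
print "stored: $in_value, displayed: $in_dump\n";
```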


Data flows from an HTTP service through LWP::UserAgent and some Perl classes, and eventually ends up in my program. Is there a way to deal with this \\ vs \ in my data after it gets to a scalar?

LWP::UserAgent will provide what the server returned. It doesn't perform any transformation of the kind you are describing.

Console 1:

$ nc -l 8888 <<'.'
HTTP/1.1 200 OK
Content-Type: text/plain

=01=00=00=00=DF=FC=D3Y\=03
=01=00=00=00=DF=FC=D3Y\\=03
.

Console 2:

$ perl -MLWP::UserAgent -e'print LWP::UserAgent->new->get("http://localhost:8888")->content'
=01=00=00=00=DF=FC=D3Y\=03
=01=00=00=00=DF=FC=D3Y\\=03

Upvotes: 4
