Reputation: 568

Explain the 3 different flavors of string eval in perl

I don't understand the explanations in the Perl documentation for the 3 different string eval types. It doesn't help that the docs are garbled, with text missing, so that the beginning of a sentence doesn't fit with its end.

The three flavors are

eval
evalbytes^[1]
eval with use feature qw( unicode_eval );

[1] Requires use feature qw( evalbytes ); or must be called as CORE::evalbytes.

Upvotes: 1

Answers (2)

khw

Reputation: 568

I ended up doing a bunch of experiments to get the answer to this question. I had already ruled out Schwern's answer before I posted the question. The documentation that was somewhat garbled was in perlfunc. (That and the documentation in feature.pm largely overlapped.) The result is that I've changed the documentation to reflect what I found. This is now in perl 5.25.10, and should be in 5.26. I changed the feature.pm documentation to be just a short summary with no detail, linking to perlfunc for the meat of the behavior. Here is a link to the difference listing of the commit that changed this. I believe that it adequately describes the way things actually work. But patches are welcome.

commit diff

The text is weak on the flaws that existed before the features were added. I examined all the emails I could find on the subject from the perl5-porters mailing list, but didn't find anything beyond what I wrote in the patch.

Upvotes: 1

Schwern

Reputation: 164629

use feature qw( unicode_eval ) is there to fix confusing quirks with vanilla eval that cannot be fixed without breaking backwards compatibility.

eval behaves differently depending on the internal encoding of the string, sometimes treating its argument as a string of bytes, and sometimes as a string of characters.

Source filters activated within eval leak out into whichever file scope is currently being compiled.

With use feature qw( unicode_eval ) this changes. Now eval always treats its code as characters (i.e. UTF-8) and source filters do not leak out. This is most likely the behavior you want.
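To make the "always characters" point concrete, here is a minimal sketch. It assumes Perl 5.16 or newer and a script that does not itself say use utf8. The variable names are only for illustration. The same source text is evaluated once stored as one byte per character and once stored as UTF-8 internally, and under unicode_eval both give the same answer.

```perl
use strict;
use warnings;
use feature 'unicode_eval';   # available from Perl 5.16

# The same source text, length("é"), held in two internal encodings.
my $src = 'length("' . "\xE9" . '")';
my $downgraded = $src;  utf8::downgrade($downgraded);  # one byte per character
my $upgraded   = $src;  utf8::upgrade($upgraded);      # UTF-8 internally

# Under unicode_eval the internal storage does not change the meaning.
my $len_down = eval $downgraded;
my $len_up   = eval $upgraded;

print "downgraded: $len_down, upgraded: $len_up\n";   # both are 1
```

The point is only that the result does not depend on how perl happens to store the string. It does not demonstrate the source filter part.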

For those who really, really want code that's interpreted as bytes (i.e. ASCII), there is evalbytes, but you probably don't need that.
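For comparison, a small sketch of evalbytes, again assuming Perl 5.16+ and no use utf8 in the surrounding script. The string handed to evalbytes holds the two UTF-8 bytes for "é", and since the code is parsed as bytes, the literal inside it is two bytes long rather than one character.

```perl
use strict;
use warnings;
use feature 'evalbytes';      # or call CORE::evalbytes without the feature

# "é" encoded as the two UTF-8 bytes C3 A9, embedded in a length() call.
my $byte_src = 'length("' . "\xC3\xA9" . '")';

# evalbytes parses its argument as bytes, so the literal has length 2.
my $byte_len = evalbytes $byte_src;

print "evalbytes sees $byte_len bytes\n";   # 2
```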

tl;dr: If you're using 5.16 or newer and you're using eval (which you probably shouldn't), use feature qw( unicode_eval ) and eval. It supports UTF-8 and it fixes eval quirks.

Or just use utf8::all and forget about it.
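As a rough sketch of the tl;dr: use v5.16 loads the :5.16 feature bundle, which includes both unicode_eval and evalbytes, so you don't have to name the features one by one. The code strings here are plain ASCII, so both calls behave the same. It is only meant to show how the features get switched on.

```perl
use v5.16;        # :5.16 bundle: includes unicode_eval and evalbytes (and say)
use warnings;

my $sum       = eval "1 + 2";        # string eval, now under unicode_eval
my $sum_bytes = evalbytes "1 + 2";   # byte-oriented variant
die $@ if $@;                        # report a compile or runtime error, if any

say "eval: $sum, evalbytes: $sum_bytes";   # eval: 3, evalbytes: 3
```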

Upvotes: 6
