Marc B. Hankin

Reputation: 751

Python encoding errors from comments containing Windows paths

I want to include Windows paths in Python script comments without causing an encoding error.

If I include a Windows path in a comment, I sometimes get an encoding error, e.g., "UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa6 in position 4612: invalid start byte".
I found one article indicating that including a Windows path in a comment can trigger a Unicode error: https://programmersought.com/article/28013377080/.

On the other hand, I can sometimes include a Windows path in a comment without triggering a Unicode error. I don't understand why some Windows paths trigger errors and others do not.

Here are a few examples of Windows paths, each marked with whether it causes an encoding error:

'''

OK      # E:\Apps\ParticlesByMarc\regularexpression_info_SAVE_aaa_.py
ERROR   # E:\Apps\UnitiesByMarc\regularexpression_info_SAVE_aaa_.py
OK      # E:\Apps\ UnitiesByMarc\regularexpression_info_SAVE_aaa_.py# File 
ERROR   # E:\ Apps\ UnitiesByMarc\xxx\regularexpression_info_SAVE_aaa_py
OK      # E:\ Apps\ UnitiesByMarc\ xxx\regularexpression_info_SAVE_aaa_py
OK      # File E:\ Apps\ UnitiesByMarc\x123x\regularexpression_info_SAVE_aaa_py

'''

I cannot figure out what makes four of those Windows path formats OK to include in a comment, and the other two not OK.

My questions:

  1. Is there something I could do to format the comment so that I would not have to insert a space after each backslash?
  2. If there are other limits on text that can be included in a comment, where can I find a list of those limits?
  3. Where can I find the rules that identify and explain the reason for the limitations?

Any suggestions about how to find the answer would be very welcome.

Thanks, Marc

Upvotes: 1

Views: 351

Answers (1)

JosefZ

Reputation: 30123

A triple-quoted string isn't a comment; it's a string literal, which can become a docstring:

A docstring is a string literal that occurs as the first statement in a module, function, class, or method definition. Such a docstring becomes the __doc__ special attribute of that object.

Example:

def somefunc(somepar):
  r'''
This is a docstring

  E:\Apps\ParticlesByMarc\regularexpression_info_SAVE_aaa_.py
  E:\Apps\UnitiesByMarc\regularexpression_info_SAVE_aaa_.py
# E:\Apps\UnitiesByMarc\regularexpression_info_SAVE_aaa_.py # File 
# E:\Apps\UnitiesByMarc\xxx\regularexpression_info_SAVE_aaa_py
  E:\Apps\UnitiesByMarc\xxx\regularexpression_info_SAVE_aaa_py
# File E:\Apps\UnitiesByMarc\x123x\regularexpression_info_SAVE_aaa_py

  '''
  print('supplied:', somepar, end='\n\n')
  '''
This isn't recognized as a docstring (i.e. not assigned to __doc__)
  '''


somefunc('par')
help(somefunc)

Result: .\SO\68553726.py

supplied: par

Help on function somefunc in module __main__:

somefunc(somepar)
    This is a docstring

      E:\Apps\ParticlesByMarc\regularexpression_info_SAVE_aaa_.py
      E:\Apps\UnitiesByMarc\regularexpression_info_SAVE_aaa_.py
    # E:\Apps\UnitiesByMarc\regularexpression_info_SAVE_aaa_.py # File
    # E:\Apps\UnitiesByMarc\xxx\regularexpression_info_SAVE_aaa_py
      E:\Apps\UnitiesByMarc\xxx\regularexpression_info_SAVE_aaa_py
    # File E:\Apps\UnitiesByMarc\x123x\regularexpression_info_SAVE_aaa_py
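
The r prefix on the docstring above is what keeps the backslashes literal. As a minimal sketch (not part of the original answer), the following compiles each of the six lines from the question in three contexts: inside a plain triple-quoted string, inside a raw triple-quoted string, and as a real # comment. It assumes the lines in the question sat inside a non-raw ''' string exactly as shown there; the names PATH_LINES, compiles, and check are hypothetical helpers introduced only for illustration.

```python
import warnings

# The six path lines from the question, without the OK/ERROR labels.
PATH_LINES = [
    r"# E:\Apps\ParticlesByMarc\regularexpression_info_SAVE_aaa_.py",
    r"# E:\Apps\UnitiesByMarc\regularexpression_info_SAVE_aaa_.py",
    r"# E:\Apps\ UnitiesByMarc\regularexpression_info_SAVE_aaa_.py# File ",
    r"# E:\ Apps\ UnitiesByMarc\xxx\regularexpression_info_SAVE_aaa_py",
    r"# E:\ Apps\ UnitiesByMarc\ xxx\regularexpression_info_SAVE_aaa_py",
    r"# File E:\ Apps\ UnitiesByMarc\x123x\regularexpression_info_SAVE_aaa_py",
]


def compiles(source):
    """Return True if `source` compiles as Python code, False on SyntaxError."""
    with warnings.catch_warnings():
        # Unrecognized escapes such as "\A" or "\ " only trigger a warning,
        # so silence warnings here and look at hard errors only.
        warnings.simplefilter("ignore")
        try:
            compile(source, "<sketch>", "exec")
        except SyntaxError:
            return False
    return True


def check(line):
    """Compile one path line in three contexts (hypothetical helper)."""
    return {
        # Plain triple-quoted string: escape sequences are processed.
        "plain": compiles("'''\n" + line + "\n'''\n"),
        # Raw triple-quoted string: backslashes stay literal.
        "raw": compiles("r'''\n" + line + "\n'''\n"),
        # A real comment: the tokenizer skips everything after '#'.
        "comment": compiles(line + "\n"),
    }


if __name__ == "__main__":
    for line in PATH_LINES:
        print(check(line), line)
```

For the six lines, the "plain" results come out as True, False, True, False, True, True, which matches the OK/ERROR labels in the question, while "raw" and "comment" are True throughout. In a plain string literal, \U must be followed by eight hex digits and \x by two, so \UnitiesByMarc and \xxx are rejected, whereas \x123x passes because \x12 is a valid escape, and a backslash followed by a space is not an escape at all. This accounts for the OK/ERROR pattern only; the "'utf-8' codec can't decode byte 0xa6" message quoted in the question concerns the bytes of the source file and is not reproduced by this sketch.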

Upvotes: 1
