user2052436

Reputation: 4765

What does strip() consider whitespace?

Python's strip functions remove whitespace by default.

What is Python's whitespace?

Is it the same as isspace in C/C++, i.e. does it include newline, vertical tab, etc.?

Upvotes: 2

Views: 140

Answers (2)

user2357112

Reputation: 280867

Python's definition of whitespace, as used by str.strip and str.isspace, is as follows:

A character is whitespace if in the Unicode character database (see unicodedata), either its general category is Zs (“Separator, space”), or its bidirectional class is one of WS, B, or S.
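To make that rule concrete, here is a minimal sketch that re-implements it with the unicodedata module and compares the result with str.isspace for a few sample characters. The helper name is_whitespace_by_rule is made up for illustration; it is not part of Python.

```python
import unicodedata


def is_whitespace_by_rule(ch: str) -> bool:
    """Hypothetical helper: apply the documented rule to a single character."""
    return (
        unicodedata.category(ch) == "Zs"
        or unicodedata.bidirectional(ch) in ("WS", "B", "S")
    )


# A few sample characters: tab, space, unit separator, no-break space,
# line separator, a letter, and zero-width space (which is NOT whitespace).
samples = ["\t", " ", "\x1f", "\u00a0", "\u2028", "a", "\u200b"]

for ch in samples:
    print(f"U+{ord(ch):04X}", is_whitespace_by_rule(ch), ch.isspace())
```

For each sample the two columns should agree. Note that U+200B ZERO WIDTH SPACE is not whitespace under this rule, despite its name.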

This is different from C's isspace, as it includes Unicode characters outside the ASCII range, as well as a few ASCII characters that C's isspace does not count as whitespace (the control characters 0x1C through 0x1F). It is also different from string.whitespace, even for ASCII characters.
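A quick way to see the difference from string.whitespace is the sketch below. It only uses str.isspace, str.strip, and string.whitespace; the variable names and the sample string are made up for illustration.

```python
import string

# ASCII characters that str.isspace() accepts but string.whitespace lacks.
ascii_by_isspace = {chr(c) for c in range(128) if chr(c).isspace()}
only_in_isspace = sorted(ascii_by_isspace - set(string.whitespace))
print([f"0x{ord(c):02X}" for c in only_in_isspace])

# A sample string padded with ASCII separators and non-ASCII spaces.
sample = "\x1f\u00a0 text \u2003\x1c"

# strip() with no argument uses the Unicode-aware definition.
stripped_default = sample.strip()

# strip(string.whitespace) only removes the six characters in that constant,
# so it stops at the first character outside it on each side.
stripped_ascii = sample.strip(string.whitespace)

print(repr(stripped_default))
print(repr(stripped_ascii))
```

Here the default strip() leaves just 'text', while strip(string.whitespace) returns the string unchanged because the outermost characters (0x1F and 0x1C) are not in string.whitespace.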

As of CPython 3.8.1, the complete list (as defined in the source code, and subject to change) is as follows:

/* Returns 1 for Unicode characters having the bidirectional
 * type 'WS', 'B' or 'S' or the category 'Zs', 0 otherwise.
 */
int _PyUnicode_IsWhitespace(const Py_UCS4 ch)
{
    switch (ch) {
    case 0x0009:
    case 0x000A:
    case 0x000B:
    case 0x000C:
    case 0x000D:
    case 0x001C:
    case 0x001D:
    case 0x001E:
    case 0x001F:
    case 0x0020:
    case 0x0085:
    case 0x00A0:
    case 0x1680:
    case 0x2000:
    case 0x2001:
    case 0x2002:
    case 0x2003:
    case 0x2004:
    case 0x2005:
    case 0x2006:
    case 0x2007:
    case 0x2008:
    case 0x2009:
    case 0x200A:
    case 0x2028:
    case 0x2029:
    case 0x202F:
    case 0x205F:
    case 0x3000:
        return 1;
    }
    return 0;
}
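Since the list is subject to change, it may be more reliable to ask the running interpreter directly. A minimal sketch that enumerates every code point for which str.isspace returns True (the variable name is arbitrary):

```python
import sys

# Scan the whole Unicode range supported by this interpreter.
whitespace_code_points = [
    cp for cp in range(sys.maxunicode + 1) if chr(cp).isspace()
]

print(len(whitespace_code_points))
print(" ".join(f"U+{cp:04X}" for cp in whitespace_code_points))
```

The output depends on the Unicode database bundled with your Python version, so compare it against the list above rather than assuming the two match.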

Upvotes: 4

Aaron Bentley

Reputation: 1380

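Yes, it includes newline and vertical tab. The common ASCII whitespace characters (space, tab, newline, carriage return, vertical tab, and form feed) are accessible as string.whitespace; note that strip() with no arguments also removes other Unicode whitespace that this constant does not contain.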

https://docs.python.org/3.8/library/string.html?highlight=whitespace#string.whitespace

Upvotes: 1
