Manish Shah

Reputation: 401

Python Converting string into a list ignoring the special characters

I have a string like this:

'Current Weather\n\t\n\n\t\t12:36 AM\n\t\n\n\n\n\t\t\t22°\n\t\t\n\n\t\t\t\tC\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t20°\n\t\n\n\t\tMostly clear'

I want to convert it into a list like this:

['Current Weather','12:36 AM','22°','C','RealFeel®','20°','Mostly clear']

Is there a Python module or function that can do this?

Upvotes: 1

Views: 994

Answers (4)

user2390182

Reputation: 73450

You can use re.split:

import re

s = 'Current Weather\n\t\n.....t\tMostly clear'
re.split(r'[\n\t]+', s)

Output:

['Current Weather', '12:36 AM', '22°', 'C', 'RealFeel®', '20°', 'Mostly clear']
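The string in the snippet above is abbreviated, so here is a minimal self-contained sketch that uses the full string from the question. The helper name `split_on_newlines_and_tabs` is hypothetical. The extra filter is an addition of mine: `re.split` returns empty strings when the input starts or ends with a separator, which does not happen with the question's string but could with similar input.

```python
import re

# Full string from the question, split across two literals for readability
s = ('Current Weather\n\t\n\n\t\t12:36 AM\n\t\n\n\n\n\t\t\t22°\n\t\t\n\n\t\t\t\tC'
     '\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t20°\n\t\n\n\t\tMostly clear')

def split_on_newlines_and_tabs(text):
    # Split on runs of newlines/tabs, then drop the empty pieces that
    # re.split produces for leading or trailing separators
    return [part for part in re.split(r'[\n\t]+', text) if part]

print(split_on_newlines_and_tabs(s))
# ['Current Weather', '12:36 AM', '22°', 'C', 'RealFeel®', '20°', 'Mostly clear']

print(split_on_newlines_and_tabs('\n\tleading and trailing\n'))
# ['leading and trailing']
```

Spaces are not in the character class, so 'Current Weather' and '12:36 AM' stay intact.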

Upvotes: 5

Sayed Hisham

Reputation: 51

You could use Python regex. Here is an example:

import re

def sentence_to_list(sentence):
    # Split on a tab or newline, then consume any whitespace that follows it
    return re.split(r'[\t\n]\s*', sentence)

strr = 'Current Weather\n\t\n\n\t\t12:36 AM\n\t\n\n\n\n\t\t\t22°\n\t\t\n\n\t\t\t\tC\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t20°\n\t\n\n\t\tMostly clear'
newstrr = sentence_to_list(strr)
print(newstrr)

output:

['Current Weather', '12:36 AM', '22°', 'C', 'RealFeel®', '20°', 'Mostly clear']

You can read more about the re module at https://docs.python.org/3/library/re.html

Upvotes: 0

xkcdjerry

Reputation: 983

Why is everybody using re? In the benchmarks below it is slower than plain string methods. You can just use str.split. If you pass it an argument, you have to filter out the whitespace-only pieces yourself with str.isspace, but it is still pretty fast. This is the code:

>>> [i.strip() for i in s.split('\n\t') if not i.isspace()]
['Current Weather', '12:36 AM', '22°', 'C', 'RealFeel®', '20°', 'Mostly clear']
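One caveat, shown as a small sketch: `''.isspace()` is `False`, so if two `'\n\t'` separators ever sit next to each other, `str.split` yields an empty string that passes the `isspace` filter. That does not occur in the question's string, and the input below is made up purely to illustrate it. Filtering on the stripped value avoids the problem.

```python
# Hypothetical input where two '\n\t' separators are adjacent
s = 'a\n\t\n\tb'

# The isspace() filter lets the empty piece through, because ''.isspace() is False
with_isspace = [i.strip() for i in s.split('\n\t') if not i.isspace()]

# Filtering on the stripped value drops empty and whitespace-only pieces alike
with_strip = [i.strip() for i in s.split('\n\t') if i.strip()]

print(with_isspace)  # ['a', '', 'b']
print(with_strip)    # ['a', 'b']
```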

Benchmarks:

>>> import timeit
>>> timeit.timeit(r"re.split(r'[\n\t]+', s)", r"""
import re
s = 'Current Weather\n\t\n\n\t\t12:36 AM\n\t\n\n\n\n\t\t\t22°\n\t\t\n\n\t\t\t\tC\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t20°\n\t\n\n\t\tMostly clear'
""")
2.8587728
>>> timeit.timeit(r"[i.strip() for i in s.split('\n\t') if not i.isspace()]", r"""import re

s = 'Current Weather\n\t\n\n\t\t12:36 AM\n\t\n\n\n\n\t\t\t22°\n\t\t\n\n\t\t\t\tC\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t20°\n\t\n\n\t\tMostly clear'
""")
1.8853902
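Timings like these depend on the machine and the Python version, so it may be worth rerunning them. The script below is one possible way to do that as a standalone file; the `candidates` dictionary and the repeat count are my own choices, and it also includes a `splitlines` variant for comparison. It first checks that all approaches agree on the result. The lambdas add a small, equal call overhead to each measurement.

```python
import re
import timeit

# Full string from the question, split across two literals for readability
s = ('Current Weather\n\t\n\n\t\t12:36 AM\n\t\n\n\n\n\t\t\t22°\n\t\t\n\n\t\t\t\tC'
     '\n\t\t\t\n\n\n\t\tRealFeel®\n\t\t20°\n\t\n\n\t\tMostly clear')

# Each candidate is a zero-argument callable so timeit can call it directly
candidates = {
    're.split': lambda: re.split(r'[\n\t]+', s),
    'str.split': lambda: [i.strip() for i in s.split('\n\t') if not i.isspace()],
    'splitlines': lambda: [x.strip() for x in s.splitlines() if x.strip()],
}

# Sanity check: every approach should produce the same list for this input
results = {name: fn() for name, fn in candidates.items()}
assert len({tuple(r) for r in results.values()}) == 1

# Time each approach; absolute numbers will differ from machine to machine
timings = {name: timeit.timeit(fn, number=100_000) for name, fn in candidates.items()}
for name, seconds in timings.items():
    print(f'{name}: {seconds:.3f} s')
```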

Upvotes: 2

Pygirl

Reputation: 13349

Without regex:

[x.strip() for x in st.splitlines() if x.strip()]

output:

['Current Weather', '12:36 AM', '22°', 'C', 'RealFeel®', '20°', 'Mostly clear']
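As a rough illustration of why this works beyond the exact string in the question: `str.splitlines` breaks on `'\n'` and `'\r\n'` alike, and `strip` removes surrounding tabs and spaces, so blank or whitespace-only lines are dropped. The input below is made up for the example.

```python
# Hypothetical input mixing Windows line endings, tabs, and padding spaces
st = 'Current Weather\r\n \t12:36 AM\n\n  Mostly clear  \n'

# Keep only lines that still have content after stripping whitespace
lines = [x.strip() for x in st.splitlines() if x.strip()]

print(lines)  # ['Current Weather', '12:36 AM', 'Mostly clear']
```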

Upvotes: 0
