LuxLunae

Reputation: 67

How to extract words out of a string with no spaces in python?

I am still fairly new to Python, and I am stuck on a problem that I don't know how to solve.

Say we have a string like "ThisThingIsCool" or "thisthingiscool".

I need to turn it into a list like ['This', 'Thing', 'Is', 'Cool'] or ['this', 'thing', 'is', 'cool'].

Currently I am using TextBlob, but I am not sure whether it has a way to do this.

I downloaded the corpus (I am guessing that it's a list of words), but did not see any function that recognizes words in a run-together string and returns them as a list.

So I would settle for at least being able to split the version with capitalized letters. However, I have no clue how to go about that in Python.

So the questions are:

  1. How do I recognize capitalized letters?

  2. How do I split the string without having the delimiter consumed?

  3. Is there something in TextBlob that already does this?

Thank You

Upvotes: 0

Views: 2220

Answers (3)

Reena

Reputation: 1

s = "This is my Name"
new_s = s.split()
print(new_s)

['This', 'is', 'my', 'Name']

Upvotes: -1

Ahasanul Haque

Reputation: 11134

Use the re module.

>>> a = 'ThisThingIsCool'
>>> import re
>>> re.findall(r'[A-Z][a-z]*', a)
['This', 'Thing', 'Is', 'Cool']
>>> [i.lower() for i in re.findall(r'[A-Z][a-z]*', a)]
['this', 'thing', 'is', 'cool']
>>> list(map(str.lower, re.findall(r'[A-Z][a-z]*', a)))
['this', 'thing', 'is', 'cool']
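For the second question (splitting without consuming the delimiter), one possible sketch uses a zero-width lookahead with re.split. It assumes Python 3.7 or later, where re.split can split on empty matches; the function name split_before_capitals is made up for illustration.

```python
import re

def split_before_capitals(text):
    """Split text at each position that precedes an uppercase ASCII letter."""
    # (?=[A-Z]) is a zero-width lookahead: it matches the position just
    # before a capital letter without consuming it, so the letter stays
    # at the start of the following piece.
    pieces = re.split(r'(?=[A-Z])', text)
    # Drop the empty string produced when the text starts with a capital.
    return [piece for piece in pieces if piece]

print(split_before_capitals("ThisThingIsCool"))
# ['This', 'Thing', 'Is', 'Cool']
```

Unlike the findall pattern above, this keeps a leading lowercase run, so "thisThing" comes out as ['this', 'Thing'].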

Upvotes: 0

DYZ

Reputation: 57033

Splitting by capital letters is fairly easy with regular expressions:

import re

s = "ThisThingIsCool"
re.findall(r'[A-Z][^A-Z]*', s)
# ['This', 'Thing', 'Is', 'Cool']

The general solution (splitting an all-lowercase string such as "thisthingiscool") is much harder: it needs a list of known words and probably requires dynamic programming.
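A minimal sketch of the dynamic programming idea, assuming you have a set of known words to match against. The function name segment and the toy vocabulary WORDS are hypothetical; a real vocabulary would come from a word list or corpus. This version returns the first segmentation it finds, not necessarily the most plausible one, so ambiguous inputs would need some scoring (for example by word frequency) on top.

```python
def segment(text, vocabulary):
    """Return one segmentation of text into vocabulary words, or None."""
    n = len(text)
    # best[i] holds a list of words that exactly covers text[:i],
    # or None if no such segmentation has been found.
    best = [None] * (n + 1)
    best[0] = []
    # No candidate word can be longer than the longest vocabulary entry.
    max_len = max(map(len, vocabulary), default=0)
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if best[start] is not None and word in vocabulary:
                best[end] = best[start] + [word]
                break
    return best[n]

# Hypothetical toy vocabulary, for illustration only.
WORDS = {"this", "thing", "is", "cool", "thin"}
print(segment("thisthingiscool", WORDS))
# ['this', 'thing', 'is', 'cool']
```

The table best is what makes this dynamic programming: each prefix is solved once and reused, so a dead end such as "this" + "thin" does not force the search to start over.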

Upvotes: 3
