dot

Reputation: 87

Python - Joining characters of string with space

I'm trying to handle a string input. First I joined the input with \n so that each word is on its own line (which is what I need):

some
random
words
written

and transform it into something like this:

s o m e
r a n d o m
w o r d s
w r i t t e n

But for some reason I'm getting an extra space at the start of some lines, though not every line. There are no extra spaces in the input; I checked specifically. I'm not sure where those extra spaces are coming from.

Here's my code:

import re

input = "some random words written"
string = '\n'.join(re.findall(r"\w{4,}", input))  # regex because I need the words to be at least 4 characters
space = " ".join(string)
print(space)

This gives me something like this:

s o m e
 r a n d o m
 w o r d s
 w r i t t e n

Does anyone have a clue why?

Upvotes: 3

Views: 10043

Answers (4)

Pedro Lobito

Reputation: 98981

You can use a generator expression instead of a regex, i.e.:

print("\n".join(' '.join(x) for x in input.split() if len(x) > 3))

If you really need a regex, use:

print("\n".join(' '.join(x) for x in re.findall(r'\w{4,}', input)))

Output:

s o m e
r a n d o m
w o r d s
w r i t t e n

Upvotes: 0

rassar

Reputation: 5670

Try this:

'\n'.join(' '.join(i) for i in text.split() if len(i) >= 4)

First, find all words with four or more letters.

Next, join each of those words with a space. Since str is iterable, join puts a space between each pair of letters.

Then join the results with \n and you're done!

>>> text = "some random words written"
>>> print('\n'.join(' '.join(i) for i in text.split() if len(i) >= 4))
s o m e
r a n d o m
w o r d s
w r i t t e n

Your solution does not work because join puts a space between every pair of adjacent characters, and the newline is just another character. Each \n therefore gets a space before it and a space after it, and the space after it shows up at the start of the next line.
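To make the stray spaces visible, here is a small sketch that reproduces the question's approach and then joins each line separately. The variable names (text, joined, spaced, fixed) are my own, not from the question.

```python
import re

text = "some random words written"
joined = "\n".join(re.findall(r"\w{4,}", text))

# str.join iterates over every character of `joined`, including "\n",
# so each newline ends up with a space on both sides.
spaced = " ".join(joined)
print(repr(spaced))

# Joining each line on its own keeps the newlines out of the " ".join call.
fixed = "\n".join(" ".join(word) for word in joined.split("\n"))
print(fixed)
```

Printing with repr shows the trailing space before each \n as well as the leading one after it.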

Upvotes: 0

hiro protagonist

Reputation: 46899

You could do that with one generator expression and without a regex:

strg = "some random words written"
print('\n'.join(' '.join(word) for word in strg.split() if len(word) > 3))

This started out the same way as this answer; mine is very similar, but since my solution is a little shorter I posted it anyway.

Also, input is a built-in; avoid using built-in names as variable names.
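A quick sketch of why shadowing input can bite later, assuming the same script also wants to read from the user. The error variable is only there for illustration.

```python
input = "some random words written"  # shadows the built-in input()

try:
    # The name now refers to a str, so calling it fails.
    input("Enter some text: ")
except TypeError as exc:
    error = str(exc)
    print(error)

del input  # removes the shadowing name; the built-in is visible again
```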

Upvotes: 0

Jim Dennis

Reputation: 17510

I would not use regular expressions for this.

[x for x in input.split() if len(x) > 3]

... will filter out words of fewer than 4 characters.

[' '.join(y) for y in [x for x in input.split() if len(x) > 3]]

... will turn that into a list of "words" with each word "spaced out."

So you can do it all with:

'\n'.join([' '.join(y) for y in [x for x in input.split() if len(x) > 3]])

It's often best to build up your functional code snippets using an iterative "bottom up" approach such as I've shown here. Also, regular expressions tend to be slower than plain string methods and are easier to get subtly wrong. You're relying on a sophisticated and complex set of parsers to interpret and apply your regular expressions. When you can avoid them, it's usually good to do so. The code is likely to run faster and be more robust.
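The same bottom-up steps can be written out as a small function with one named intermediate per step. This is only a sketch; the function name space_out and the min_len parameter are my own, not part of the question.

```python
def space_out(text, min_len=4):
    """Put each word of at least min_len characters on its own line, letters spaced out."""
    # Step 1: keep only the words that are long enough.
    words = [w for w in text.split() if len(w) >= min_len]
    # Step 2: space out the letters of each word.
    spaced = [" ".join(w) for w in words]
    # Step 3: one word per line.
    return "\n".join(spaced)


print(space_out("some random words written"))
```

Note that split() only splits on whitespace, so punctuation attached to a word stays with it, whereas \w{4,} in the question's regex would skip it.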

Upvotes: 2
