Rob M

Reputation: 11

Adding spaces within a string in Python?

I'm relatively new to Python (using v2.7.3) and I decided to test my skills by editing a text document made up of all the texts I've received on my phone. I want to strip out the useless information, so I wrote a script to do that, but all the spaces between words are being deleted.

Here's a sample of the input data:

sms protocol="932" address="XXXXXXXXXX" date="1305655717379" type="1" subject="null" body="Talk to joey?" toa="null" sc_toa="null" service_center="null" read="1" status="-1" locked="0" date_sent="null" readable_date="May 17, 2011 2:08:37 PM" contact_name="David XXXX" />

Here's a sample of the output data:

body="Talktojoey?"toa="null"sc_toa="null"service_center="null"read="1"status="-1"locked="0"date_sent="null"readable_date="May17,20112:08:37PM"contact_name="DavidXXXX/>

Here's my code:

line = textfile.readline()
for line in textfile:
    line = line.strip()
    line = line.split(' ')
    del line[0:6]
    line.append("\n")
    print line
    output.writelines(line)

textfile.close()
output.close()

Any help on how to add spaces would be greatly appreciated. Thanks!

Upvotes: 1

Views: 4771

Answers (3)

Amber

Reputation: 527418

This bit...

line = line.split(' ')     

removes the spaces when it splits the line into pieces, and writelines does not put them back. Join the pieces with a space before writing:

line = line.split(' ')     
del line[0:6]
line = ' '.join(line)
line += "\n"
print line,
output.write(line)
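
A self-contained sketch of the same split, delete, and join steps, for illustration only. The helper name drop_leading_fields and its count parameter are hypothetical, and the sample line is abridged from the question's input. Splitting on spaces also breaks up quoted values that contain spaces; that is harmless here only because the pieces are joined back in the same order.

```python
def drop_leading_fields(line, count=6):
    """Drop the first `count` space-separated fields, keeping the spaces between the rest.

    Hypothetical helper; `count=6` mirrors `del line[0:6]` from the question.
    """
    fields = line.strip().split(' ')
    return ' '.join(fields[count:])


# Abridged copy of the sample input from the question.
sample = ('sms protocol="932" address="XXXXXXXXXX" date="1305655717379" '
          'type="1" subject="null" body="Talk to joey?" toa="null" '
          'contact_name="David XXXX" />')

result = drop_leading_fields(sample)
print(result)
```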

Upvotes: 1

Alex

Reputation: 2420

If you look closely at your line of data, you will see that it is a fragment of XML that is missing the leading '<'. If you add the '<', you have a complete 'sms' XML element.

>>> input = '<sms protocol="932" address="XXXXXXXXXX" date="1305655717379" type="1" subject="null" body="Talk to joey?" toa="null" sc_toa="null" service_center="null" read="1" status="-1" locked="0" date_sent="null" readable_date="May 17, 2011 2:08:37 PM" contact_name="David XXXX" />'

Now we can process this with something like ElementTree.

>>> import xml.etree.ElementTree as ET
>>> element = ET.fromstring(input)

Now you can access the tag's attributes as a friendly dictionary.

>>> element.attrib 
{'body': 'Talk to joey?', 'service_center': 'null', 'protocol': '932', 'read': '1', 'sc_toa': 'null', 'readable_date': 'May 17, 2011 2:08:37 PM', 'date': '1305655717379', 'status': '-1', 'address': 'XXXXXXXXXX', 'date_sent': 'null', 'locked': '0', 'contact_name': 'David XXXX', 'toa': 'null', 'type': '1', 'subject': 'null'}
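
Putting those steps together, here is a minimal sketch of how a single line might be handled. The function name parse_sms_line and the summary format are assumptions made for illustration, and the sketch assumes every line holds exactly one self-closing sms element, which may or may not be true of the whole file.

```python
import xml.etree.ElementTree as ET


def parse_sms_line(line):
    """Parse one exported line into a dict of its attributes.

    Hypothetical helper. Assumes the line is a single self-closing sms
    element that may be missing its leading '<', as in the question's sample.
    """
    line = line.strip()
    if not line.startswith('<'):
        line = '<' + line
    return ET.fromstring(line).attrib


# The sample input from the question, without the leading '<'.
sample = ('sms protocol="932" address="XXXXXXXXXX" date="1305655717379" '
          'type="1" subject="null" body="Talk to joey?" toa="null" '
          'sc_toa="null" service_center="null" read="1" status="-1" '
          'locked="0" date_sent="null" '
          'readable_date="May 17, 2011 2:08:37 PM" '
          'contact_name="David XXXX" />')

attrs = parse_sms_line(sample)

# Pick out only the fields of interest; values keep their internal spaces.
summary = '%s (%s): %s' % (attrs['contact_name'],
                           attrs['readable_date'],
                           attrs['body'])
print(summary)
```

Because the attribute values are read by an XML parser rather than split on spaces, there is nothing to join back together afterwards.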

Upvotes: 2

abarnert

Reputation: 366123

The problem here is that you're calling output.writelines(line).

I'm not sure what you expected that to do when given a list of strings, but writelines simply writes each string in turn without adding anything between them, so the words run together. Those words aren't separate lines, and you don't want to write them as if they were.

So, how do you join a list of words into a single string, with spaces separating the words? Using the join method:

' '.join(line)

And then, instead of using writelines (which expects multiple lines), just use write:

output.write(' '.join(line))

See the tutorial on Input and Output for the differences between write and writelines (and other things).
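
To see the difference concretely, here is a small sketch that uses in-memory buffers in place of the output file. The word list is a shortened stand-in for the question's data, and the code is written for Python 3; on Python 2.7, io.StringIO expects unicode strings.

```python
import io

# Shortened stand-in for the list produced by line.split(' ') in the question.
words = ['body="Talk', 'to', 'joey?"', 'toa="null"']

# writelines writes each string as-is; it adds no separators or newlines.
run_together = io.StringIO()
run_together.writelines(words)

# Joining first restores the spaces; write then emits one complete line.
spaced = io.StringIO()
spaced.write(' '.join(words) + '\n')

print(repr(run_together.getvalue()))
print(repr(spaced.getvalue()))
```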

Upvotes: 0
