How to format a list containing tags in python

Question

I have a list called tokens and would like to format this list so that when I print it, it is human readable.

The list:

tokens = ['<h1>','Hello','World','</h1>','<p>','Welcome','to','this','planet','</p>']

What I would like the output to look like once formatted:

Heading: Hello World

Paragraph: Welcome to this planet

What I have tried so far:

I first tried to replace the <h1> and <p> tags so that the output shows 'Heading: ' and 'Paragraph: ' instead. I used a for loop to go through all the tokens and find the tags to replace:

for token in tokens:
    # compare each token against the known opening tags
    if token == '<h1>':
        print(token.replace('<h1>', 'Heading: '))
    elif token == '<p>':
        print(token.replace('<p>', 'Paragraph: '))

The next part I need to do is print out the sentences between the <h1> tags and between the <p> tags. For this I thought of creating a function; the general pseudocode is:

def between(tokens, tag, endTag)
    if token is between tag and endTag
        print the sentence

I don't really know how to get this method to work in python and have tried something like this:

def between(tokens, tag, endTag):
    sentence = []
    for token in tokens:
        if(token > tag and token < endTag):
            sentence.append(token)
    return sentence

but I know the if statement does not make sense and does not work out overall. How can I solve this problem and format the list correctly?

DYZ · Accepted Answer

You can create a dictionary of human-readable tag names and replace a tag with its name. If a token is not a tag, it is not replaced.

tags = {"<h1>" : 'Heading1: ', "</h1>" : "\n",
        "<p>" : "Paragraph: ", "</p>" : "\n", ... }
new_tokens = [tags.get(token.lower(),token) for token in tokens]
print("".join(new_tokens))
#Heading1: HelloWorld
#Paragraph: Welcometothisplanet

The .lower() method call makes the lookup case-insensitive, as long as the dictionary keys are written in lowercase.
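
Note that "".join runs the words together, as the output comments show. If you want spaces between the words and one line per section, one possible approach is to collect the words between an opening tag and the next closing tag, then join them with a space. The following is only a sketch: the function name format_tokens, the LABELS mapping, and the <h1> and <p> tag names are assumptions for illustration, and any token starting with "</" is treated as a closing tag.

```python
# Hypothetical mapping from opening tags to labels (assumed tag names).
LABELS = {"<h1>": "Heading", "<p>": "Paragraph"}


def format_tokens(tokens, labels=LABELS):
    """Return one 'Label: words' string per tagged section."""
    lines = []
    label = None  # label of the section currently open, if any
    words = []    # words collected since the last opening tag
    for token in tokens:
        key = token.lower()  # case-insensitive tag matching
        if key in labels:
            # Opening tag: start a new section.
            label, words = labels[key], []
        elif key.startswith("</"):
            # Closing tag: finish the current section, if one is open.
            if label is not None:
                lines.append(f"{label}: {' '.join(words)}")
            label = None
        elif label is not None:
            # Ordinary word inside a section.
            words.append(token)
    return lines


tokens = ['<h1>', 'Hello', 'World', '</h1>',
          '<p>', 'Welcome', 'to', 'this', 'planet', '</p>']
print("\n\n".join(format_tokens(tokens)))
# Heading: Hello World
#
# Paragraph: Welcome to this planet
```

Words that appear outside any tagged section are ignored in this sketch; adjust that branch if you need to keep them.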
