Santosh Kumar

Reputation: 27875

How to capitalize some words in a text file?

I have a text file that contains ordinary sentences. I was in a hurry while typing it, so I only capitalized the first letter of the first word of each sentence (as English grammar requires).

But now I think it would be better if the first letter of each word were capitalized. Something like:

Each Word of This Sentence is Capitalized

Note that in the sentence above, of and is are not capitalized: I want to skip words that have three letters or fewer.

What should I do?

Upvotes: 4

Views: 3898

Answers (5)

Artur Gaspar

Reputation: 4552

You should split the words, and capitalise only those which are longer than three letters.

words.txt:

each word of this sentence is capitalized
some more words
an other line

-

import string


with open('words.txt') as file:
    # List to store the capitalised lines.
    lines = []
    for line in file:
        # Split words by spaces.
        words = line.split(' ')
        for i, word in enumerate(words):
            if len(word.strip(string.punctuation + string.whitespace)) > 3:
                # Capitalise and replace words longer than 3 (without punctuation).
                words[i] = word.capitalize()
        # Join the capitalised words with spaces.
        lines.append(' '.join(words))
    # Join the capitalised lines.
    capitalised = ''.join(lines)

# Optionally, write the capitalised words back to the file.
with open('words.txt', 'w') as file:
    file.write(capitalised)

Upvotes: 3

Steven Rumbalski

Reputation: 45542

# 'words.txt' is a placeholder file name.
with open('words.txt') as text_file:
    for line in text_file:
        print(' '.join(word.title() if len(word) > 3 else word for word in line.split()))

Edit: To avoid counting punctuation, replace len with the following function:

def letterlen(s):
    return sum(c.isalpha() for c in s)
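
Putting the two pieces together, a minimal sketch might look like the following. I'm assuming Python 3 here; the helper name `title_long_words`, its `min_letters` parameter and the sample sentence are made up for illustration and are not part of the answer above.

```python
def letterlen(s):
    # Count only alphabetic characters, so "see." counts as 3, not 4.
    return sum(c.isalpha() for c in s)


def title_long_words(line, min_letters=4):
    # Hypothetical helper: title-case words that have at least min_letters letters.
    return ' '.join(
        word.title() if letterlen(word) >= min_letters else word
        for word in line.split()
    )


print(title_long_words("each word of this sentence is capitalized, you see."))
# Each Word of This Sentence is Capitalized, you see.
```

With plain len, "see." would count as four characters and come out as "See.". One thing to watch: str.title() also upper-cases the letter after an apostrophe, so "don't" becomes "Don'T".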

Upvotes: 5

inspectorG4dget

Reputation: 113975

What you really want is something called a list of stop words. In the absence of this list, you can build one yourself and do this:

skip_words = set("of is".split())
punctuation = '.,<>{}][()\'"/\\?!@#$%^&*'  # and any other punctuation that you want to strip out
answer = ""

with open('filepath') as f:
    for line in f:
        for word in line.split():
            for p in punctuation:
                # You end up losing the punctuation in the output, but this is easy to fix if you really care about it.
                word = word.replace(p, '')
            if word not in skip_words:
                answer += word.title() + " "
            else:
                answer += word + " "

print(answer)  # or you can write it to file continuously
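
If you do care about keeping the punctuation, one option is to strip it only for the stop-word lookup and title-case the original word. A rough sketch, assuming Python 3; `SKIP_WORDS` just repeats the two example words and `title_except` is a name I made up:

```python
import string

# Assumption: a tiny hand-written stop-word list; extend it as needed.
SKIP_WORDS = {"of", "is"}


def title_except(line, skip_words=SKIP_WORDS):
    out = []
    for word in line.split():
        # Strip punctuation only for the lookup, so "is," still matches "is".
        bare = word.strip(string.punctuation).lower()
        # The original word, punctuation included, goes into the output.
        out.append(word if bare in skip_words else word.title())
    return " ".join(out)


print(title_except("each word of this sentence is capitalized, is that clear?"))
# Each Word of This Sentence is Capitalized, is That Clear?
```

Note that str.strip only removes punctuation at the ends of a word, which is usually what you want for the lookup.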

Upvotes: 1

clwen

Reputation: 20909

Take a look at NLTK.

Tokenize each word, and capitalize it. Words such as 'if' and 'of' are called 'stop words'. If your only criterion is length, Steven's answer is a good way of doing so. In case you want to look up stop words, there is a similar question on SO: How to remove stop words using nltk or python.
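
To give a feel for the tokenize-then-capitalize idea without installing anything, here is a sketch that uses the standard re module as a crude tokenizer instead of NLTK. The stop-word set is a short stand-in I wrote by hand (NLTK's English stop-word corpus could be swapped in), and `capitalize_tokens` is a hypothetical name.

```python
import re

# Assumption: a stand-in list, not NLTK's actual stop-word corpus.
STOP_WORDS = frozenset({"a", "an", "and", "if", "in", "is", "of", "the", "to"})

# Crude tokenizer: a word is a run of letters, optionally with inner apostrophes.
WORD_RE = re.compile(r"[A-Za-z]+(?:'[A-Za-z]+)*")


def capitalize_tokens(text, stop_words=STOP_WORDS):
    def fix(match):
        word = match.group(0)
        if word.lower() in stop_words:
            return word
        # Upper-case only the first letter; leave the rest untouched.
        return word[0].upper() + word[1:]

    # re.sub leaves everything between tokens (spaces, newlines, punctuation) as is.
    return WORD_RE.sub(fix, text)


print(capitalize_tokens("if it's done, the rest of it is easy.\nsecond line"))
```

Because only the matched tokens are replaced, line breaks and spacing in the file survive unchanged, which the split/join approaches do not guarantee.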

Upvotes: 4

Aaron Tp

Reputation: 345

You could add all the words from the text file to a list:

words = []
with open('textdocument.txt') as f:
    for line in f:
        # Split each line into words and collect them all in one list.
        words.extend(line.split())

Once you have all the words in a list, run a for loop that checks each word's length and, if it's greater than three, title-cases the word:

new_list = []
for item in words:
    if len(item) > 3:
        # str.title() returns a new string, so append its result.
        new_list.append(item.title())
    else:
        # Words of three letters or fewer are added to the new list unchanged.
        new_list.append(item)

And see if that works?

Upvotes: 0
