Reputation: 30687
I'm trying to read all lines of a file that are either all upper or all lower case.
If file.txt
contains:
Rememberr 8? when you
. Tellingmee THAT one
didntrememberthat
onethingtoday
I would want it to read:
didntrememberthat
ONETHINGTODAY
So far I have:
def function(file_name):
data=[]
f=open(file_name,'r')
lines=f.readlines()
for k in lines:
single=tuple(k)
for j in single:
j=str(j)
if j.isupper() or j.islower() == False:
lines=f.readlines()[j:]
Then i get this:
lines=f.readlines()[j:]
TypeError: slice indices must be integers or None or have an __index__ method
which makes sense because j
isn't an integer. But how would i find the position of j
when it encounters the if
-statement?
If there is an easier way to do it that would be awesome
Upvotes: 0
Views: 2527
Reputation: 1058
def homogeneous_lines (file_name):
with open(file_name) as fd:
for line in fd:
if line.islower() or line.isupper():
yield line
This function reads through every line in the file. Then for every line checks to see if the line is all upper or lower case.
Finally we yield the line.
EDIT - Changed to incorporate using with statement, use builtin islower() and isupper() for strings, and made into a generator.
Upvotes: 2
Reputation: 36767
I'd use list comprehensions:
f=open(file_name,'r')
lines = f.readlines()
ul_lines = [line.rstrip('\n') for line in lines if line.islower() or line.isupper()]
You should tweak it if your file is unicode, but that's the general idea.
The rstrip part is to get rid of '\n' at the end.
Also, more memory efficient version
f=open(file_name,'r')
ul_lines = [line.rstrip('\n') for line in f if line.islower() or line.isupper()]
You'll have to reopen the file to repeat it though.
If you are very memory constrained you should use a generator expression:
f=open(file_name,'r')
ul_lines_gen = (line.rstrip('\n') for line in f if line.islower() or line.isupper())
And if you want only letters and no digits in your strings add line.rstrip('\n').isalpha() condition.
Upvotes: 1
Reputation: 185862
You are getting the error because j is a string, not an integer (you don't have to call str(j)
, by the way; it's already a string).
You can remove lines with a mix of upper- and lower-case like this:
all_one_case = [ line
for line in f.readlines()
if line.isupper() or line.islower() ]
Note: Credit for the use of isupper()
and islower()
(The original used re
.) goes to some other answers to this question.
This will also include, e.g., 10 green bottles
, since it only contains lowercase letters, even though it also contains digits and spaces. From the question, I can't tell whether this is the intent or not. If you want only letters, you can use this test instead:
… if re.match('[A-Z]*$|[a-z]*$', line) ]
If you want to replace the file with these lines, you can reopen it for writing:
with open(file_name, 'r') as f:
for line in all_one_case:
f.write(line)
Upvotes: 5
Reputation: 32522
use the with
statement for open
the file. This way you are be safe that the file will get close
d even in case of an exception.
Use the string methods islower
and isupper
to check whether the string is all upper or lower case. For example like this:
with open(filename) as f:
output = [line for line in f if line.isupper() or line.islower()]
Upvotes: 1
Reputation: 43096
f=open(file_name,'r')
print [l for l in f.readlines() if l.islower() or l.isupper()]
Upvotes: 1
Reputation: 54302
If in one line you can have only one symbol then I do not know why do you convert line (k
) into tuple, and 2nd call to f.readlines()
is probably bug. After 1st call of f.readlines()
you have all lines in lines
variable and in loop you can check it line by line.
If you want to check is whole string is lowercase or uppercase, then use such code:
if line.islower() or line.isupper():
print(line)
Upvotes: 1