Reputation: 246
I want to get the length of each line of a Java file using Python, ignoring spaces and other whitespace. I would eventually put each line length into an array. Take this Java file:
public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello world");
    }
}
The array for this file would read:
{22, 35, 33, 1, 1}
What is the best way to go about this? I am more than capable of creating the array, but how can I find the length of each line?
Upvotes: 0
Views: 226
Reputation: 129
This should work:
with open('input') as f:
    output = []
    for line in f:
        line = line.split()  # split on any run of whitespace
        if line != []:       # skip blank lines
            line = "".join(line)
            output.append(len(line))
print(output)
Upvotes: 1
Reputation: 122059
The {a, b, c} notation in Python is a set, which you don't want (no duplicate items allowed); try a list, [a, b, c].
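A quick sketch of why a set is the wrong container here, using the line lengths for the sample file: the two closing-brace lines both have length 1, and a set collapses them into one entry. The variable names are only for illustration.

```python
# Line lengths for the sample file; the last two lines are both "}".
lengths_list = [22, 35, 33, 1, 1]
lengths_set = {22, 35, 33, 1, 1}

print(len(lengths_list))  # 5: one entry per line, in file order
print(len(lengths_set))   # 4: the duplicate 1 is dropped
```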
To remove whitespace from the start and end of each line, use str.strip(); this removes tabs, spaces, and newlines at the start and end of the line. To remove spaces from inside the line, use str.replace(' ', ''). Once you have stripped the extra characters, the length of the line is simply len(line).
You can use a list comprehension to create the list in one step, for file f:
output = [len(l) for l in (line.strip().replace(' ', '')
                           for line in f) if len(l) > 0]
This gives me [22, 35, 33, 1, 1].
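One caveat: str.replace(' ', '') only removes space characters, so a tab in the middle of a line would still be counted. Below is a minimal sketch that drops every whitespace character, assuming that is what "ignoring white space" means in the question. The helper name line_lengths is hypothetical, and io.StringIO stands in for the open file so the example runs on its own.

```python
import io

def line_lengths(lines):
    """Return the non-whitespace length of each non-blank line."""
    lengths = []
    for line in lines:
        # split() with no arguments splits on any run of whitespace
        # (spaces, tabs, newlines), so joining the pieces removes all of it.
        compact = "".join(line.split())
        if compact:  # skip lines that were empty or only whitespace
            lengths.append(len(compact))
    return lengths

# io.StringIO stands in for an open file here.
sample = io.StringIO(
    'public class HelloWorld {\n'
    '    public static void main(String[] args) {\n'
    '        System.out.println("Hello world");\n'
    '    }\n'
    '}\n'
)
print(line_lengths(sample))  # [22, 35, 33, 1, 1]
```

Passing any iterable of strings works, so the same function accepts a real file object from open().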
Upvotes: 2
Reputation: 3417
Focusing on the part of your question "how can I go about finding the length of each line?", you can use this code.
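bad_chars = ' \t\n\r'

def count_chars():
    with open('someclass.java', 'r') as javafile:
        for line in javafile:
            # A list (rather than filter()) keeps this working on Python 3,
            # where filter() returns an iterator that has no len().
            cleaned = [c for c in line if c not in bad_chars]
            if cleaned:
                yield len(cleaned)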
You definitely can and should refactor this to meet your needs (perhaps taking the Java filename as a function argument), but this should give you non-whitespace counts, where whitespace means the characters in bad_chars.
Returns:
>>> print(list(count_chars()))
[22, 35, 33, 1, 1]
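The refactor mentioned above could look roughly like the sketch below. The parameters path and bad_chars are assumptions added for illustration, and the default of string.whitespace is slightly broader than the original bad_chars (it also covers vertical tab and form feed).

```python
import string

def count_chars(path, bad_chars=string.whitespace):
    """Yield the length of each non-blank line in `path`, ignoring `bad_chars`."""
    with open(path, 'r') as source:
        for line in source:
            # Keep only the characters that are not in bad_chars.
            cleaned = [c for c in line if c not in bad_chars]
            if cleaned:  # skip lines that contained nothing but bad_chars
                yield len(cleaned)
```

Calling list(count_chars('someclass.java')) would then collect the counts into a list, as in the session above.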
Upvotes: 2
Reputation: 1490
Do something like this:
for line in open('file.java', 'r'):
    # strip() drops the trailing newline, which would otherwise be counted
    lineLength = len(line.strip().replace(' ', ''))
Upvotes: 1