Reputation: 383
Using Python 3, I need to read email files from a directory and strip the HTML tags from them.
I have managed to do this to a large extent. When I try to read the content of my output, I get an error:
for line in output.splitlines():
AttributeError: 'int' object has no attribute 'splitlines'
for file in glob.glob('spam/*.*'):
    output = os.system("python html2txt.py " + file)
    for line in output.splitlines():
        print(line)
When I print output, it shows the filtered text. Any help is appreciated.
Upvotes: 0
Views: 79
Reputation: 2282
On Unix, the return value is the exit status of the process encoded in the format specified for wait(). Note that POSIX does not specify the meaning of the return value of the C system() function, so the return value of the Python function is system-dependent.
On Windows, the return value is that returned by the system shell after running command. The shell is given by the Windows environment variable COMSPEC: it is usually cmd.exe, which returns the exit status of the command run; on systems using a non-native shell, consult your shell documentation. (Source: the Python docs for os.system.)
So your output variable is an integer (the exit status), not the text produced by the html2txt.py script.
And why do you run another Python script outside of your current process? Can't you just import whatever class or function does the job from that module?
Also, there is an email module in the standard library that can help you.
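As a rough illustration of the email module suggestion, the sketch below parses a raw message and collects its text/html parts. The html_parts helper and the sample message are made up for this example; they are not taken from the question's files.

```python
from email import message_from_string, policy


def html_parts(raw_message):
    """Return the decoded body of every text/html part in a raw email."""
    # policy.default makes the parser return EmailMessage objects,
    # which provide get_content() for decoded text.
    msg = message_from_string(raw_message, policy=policy.default)
    return [
        part.get_content()
        for part in msg.walk()
        if part.get_content_type() == "text/html"
    ]


# Hypothetical sample message, used only for demonstration.
sample = (
    "Subject: hello\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/html; charset=utf-8\n"
    "\n"
    "<p>Hi <b>there</b></p>\n"
)

for body in html_parts(sample):
    print(body)
```

For files on disk, email.message_from_binary_file with the same policy is probably the more convenient entry point, since it avoids guessing the file encoding.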
Upvotes: 0
Reputation: 5993
The return value of os.system(command) is system-dependent. It is supposed to be the (encoded) process exit status, which is represented by an int. From the Python documentation:
On Unix, the return value is the exit status of the process encoded in the format specified for wait(). Note that POSIX does not specify the meaning of the return value of the C system() function, so the return value of the Python function is system-dependent.
On Windows, the return value is that returned by the system shell after running command, given by the Windows environment variable COMSPEC: on command.com systems (Windows 95, 98 and ME) this is always 0; on cmd.exe systems (Windows NT, 2000 and XP) this is the exit status of the command run; on systems using a non-native shell, consult your shell documentation.
But on no system does it return a str, and splitlines() is a str method.
You are calling a str method on an int, which is why you get the error:
AttributeError: 'int' object has no attribute 'splitlines'
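If the script has to stay a separate process, one way to get its printed text back as a str is subprocess.run with captured output. This is a minimal sketch: run_and_capture is a name invented here, and the inline command is only a stand-in for something like [sys.executable, "html2txt.py", file].

```python
import subprocess
import sys


def run_and_capture(args):
    """Run a command and return its standard output as a list of lines."""
    # capture_output=True collects stdout/stderr instead of letting them go
    # to the terminal; text=True decodes them to str; check=True raises
    # CalledProcessError if the command exits with a non-zero status.
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


# Stand-in command so the sketch runs without html2txt.py being present.
lines = run_and_capture([sys.executable, "-c", "print('first'); print('second')"])
for line in lines:
    print(line)
```

Passing the command as a list also avoids the quoting problems that string concatenation causes when a file name contains spaces.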
Upvotes: 0
Reputation: 690
Try this as a replacement for the code you've provided:
import glob
files = glob.glob('spam/*.*')
for f in files:
    with open(f) as spam_file:
        for line in spam_file:
            print(line)
If the files are indeed html files, I would recommend looking into BeautifulSoup.
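BeautifulSoup is a third-party package, so as a dependency-free stand-in here is a small sketch that strips tags with the standard library's html.parser instead. TextExtractor and strip_tags are names invented for this example, and unlike a fuller solution it also keeps the text inside script and style elements.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect the text between tags and ignore the tags themselves."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called for each run of text found outside of tags.
        self.chunks.append(data)


def strip_tags(html):
    """Return the text content of an HTML string with all tags removed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()  # flush any text still buffered by the parser
    return "".join(parser.chunks)


print(strip_tags("<p>Hi <b>there</b></p>"))
```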
Upvotes: 1