dnisqa2 delta

Reputation: 161

How to search for a string in an HTML file?

I'm a beginner in Python.
I'm trying to search an HTML file for two keywords, Error and Report, using an if statement. For example, the HTML file below contains Error:

<HEAD><STYLE TYPE="text/css">
.MSG_OK      { color:white; }
.MSG_SUCCESS { color:green; }
.MSG_WARNING { color:yellow; }
.MSG_ERROR   { color:red; }
.MSG_DEBUG   { color:blue; }
body         { background-color:black; }
</STYLE></HEAD>
<body><pre>
<span class=MSG_OK>Reserving ports for the test
</span><span class=MSG_OK>ABC test...
</span><span class=MSG_ERROR>Error: xxx resource is already in use.
 Error with xxx....
</span><span class=font>(A)bort, (R)etry, (I)gnore?</span>

I used read() on the file object, but it doesn't work. My code:

    html_path = "D:\\abcd.html"

    with open(html_path) as html_file:
        print(html_file.read())

        #for line in html_file.read():
        if "Error" in html_file.read():
            print("[error occur")
            html_file.close()

        elif "Report" in html_file.read():
            print("get result")
            html_file.close()
        else:
            print("[[[[nothing]]]]")

I always get the result:

<HEAD><STYLE TYPE="text/css">
.MSG_OK      { color:white; }
.MSG_SUCCESS { color:green; }
.MSG_WARNING { color:yellow; }
.MSG_ERROR   { color:red; }
.MSG_DEBUG   { color:blue; }
body         { background-color:black; }
</STYLE></HEAD>
<body><pre>
<span class=MSG_OK>Reserving ports for the test
</span><span class=MSG_OK>ABC test...
</span><span class=MSG_ERROR>Error: xxx resource is already in use.
 Error with xxx....
</span><span class=font>(A)bort, (R)etry, (I)gnore?</span>
[[[[nothing]]]]

It seems the two keywords Error and Report can't be found by my if statement, so I always get the result [[[[nothing]]]]. Can someone correct my code and explain the reason? Many thanks.

Upvotes: 3

Views: 3294

Answers (1)

Ignacio Vazquez-Abrams

Reputation: 798716

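You only get to read a file once, unless you go back to the beginning. Your print(html_file.read()) call consumes the whole file, so each later html_file.read() returns an empty string and neither keyword can be found in it. But this isn't how you want to do it regardless.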
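To see the read-once behaviour in isolation, here is a minimal sketch. It assumes an in-memory `io.StringIO` object as a stand-in for the opened HTML file (an illustration choice, not part of the original code); a real file object opened with `open()` tracks its position the same way.

```python
import io

# io.StringIO stands in for the opened HTML file (assumption for illustration).
html_file = io.StringIO(
    "<span class=MSG_ERROR>Error: xxx resource is already in use.\n</span>"
)

first = html_file.read()    # consumes everything; the position is now at the end
second = html_file.read()   # nothing is left to read, so this is an empty string

html_file.seek(0)           # go back to the beginning
third = html_file.read()    # the full contents are available again

print("Error" in first)     # True
print("Error" in second)    # False
print("Error" in third)     # True
```

This is why the question's code falls through to the else branch: the keyword checks run against the empty string that the second and third read() calls return.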

Iterate the file line-by-line, checking for your conditions.

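    html_path = "D:\\abcd.html"  # path taken from the question

    error = False
    report = False

    with open(html_path) as html_file:
        for line in html_file:
            print(line, end="")  # each line already ends with a newline
            if "Error" in line:
                error = True
            if "Report" in line:
                report = True

    # Decide once the whole file has been scanned; an error takes priority.
    if error:
        print("error")
    elif report:
        print("result")
    else:
        print("nothing")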
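If the file is small enough to hold in memory, another option is to call read() exactly once, keep the result in a variable, and test that variable as often as needed. The sketch below is only an illustration: `classify` is a hypothetical helper name, and the sample string is copied from the question's HTML rather than read from disk.

```python
def classify(text):
    """Return 'error', 'result' or 'nothing' depending on which keyword appears.

    Hypothetical helper for illustration; 'Error' takes priority over 'Report',
    mirroring the if/elif order in the question.
    """
    if "Error" in text:
        return "error"
    if "Report" in text:
        return "result"
    return "nothing"


# Sample text copied from the question's HTML; with a real file you would
# instead do: with open(html_path) as html_file: contents = html_file.read()
contents = "<span class=MSG_ERROR>Error: xxx resource is already in use.\n</span>"

status = classify(contents)
print(status)  # error
```

Because the contents are stored once, the keyword checks no longer depend on the file position.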

Upvotes: 2
