nikki

Reputation: 353

How to perform data cleaning for a text file?

I have a text file with many lines containing words and numbers. Here is an example:

2021-12-06 05:07:09.266 INFO: Additional  ID 1638301749791
2021-12-06 05:07:09.266 INFO: Found 
2021-12-06 05:07:09.267 INFO: ObjectStatus-ok factor 1163 factor five and six computed as it was before best weight ID 1638301749796
2021-12-06 05:07:09.267 INFO: disabled; computing power weight factor factor 19025.
2021-12-06 05:07:10.041 INFO: Wrote big factor 0.3568357342, Classificationfactortype-fail
2021-12-06 05:07:10.042 DEBUG: Duiu.0.0.2588650814
2021-12-06 05:07:10.743 INFO: Wrote .3254806495

My question is: how can I keep only the lines that contain a particular word, either "Classificationfactortype-fail" or "ObjectStatus-ok", and delete all other lines? I would like to save the result as a new text file in the directory.

Here is the code that I wrote:

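ans = []

# Collect only the lines that contain one of the two keywords
with open('test.txt') as rf:
    for line in rf:
        line = line.strip()
        if "Classificationfactortype-fail" in line or "ObjectStatus-ok" in line:
            ans.append(line)

# strip() removed the trailing newlines, so add one back after each line
with open('extracted_data.txt', 'w') as wf:
    for line in ans:
        wf.write(line + '\n')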

Upvotes: 0

Views: 128

Answers (1)

Cinematic Galaxy Ita

Reputation: 82

Since each line starts with a timestamp, str.startswith() won't work: it only checks the beginning of the string, and your keywords appear later in the line.
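A quick check on one of the sample lines from the question shows the difference. This is only a minimal sketch; the variable names are made up for illustration:

```python
# One of the sample log lines from the question
line = "2021-12-06 05:07:10.041 INFO: Wrote big factor 0.3568357342, Classificationfactortype-fail"
keyword = "Classificationfactortype-fail"

# startswith() only inspects the beginning of the string, which is the timestamp
starts_with_keyword = line.startswith(keyword)   # False

# the "in" operator looks for the keyword anywhere in the line
contains_keyword = keyword in line               # True

print(starts_with_keyword, contains_keyword)
```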

You can simply do:

if "Classificationfactortype-fail" in line or "ObjectStatus-ok" in line:
    ans.append(line)

in your first loop.
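Putting it together, the whole task could look roughly like the sketch below. The function name filter_lines and the KEYWORDS tuple are hypothetical names chosen for this example; the file names are assumed to match the ones in the question. It writes each matching line as it is read, so the list is not needed, and it makes sure every kept line ends with a newline.

```python
# Hypothetical helper; the keywords are the two from the question
KEYWORDS = ("Classificationfactortype-fail", "ObjectStatus-ok")

def filter_lines(src_path, dst_path, keywords=KEYWORDS):
    """Copy lines containing any keyword from src_path to dst_path.

    Returns the number of lines that were kept.
    """
    kept = 0
    with open(src_path) as rf, open(dst_path, "w") as wf:
        for line in rf:
            # substring test, so the leading timestamp does not matter
            if any(keyword in line for keyword in keywords):
                # normalise so each kept line ends with exactly one newline
                wf.write(line.rstrip("\n") + "\n")
                kept += 1
    return kept
```

Calling filter_lines("test.txt", "extracted_data.txt") should then leave only the matching lines in extracted_data.txt, assuming test.txt is in the current working directory.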

Upvotes: 2
