Python Create Pandas Dataframe From txt file

Question

I have a file.txt like this, and I want to create a DataFrame from it with pandas.

# NISN:- 1234567
# FullName:- Joe Doe
# FirstName:- Joe
# LastName:- Doe
# School:- Klima
# E-mail:- joe@gmail.com

# NISN:- 8901234
# FullName:- Jenny Low
# FirstName:- Jenny
# LastName:- Low
# School:- Kimcil
# E-mail:- jenny@gmail.com

How can I convert this into a DataFrame like the one below?

NISN     FullName   FirstName  LastName  School  E-mail
1234567  Joe Doe    Joe        Doe       Klima   joe@gmail.com
8901234  Jenny Low  Jenny      Low       Kimcil  jenny@gmail.com

Edit

I found some bad lines in the file, where a value is broken across several lines. How can I handle this?

# NISN:- 123456

7
# FullName:- Joe Doe
# FirstName:- Joe
# LastName:- Doe
# School:- Klima
# E-mail:- joe@gmail.com



# NISN:- 8901234
# FullName:- Jenny Low
# FirstName:- Jenny
# LastName:- Low
# School:- Kimc

il
# E-mail:- jenny@gmail.com

Cameron Riddell · Accepted Answer

You can iterate over the lines of your file in Python and collect the relevant data in a dictionary before converting it to a DataFrame:

import pandas as pd
from collections import defaultdict

data = defaultdict(list)
with open("file.txt") as my_file:
    for line in my_file:
        line = line.strip("# \n")        # strip whitespace and the leading "#"
        if not line:                     # skip empty lines
            continue

        name, value = line.split(":- ") 
        data[name].append(value)
    
df = pd.DataFrame.from_dict(data)

print(df)
      NISN   FullName FirstName LastName  School           E-mail
0  1234567    Joe Doe       Joe      Doe   Klima    joe@gmail.com
1  8901234  Jenny Low     Jenny      Low  Kimcil  jenny@gmail.com
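
For the bad lines shown in the edit, one possible approach is to treat any non-empty line that does not start with "#" as a continuation of the previous value. The sketch below is a variation on the loop above, not a tested solution for your real file: the helper name parse_lines is made up for illustration, and it assumes that broken pieces should be joined back together with no separator (so "123456" and "7" become "1234567"). The sample text is inlined through io.StringIO so the snippet runs on its own; with a real file you would pass the open file object instead.

```python
import io
from collections import defaultdict

import pandas as pd


def parse_lines(lines):
    """Hypothetical helper: build a DataFrame from '# Name:- value' lines,
    gluing wrapped fragments back onto the previous value."""
    data = defaultdict(list)
    last_name = None                      # field that received the latest value
    for line in lines:
        line = line.strip()
        if not line:                      # skip empty lines
            continue
        if line.startswith("#"):
            # a regular "# Name:- value" line
            name, _, value = line.lstrip("# ").partition(":-")
            name = name.strip()
            data[name].append(value.strip())
            last_name = name
        elif last_name is not None:
            # assumption: a line without "#" continues the previous value
            data[last_name][-1] += line
    return pd.DataFrame.from_dict(data)


sample = """\
# NISN:- 123456

7
# FullName:- Joe Doe
# FirstName:- Joe
# LastName:- Doe
# School:- Klima
# E-mail:- joe@gmail.com



# NISN:- 8901234
# FullName:- Jenny Low
# FirstName:- Jenny
# LastName:- Low
# School:- Kimc

il
# E-mail:- jenny@gmail.com
"""

df = parse_lines(io.StringIO(sample))
print(df)
```

With the sample from the edit, this should give the same two rows as the output above. Note that if a value was broken at a space (for example in the middle of a full name), joining without a separator would lose that space, so you may need to adjust the continuation rule for your data.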
