Nick

Reputation: 67

How to create a table from .txt file?

I am trying to create a DataFrame (a table with three columns) from a .txt file.

I prepared the txt file so it has the format:

Car

Audi A4 10000

Audi A6 12000

....

Bus

VW Transporter 15000

...

Camper

VW California 20000

...

This is the whole code:

cars = ""
with open("cars.txt", "r", encoding = "utf-8") as f:
    cars = f.read()
print(cars)

def generate_car_table(table):
    table = pd.DataFrame(columns = ['category', 'model','price'])
    return table

cars_table = generate_car_table(cars)

I expect a table with three columns - category, which will show whether the vehicle is car/bus/camper, model and price.

Thank you in advance!

Upvotes: 0

Views: 8405

Answers (2)

MariusG

Reputation: 86

Update:

With your comments in mind, I see that I misunderstood your question.
If your text file (cars.txt) looks as follows:

Car
Audi A4         10000
Audi A6         12000

Bus
VW Transporter  15000

Camper
VW California   20000

so that each category name is on its own line and the model and the price are separated by a tab, you could run the following code:

import pandas as pd

# Read the file; category rows contain no tab, so their Price is NaN
data = pd.read_csv('cars.txt', names=['Model', 'Price'], sep='\t')

# Transform the unstructured data: rows without a price are category headers,
# so take their text as the category and forward-fill it onto the rows below
data['Category'] = data['Model'].where(data['Price'].isna()).ffill()
data = data.dropna(subset=['Price'])

# Clean the dataframe
data = data.reset_index(drop=True)
data = data[['Category', 'Model', 'Price']]
print(data)

This results in the following table:

  Category           Model    Price
0      Car         Audi A4  10000.0
1      Car         Audi A6  12000.0
2      Bus  VW Transporter  15000.0
3   Camper   VW California  20000.0
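
If the file is separated by plain spaces rather than tabs, as the sample in the question appears to be, the tab-based approach will not split model and price. A minimal sketch in plain Python, assuming every vehicle line ends in an integer price and every other non-empty line is a category header (the function name parse_vehicle_lines is hypothetical):

```python
def parse_vehicle_lines(lines):
    """Turn category/model/price lines into (category, model, price) rows."""
    rows = []
    category = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        # Split off the last whitespace-separated token (the candidate price)
        parts = line.rsplit(None, 1)
        if len(parts) == 2 and parts[1].isdigit():
            rows.append((category, parts[0], int(parts[1])))
        else:
            # No trailing number: assume this line is a category header
            category = line
    return rows


# Inline sample standing in for the contents of cars.txt
sample = """Car
Audi A4 10000
Audi A6 12000

Bus
VW Transporter 15000

Camper
VW California 20000
"""

rows = parse_vehicle_lines(sample.splitlines())
for row in rows:
    print(row)
```

The resulting list of tuples can then be passed to pd.DataFrame(rows, columns=['category', 'model', 'price']) to get the three-column table.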

Old Answer:

Your text file needs a fixed structure (for example, all values separated by a tab or a line break). Then you can use pd.read_csv and set the separator explicitly with pd.read_csv('yourFileName', sep='yourSeparator').

For example, a tab is written as \t and a line break as \n.

The following cars.txt (link), for example, is structured using tabs and can be read with:

import pandas as pd

pd.read_csv('cars.txt', sep = '\t')

Upvotes: 1

Nicholas Domenichini

Reputation: 870

It is likely far easier to create a table from a CSV file than from a free-form text file: parsing becomes much simpler, and the file can also be viewed as a table in spreadsheet applications such as Excel.

Create the file so that it looks something like this:

category,model,price
Car,Audi A4,10000
Car,Audi A6,12000
...

Then use Python's built-in csv module to read and write the data in tabular form.
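
As a minimal sketch of that step, assuming the header row shown above: csv.DictReader uses the first line as column names. The inline string stands in for a real file (the name cars.csv in the comment is hypothetical), and only the two rows given above are included.

```python
import csv
import io

# Stand-in for: open("cars.csv", newline="", encoding="utf-8")
csv_text = (
    "category,model,price\n"
    "Car,Audi A4,10000\n"
    "Car,Audi A6,12000\n"
)

with io.StringIO(csv_text) as f:
    reader = csv.DictReader(f)  # header row supplies the dictionary keys
    records = [
        {
            "category": row["category"],
            "model": row["model"],
            "price": int(row["price"]),  # csv yields strings, so convert
        }
        for row in reader
    ]

for record in records:
    print(record)
```

Each record is a plain dictionary, so the list can be handed to pandas.DataFrame(records) if a DataFrame is still wanted.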

Upvotes: 0
