Reputation: 67
I am trying to create a dataframe (a table with three columns) from a .txt file.
I prepared the txt file so it has the format:
Car
Audi A4 10000
Audi A6 12000
....
Bus
VW Transporter 15000
...
Camper
VW California 20000
...
This is the whole code:
import pandas as pd

cars = ""
with open("cars.txt", "r", encoding="utf-8") as f:
    cars = f.read()
print(cars)

def generate_car_table(table):
    table = pd.DataFrame(columns=['category', 'model', 'price'])
    return table

cars_table = generate_car_table(cars)
I expect a table with three columns - category, which will show whether the vehicle is car/bus/camper, model and price.
Thank you in advance!
Upvotes: 0
Views: 8405
Reputation: 86
Having your comments in mind, I see that I misunderstood your question.
If your text file (cars.txt) looks as follows:
Car
Audi A4 10000
Audi A6 12000
Bus
VW Transporter 15000
Camper
VW California 20000
so that every category is followed by a line break and the model and the price are separated by a tab, you could run the following code:
import pandas as pd
# Read the file; category lines have only one field, so their Price is NaN
data = pd.read_csv('cars.txt', names=['Model', 'Price'], sep='\t')
# Transform the unstructured data
# Take the category name from the lines that have no price
data['Category'] = data['Model'].where(data['Price'].isna())
# Propagate each category down to the vehicle rows below it
data['Category'] = data['Category'].ffill()
# Drop the category-only lines
data = data.dropna(subset=['Price'])
# Clean the dataframe
data = data.reset_index(drop=True)
data = data[['Category', 'Model', 'Price']]
print(data)
This results in the following table:
  Category           Model    Price
0      Car         Audi A4  10000.0
1      Car         Audi A6  12000.0
2      Bus  VW Transporter  15000.0
3   Camper   VW California  20000.0
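If the model and the price are separated by spaces rather than tabs, as the sample in the question suggests, sep='\t' will not split them. A minimal sketch of a line-by-line parse for that case follows. It assumes that the price is always the last whitespace-separated token on a vehicle line and that a category line never ends in a number; the function name parse_vehicle_lines is hypothetical.

```python
def parse_vehicle_lines(text):
    """Turn category/model/price lines into a list of row dicts."""
    rows = []
    category = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split off the last whitespace-separated token (space or tab).
        parts = line.rsplit(None, 1)
        if len(parts) == 2 and parts[1].isdigit():
            # Vehicle line: everything before the last token is the model.
            rows.append(
                {"category": category, "model": parts[0], "price": int(parts[1])}
            )
        else:
            # Assumption: any line that does not end in a number is a category.
            category = line
    return rows


sample = (
    "Car\nAudi A4 10000\nAudi A6 12000\n"
    "Bus\nVW Transporter 15000\n"
    "Camper\nVW California 20000\n"
)
rows = parse_vehicle_lines(sample)
```

The resulting list can then be passed to pd.DataFrame(rows, columns=['category', 'model', 'price']) to get the three-column table.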
Your text file needs a fixed structure (for example, all values are separated by a tab or a line break).
Then you can use the pd.read_csv method and define the separator by hand with pd.read_csv('yourFileName', sep='yourSeparator').
Tabs are \t and line breaks are \n, for example.
A cars.txt file that is structured using tabs, for example, can be read with:
import pandas as pd
pd.read_csv('cars.txt', sep = '\t')
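To try the separator argument without a file on disk, here is a small sketch that reads tab-separated text from memory. The sample content and its header row are made up for illustration; they are not the contents of the cars.txt file mentioned above.

```python
import io

import pandas as pd

# Hypothetical tab-separated content standing in for cars.txt;
# the first line is used as the header row.
text = (
    "category\tmodel\tprice\n"
    "Car\tAudi A4\t10000\n"
    "Bus\tVW Transporter\t15000\n"
)

# read_csv accepts any file-like object, so StringIO works like an open file.
df = pd.read_csv(io.StringIO(text), sep="\t")
print(df)
```

Because only the tab is treated as a separator, model names that contain spaces stay in one column.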
Upvotes: 1
Reputation: 870
It is likely far easier to create the table from a CSV file than from a free-form text file: parsing becomes much simpler, and the file can also be viewed as a table in spreadsheet applications such as Excel.
You create the file so that it looks something like this:
category,model,price
Car,Audi A4,10000
Car,Audi A6,12000
...
And then use the csv module to easily read and write the data in tabular form.
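As a rough sketch of that approach, assuming the Python standard-library csv module is meant, the rows can be read with csv.DictReader. The in-memory text below stands in for a real file, and the file name cars.csv in the comment is an assumption.

```python
import csv
import io

# In-memory stand-in for the CSV file; with a real file, use
# open("cars.csv", newline="", encoding="utf-8") instead (file name assumed).
csv_text = "category,model,price\nCar,Audi A4,10000\nCar,Audi A6,12000\n"

# DictReader uses the first line as the field names for every row.
reader = csv.DictReader(io.StringIO(csv_text))
records = [
    {"category": r["category"], "model": r["model"], "price": int(r["price"])}
    for r in reader
]
print(records)
```

If pandas is available anyway, pd.read_csv('cars.csv') reads the same layout directly into a dataframe.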
Upvotes: 0