Reputation: 1
I was having some trouble fully utilizing panda for a work project. In short I have to take all the data from excel and format it in a way another device can properly read. I have already shoved all the columns that I need into a data buffer, however I need to check one of the data buffer and print a different command ( IE in that column it says healthy: I would need to print HEAL first in the output file, if it says unhealthy i would need to print UN however if it says anything else I would need to delete the row from data, Also healthy/ unhealthy wont be there exclusively they may have other words but the key part I am looking for is if those pieces are contained.) I will attach example excel/output text that I am looking at.
So far in my code I have isolated the columns I want and have skipped the extra white space rows that the excel file will have and I also terminated with a semicolon and new line character.
import pandas as pd
import numpy as np
#file_name = input("Please input a file to read. It should have a name like File.xlsm\n")
file_name = "file.xlsm"
# maybe add a part where if it fails you ask the user again
read_file = pd.read_excel(file_name, sheet_name = 0, header = 0, index_col = 0, usecols = [" Name", "Comment", "Price", "category", "data to change"], skiprows = 15) # sheet is equal to 0 by default os it will do the first one
#print("\n")
#print(read_file)
# search fe
#Fruit Name | Comment | Price | Category | Aisle# / data
#for index, row in read_file.iterrows(): trying to find if healthy or unhealthy or to remove row
# if cell = Dgiit\
#read_file["Fruit Name"] = read_file["Fruit Name"].str.lower() #broken. tring to get name in to paranthees and all lower case. APPLE -> "apple"
#drop_val = #!digital / supply
#read_file = read_file[~read_file['A'].isin(drop_val)] ! ( unhealty * | *Healthy )
# saving to a text file
read_file.to_csv('input2.txt', sep = '\t', line_terminator = ';\n') # saves data frame to tab seperated text file. need to find out how to have semi colons at the end.
After I check whether the item is on of two wanted categories ( everything besides the wanted categories need to be row removed ) I need to make the first column be a command.
Heres an example of the final output
HEALTHY "bannana" "Aisle#-storename" ; // the comment I need from the comment box //(the number comes from data that needs to be manipulated tab, it has some exess info and things i need to conver)
HEALTHY "orange" "Aisle#-storename"; // what came first the color or the fruit. is the fruit named after the color or the color after the fruit
UNHEALTHY "cupcake" "Aisle#-storename"; // not good for you but maybe for the sould
UNHEALTHY "pizza" "Aisle#-storename";
Here is what I am getting
Name Comment Price Category Data;
BANANNA x x x x ;
APPLE x x x x;
ORANGE x x x x ;
in the text file its not quite aligned and id like for it to be more structed. It has to a text file because the machine reads a text
My biggest question is how do I read the the 2nd to last category on the right check whether to remove the row and print some command on the LEFT most space of the text file.
( also i need to do price on teh second pass for the items i care about, i have to generate a separate part of the file. )
Also for data that needs to change I have to read the first number after some IE SHELF323 the first 3 needs to be put in the table that i know and converted to a physical address and the 23 is like the number of rows the shelf has. and these need to get printed to the final txt in a format.
LMK if i can clarify anything. My python skills aren't amazing but I am trying to get this done.
The purpose of this is to automate teh reading of an excel file and convert it to a txt file that can be read by a specific machine.
Upvotes: 0
Views: 2454
Reputation: 341
Try this code see if it works. Below code will convert each of your Excel tab to txt file with a | delimited.
import pandas as pd
sheets_dict = pd.read_excel(r'C:\my_file.xlsx', sheet_name=None)
ExcelSheet = pd.DataFrame()
for name, sheet in sheets_dict.items():
sheet.to_csv (name+'.txt', '|', index = None, header=True)
Upvotes: 1