Reputation: 71
I posted part of this question couple of days ago with a good answer but that solved just part of my problem. So, I have a excel file on which needs to be done some data mining and after that needs to get out another excel file with the same format .xlsx The problem is that I get a strange column after I write the file, which cannot be seen before the writing using Anaconda. And that makes it harder to develop a strategy to counter it's appearance. initially I though I solved the problem by reducing the width to 0 but apparently at some point the file needs to be converted in text and then the columns reappears. For more details here is part of my code:
import os
import pandas as pd
import numpy as np
import xlsxwriter
# Retrieve current working directory (`cwd`)
cwd = os.getcwd()
cwd
# Change directory
os.chdir("/Users/s7c/Documents/partsstop")
# Assign spreadsheet filename to `file`
file = 'file = 'SC daily inventory retrieval columns for reports'.xlsx
# Load spreadsheet
xl = pd.ExcelFile(file)
# Load a sheet into a DataFrame by name: df
df = xl.parse('Sheet1')
#second file code:
#select just the columns we need and rename them:
df2 = df.iloc[:, [1, 3, 6, 9]]
df2.columns = ['Manufacturer Code', 'Part Number', 'Qty Available', 'List Price']
#then select just the rows we need:
df21 = df2[df2['Manufacturer Code'].str.contains("DRP")]#13837 entries
#select just the DRP, first 3 characters and dropping the ones after:
df21['Manufacturer Code'] = df21['Manufacturer Code'].str[:3]
#add a new column:
#in order to do that we need to convert the next column to numeric:
df21['List Price'] = pd.to_numeric(df21['List Price'], errors='coerce')
df21['Dealer Price'] = df21['List Price'].apply(lambda x: x*0.48) #new column equals half of other column
writer = pd.ExcelWriter('example2.xlsx', engine='xlsxwriter')
# Write your DataFrames to a file
df21.to_excel(writer, 'Sheet1')
The actual view of the problem:
Any constructive idea is appreciated. Thanks!
Upvotes: 2
Views: 1838
Reputation: 3461
This column seems to be the index of your DataFrame. You can exclude it by passing index=False
to to_excel().
Upvotes: 4