Reputation: 35
I am using Pandas 0.22.0 on Python 2.7 with PyCharm as the IDE.
I am trying to sort multiple Data Frames using a Loop. These data frames were created from .csv files and then to be converted to xlsx using the 'xlsxwriter' in pandas.
I have created a sorting list that has all the sorting requirements inside it, so that when I run my loop, it will pick up a csv file, convert it to a data frame, 'sort it' (where I'm getting stuck), and then output the entire thing as a .xlsx file so it can be played around in MSEXCEL.
If I use df = df.sort_values(by=['SITE', 'DEPARTMENT', 'LOCATION', 'ASSET_TYPE', 'ASSET_NAME'])
then there are no issues.
But, if I use this: df = df.sort_values(by=sorts[0])
, the code comes crashing down.
Traceback (most recent call last):
File "D:/OneDrive/Programming Practice/Python/Rubaiyat/test1.py", line 55, in <module>
df = df.sort_values(by=(sorts[0]))
File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 3619, in sort_values
k = self.xs(by, axis=other_axis).values
File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 2335, in xs
return self[key]
File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 2139, in __getitem__
return self._getitem_column(key)
File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 2146, in _getitem_column
return self._get_item_cache(key)
File "C:\Python27\lib\site-packages\pandas\core\generic.py", line 1842, in _get_item_cache
values = self._data.get(item)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 3843, in get
loc = self.items.get_loc(item)
File "C:\Python27\lib\site-packages\pandas\core\indexes\base.py", line 2527, in get_loc
return self._engine.get_loc(self._maybe_cast_indexer(key))
File "pandas\_libs\index.pyx", line 117, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\index.pyx", line 139, in pandas._libs.index.IndexEngine.get_loc
File "pandas\_libs\hashtable_class_helper.pxi", line 1265, in pandas._libs.hashtable.PyObjectHashTable.get_item
File "pandas\_libs\hashtable_class_helper.pxi", line 1273, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: "'SITE', 'DEPARTMENT', 'LOCATION', 'ASSET_TYPE', 'ASSET_NAME'"
The entire code is as follows:
import pandas
import sys
reload(sys)
sys.setdefaultencoding('utf-8')
reportDF = ["assetReport", "assetTypeReport", "assetStatusReport", "locationReport", "departmentReport", "siteReport",
"userReport"]
sheetNames = ["Asset Report", "Asset Types", "Asset Status", "Locations", "Cost Centers", "Sites", "Users"]
columnNames = [("EPC", "Creation Date", "Modification Date", "Inventory Date", "Asset Name", "Asset Status",
"Asset Type", "Asset User", "Location", "Site", "Cost Center", "Description"),
"Asset Type Name",
("Asset Status", "Asset Status Description"),
("Location Name", "EPC", "Floor", "GPS", "Capacity", "Lead Time", "Site Name"),
"Cost Center",
("Site", "Country", "Postal Code", "City", "Address", "GPS"),
("User Name", "User Role", "First Name", "Last Name", "Email", "User Disabled?")]
sorts = ["'SITE', 'DEPARTMENT', 'LOCATION', 'ASSET_TYPE', 'ASSET_NAME'",
'ASSET_TYPE_NAME',
'ASSET_STATUS_NAME',
"'SITE_NAME', 'LOCATION_NAME'",
'DEPARTMENT_NAME',
'SITE_NAME',
'USER_NAME']
writer = pandas.ExcelWriter('mergedSheet.xlsx')
for i in range(0, 7):
df = pandas.read_csv(reportDF[i], delimiter=';')
df = df.sort_values(by=sorts[i])
df.to_excel(writer, sheet_name=sheetNames[i], engine='xlsxwriter', header=columnNames[i], freeze_panes=(1, 0))
writer.save()
writer.close()
Any help or guidance will be very much appreciated. Thank You.
Upvotes: 1
Views: 140
Reputation: 4052
you creating one string which is: "'SITE', 'DEPARTMENT', 'LOCATION', 'ASSET_TYPE', 'ASSET_NAME'"
.
I think it should look like this:
sorts = [['SITE', 'DEPARTMENT', 'LOCATION', 'ASSET_TYPE', 'ASSET_NAME'],
'ASSET_TYPE_NAME',
'ASSET_STATUS_NAME',
['SITE_NAME', 'LOCATION_NAME'],
'DEPARTMENT_NAME',
'SITE_NAME',
'USER_NAME']
Upvotes: 1