jfleet_uk
jfleet_uk

Reputation: 1

Splitting up a CSV into separate excel files based on a columns values, then change formatting of excel before saving

Hi I am vey new to python but have been tasked with creating a tool that does the following: 1) opens a csv file 2) splits the data frame up by the values of a single column 3) It will then save those groupings to individual excel workbooks and manipulate the formatting (may add a chart to one of the worksheets based on the newly added Data)

I have found this code, which groups and saves to csv. I can change to the excel format, but I’m really struggling I do the formatting and chart bit. Any help would be very appreciated.

gp = df.groupby('CloneID')
for g in gp.groups:
    path = 'CloneID' + str(g) + '.txt'
    gp.get_group(g).to_csv(path)

Upvotes: 0

Views: 132

Answers (1)

Richard Stoeffel
Richard Stoeffel

Reputation: 695

One easy way to create nicely formatted excel sheets is to pre-format a template, and use openpyxl to fill in the rows as you need.

At a high level, your project should include a template, which will be an xlsx file (excel). If you named your project my_project, for example, the structure of your project should look like this:

my_project
--__init__.py
--templates
----formated_excel.xlsx
--main.py

where templates is a directory, formatted_excel is an xlsx file, and main.py is your code.

In main.py, the basic logic of your code would work like this:

import os
import openpyxl

TEMPLATE = os.path.join(os.path.dirname(os.path.abspath(__file__)),
                                    'templates', 'formated_excel.xlsx')

wb = openpyxl.load_workbook(TEMPLATE)
# to use wb[VALUE], your template must have a sheet called VALUE
data_sheet = wb['Data'] 

# have enumerate start at 2, as in most cases row 1 of a sheet 
# is the header
for row, value in enumerate(data, start=2): 
    data_sheet[f'A{row}'] = value

wb.save('my_output.xlsx')

This example is a very, very basic explanation of how to use openpyxl.

Note that I've assumed you are using python3, if not, you'll have to use the appropriate string formatting when setting the data_sheet row that you are writing to. Openpyxl also has Chart Support, which you can read up on to hep you in formatting your chart.

You haven't provided much detail in exactly what you want to do or the data you are using, so you will have to extend this example to fit your dataset.

Upvotes: 1

Related Questions