gvak6421
gvak6421

Reputation: 1

How to read specific columns in an xlsb in Python

I'm trying to read spreadsheets in an xlsb file in python and I've used to code below to do so. I found the code in stack overflow and I'm sure that it reads every single column in a row of a spreadsheet and appends it to a dataframe. How can I modify this code so that it only reads/appends specific columns of the spreadsheet i.e. I only want to append data in columns B through D into my dataframe.

Any help would be appreciated.

import pandas as pd
from pyxlsb import open_workbook as open_xlsb

df = []

with open_xlsb('some.xlsb') as wb:
    with wb.get_sheet(1) as sheet:
        for row in sheet.rows():
            df.append([item.v for item in row])

df = pd.DataFrame(df[1:], columns=df[0])

Upvotes: 0

Views: 3801

Answers (2)

JPS
JPS

Reputation: 11

Just do:

import pandas as pd
from pyxlsb import open_workbook as open_xlsb

df = []

with open_xlsb('some.xlsb') as wb:
    with wb.get_sheet(1) as sheet:
        for row in sheet.rows():
            df.append([item.v for item in row if item.c > 0 and item.c < 4])

df = pd.DataFrame(df[1:], columns=df[0])

item.c refers to the column number starting at 0

Upvotes: 1

Katekarin
Katekarin

Reputation: 304

pyxlsb itself cannot do it, but it is doable with the help of xlwings.

import pandas as pd
import xlwings as xw
from pyxlsb import open_workbook as open_xlsb

with open_xlsb(r"W:\path\filename.xlsb") as wb:
    Data=xw.Range('B:D').value

#Creates a dataframe using the first list of elements as columns        
Data_df = pd.DataFrame(Data[1:], columns=Data[0])

Upvotes: 1

Related Questions