Reputation: 151
I'm using SQLAlchemy to query data from an MSSQL database, then saving it as an Excel file with pandas. I'm looking for something similar to T-SQL's RTRIM to remove trailing whitespace from my data.
I know how to remove whitespace from the column headers, but not from the data itself. So I need to remove the whitespace either when querying or once the data is in a pandas DataFrame, but I don't know how (most searches turn up how to strip whitespace when parsing data, not when writing it).
My code so far is:
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import scoped_session,sessionmaker
from sqlalchemy import (Column, Integer, String, Boolean, ForeignKey, DateTime, Sequence, Float)
from sqlalchemy import create_engine
import pandas as pd
import openpyxl
pd.core.format.header_style = None
pd.core.format.number_format = None
def data_frame(query, columns):
    def make_row(x):
        return dict([(c, getattr(x, c)) for c in columns])
    return pd.DataFrame([make_row(x) for x in query])
engine = create_engine('mssql+pyodbc://u:pass@MyServer/MYDBt?driver=SQL Server', echo=False)
Session = sessionmaker(bind=engine)
session = Session()
Base = declarative_base()
class Tranv(Base):
    __tablename__ = "Transactions"
    part_number = Column(String(20), primary_key=True)
    time_stamp = Column(String(20))
    employee_number = Column(String(6))
    action = Column(String(20))
newvarv = session.query(Tranv).filter_by(employee_number='001841').filter_by(time_stamp='2015-10-01 10:49:53.230')
dfx = data_frame(newvarv, [c.name for c in Tranv.__table__.columns])
dfx.columns = dfx.columns.str.strip()
dfx = dfx.rename(columns=lambda x: x.strip())
writer = pd.ExcelWriter('C:\\Users\\grice\\Desktop\\Auto_Scrap_Report\\testy.xlsx')
writer.date_format = None
writer.datetime_format = None
dfx.to_excel(writer, sheet_name='Sheet1', index=False)
writer.save()
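For the "when querying" option, one possibility is to let the database do the trimming by wrapping each column in `func.rtrim(...)`, which SQLAlchemy renders as an `RTRIM(...)` call in the emitted SQL. The following is only a sketch under some assumptions: it uses an in-memory SQLite database as a stand-in for the MSSQL server, a cut-down version of the `Tranv` model, made-up sample values, and SQLAlchemy 1.4 or later (for `sqlalchemy.orm.declarative_base`).

```python
from sqlalchemy import Column, String, create_engine, func
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()


class Tranv(Base):
    __tablename__ = "Transactions"
    part_number = Column(String(20), primary_key=True)
    employee_number = Column(String(6))
    action = Column(String(20))


# Assumption: in-memory SQLite stands in for the real MSSQL server.
engine = create_engine("sqlite://")
Base.metadata.create_all(engine)
session = sessionmaker(bind=engine)()

# Hypothetical sample row with trailing spaces in two columns.
session.add(Tranv(part_number="PN-1   ", employee_number="001841", action="SCRAP   "))
session.commit()

# Wrap every column in RTRIM and keep its original name via label().
names = [c.name for c in Tranv.__table__.columns]
trimmed_cols = [func.rtrim(c).label(c.name) for c in Tranv.__table__.columns]

rows = session.query(*trimmed_cols).filter(Tranv.employee_number == "001841").all()

# Each row is tuple-like, so it can be zipped with the column names.
records = [dict(zip(names, row)) for row in rows]
```

The resulting `records` list of dicts can be passed to `pd.DataFrame(records)`, much like the output of `make_row` above. Note that applying `RTRIM` to non-string columns (dates, numbers) may behave differently on MSSQL, so it is probably safer to wrap only the character columns there.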
Upvotes: 0
Views: 4602
Reputation: 2775
OK, there is possibly a more elegant way, but this one worked for me:
In [2]:
df = pd.DataFrame(data={"names": ["John ", "Jack"], "surnames": ["Andrews", " McAllister"]})
In [3]:
df
Out[3]:
   names     surnames
0  John       Andrews
1   Jack   McAllister
2 rows × 2 columns
In [9]:
df = df.apply(lambda x: x.str.strip())
In [10]:
df.loc[0, "names"]
Out[10]:
'John'
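One caveat with the `apply` approach above: `.str` only works on string columns, so a DataFrame that also holds numbers would raise an `AttributeError`. Also, `strip()` removes whitespace on both sides, whereas T-SQL's RTRIM only removes trailing whitespace, which corresponds to `rstrip()`. A rough sketch that handles both points follows; the helper name `rstrip_text_columns` and the extra `qty` column are made up for illustration.

```python
import pandas as pd


def rstrip_text_columns(frame):
    """Return a copy of frame with trailing whitespace removed from string columns."""
    out = frame.copy()
    for col in out.columns:
        # Skip numeric and other non-string columns, where .str is not available.
        if pd.api.types.is_string_dtype(out[col]):
            out[col] = out[col].str.rstrip()
    return out


df = pd.DataFrame({
    "names": ["John ", "Jack"],
    "surnames": ["Andrews", " McAllister"],
    "qty": [3, 7],  # hypothetical non-string column
})

out = rstrip_text_columns(df)
```

Here `out.loc[0, "names"]` becomes `'John'`, while the leading space in `' McAllister'` is kept, as RTRIM would do. Swap `rstrip()` for `strip()` if both sides should be trimmed.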
Upvotes: 1