Reputation: 37
Afternoon All,
I am extracting data from SQL server to a csv format then reading the file in.
df = pd.read_csv(
'TKY_RFQs.csv',
sep='~',
usecols=[
0,1,2,3,4,5,6,7,8,9,
10,11,12,13,14,15,16,17,18,19,
20,21,22,23,24,25,26,27,28,29,
30,31,32,33,34,35,36,37
]
)
There is a blank row then the record count at the end of the file which I would like to remove.
I have been getting around the issue via this code but would like to resolve the root problem:
# Count_Row=df.shape[0] # gives number of row count
# df_Sample = df[['trading_book','state', 'rfq_num_of_dealers']].head(Count_Row-1)
Is there a way to exclude the last two rows in the file or alternativcely remove any row which has null values for all columns?
Pete
Upvotes: 0
Views: 4417
Reputation: 12406
You can make use of skiprows
directly in .read_csv
df = pd.read_csv(
'TKY_RFQs.csv',
sep='~',
usecols=[
0,1,2,3,4,5,6,7,8,9,
10,11,12,13,14,15,16,17,18,19,
20,21,22,23,24,25,26,27,28,29,
30,31,32,33,34,35,36,37
],
skiprows=-2 # added this line to skip rows when reading
)
Upvotes: 2
Reputation: 840
Could you try :
df = pd.read_csv(
'TKY_RFQs.csv',
sep='~',
usecols=[
0,1,2,3,4,5,6,7,8,9,
10,11,12,13,14,15,16,17,18,19,
20,21,22,23,24,25,26,27,28,29,
30,31,32,33,34,35,36,37
]
)[:-2]
Example:
from pandas import read_csv
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data"
names = ['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
data = read_csv(url, names=names)[:-2] #to exclude last two rows
#data = read_csv(url, names=names) #to include all rows
print data
#description = data.describe()
Upvotes: 2