Reputation: 1539
I am trying to write statistics about a table, followed by the table data itself, to a file using pandas and NumPy.
When I execute the following code:
import pandas as pd
import numpy as np
data = pd.read_csv(r'c:\Documents\DS\CAStateBuildingMetrics.csv')
waterUsage = data["Water Use (All Water Sources) (kgal)"]
dept = data[["Department Name", "Property Id"]]
mean = str(waterUsage.mean())
median = str(waterUsage.median())
most = str(waterUsage.mode())
hw1 = open(r'c:\Documents\DS\testFile', "a")
hw1.write("Mean Water Usage Median Water Usage Most Common Usage Amounts\n")
hw1.write(mean+' '+median+' '+most)
np.savetxt(r'c:\Documents\DS\testFile', dept.values, fmt='%s')
The table written by np.savetxt ends up in c:\Documents\DS\testFile
before the statistics about mean, median, and mode water usage. Below is the output I am describing:
Here is a sample of the table output, which ends up being 1700 rows:
Capitol Area Development Authority 1259182
Capitol Area Development Authority 1259200
Capitol Area Development Authority 1259218
California Department of Forestry and Fire Protection 3939905
California Department of Forestry and Fire Protection 3939906
California Department of Forestry and Fire Protection 3939907
After this, the script outputs the statistics in this format:
Mean Water Usage Median Water Usage Most Common Usage Amounts
6913.1633414932685 182.35 0 165.0
dtype: float64
How do I adjust the behavior to guarantee that the statistics appear before the table?
Upvotes: 0
Views: 32
Reputation: 1539
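The issue, as pointed out by @hpaulj, is that np.savetxt is not writing through the already open file object hw1. Passing the path makes np.savetxt open the file a second time on its own, while the text written through hw1 stays in its buffer until hw1 is flushed or closed, so the statistics land after the table.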
Replacing
np.savetxt(r'c:\Documents\DS\testFile', dept.values, fmt='%s')
with
np.savetxt(hw1, dept.values, fmt='%s')
hw1.close()
will write all of the information to the same file in the expected order. Closing the file explicitly also follows best practice for file handling in Python.
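For reference, here is a minimal sketch of the same fix using a with block, so the file is closed automatically. The small DataFrame, the temporary path, and the write_report helper are made-up stand-ins for the CSV and paths in the question, and I open the file with "w" instead of "a" so reruns start from an empty file. Joining the mode values into one string is my own choice to keep the index and dtype line out of the report.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical stand-in for the CSV in the question (values are made up).
data = pd.DataFrame({
    "Department Name": ["Dept A", "Dept A", "Dept B"],
    "Property Id": [101, 102, 103],
    "Water Use (All Water Sources) (kgal)": [165.0, 165.0, 200.0],
})
water_usage = data["Water Use (All Water Sources) (kgal)"]
dept = data[["Department Name", "Property Id"]]


def write_report(path):
    """Write the statistics first, then the table, through one file handle."""
    # "w" is an assumption here; the question used "a" (append).
    with open(path, "w") as out:
        out.write("Mean Water Usage Median Water Usage Most Common Usage Amounts\n")
        # mode() returns a Series, so join its values instead of calling str() on it.
        modes = " ".join(str(m) for m in water_usage.mode())
        out.write(f"{water_usage.mean()} {water_usage.median()} {modes}\n")
        # Pass the open handle, not the path, so the table follows the statistics.
        np.savetxt(out, dept.values, fmt="%s")


# Temporary location used only for this sketch.
report_path = os.path.join(tempfile.mkdtemp(), "testFile")
write_report(report_path)

with open(report_path) as f:
    lines = f.read().splitlines()
print("\n".join(lines))
```

Because both the write calls and np.savetxt go through the same handle, the lines reach the file in the order the code runs.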
Upvotes: 1