Reputation: 1
I am fairly new to Python, so there may be a lot to improve, but in the following code I am trying to write a function that takes the location of the data file, the attribute to be normalized, and the type of normalization to perform ('min_max' or 'z_score').
After this, based on the normalization type specified, I want it to apply the appropriate formula and return a dictionary where key = original value in the dataset and value = normalized value.
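For reference, the two formulas described above can be sketched in plain Python. This is only an illustration under assumptions: the helper name `normalize` is hypothetical, it works on a list of numbers rather than a CSV column, it uses the standard-library `statistics` module instead of pandas or scikit-learn, and it assumes the values are not all identical (otherwise the denominator is zero).

```python
from statistics import mean, stdev


def normalize(values, norm_type):
    """Return a dict mapping each original value to its normalized value.

    Hypothetical helper for illustration; not the code from the question.
    """
    if norm_type == "min_max":
        lo, hi = min(values), max(values)
        # Min-max scaling: (x - min) / (max - min), which lands in [0, 1].
        return {x: (x - lo) / (hi - lo) for x in values}
    if norm_type == "z_score":
        mu = mean(values)
        sigma = stdev(values)  # sample standard deviation, like std(ddof=1)
        # Z-score: (x - mean) / standard deviation.
        return {x: (x - mu) / sigma for x in values}
    raise ValueError(f"unknown normalization type: {norm_type!r}")


print(normalize([1, 2, 3], "min_max"))
print(normalize([1, 2, 3], "z_score"))
```

Because the result is a dictionary keyed by the original value, duplicate values in the input collapse into a single entry.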
def normalization (fname, attr, normType):
    result = {}
    df = pd.read_csv(fname)
    targ = list(df[df.columns[attr]])
    scaler = MinMaxScaler()
    df["minmax"] = scaler.fit.transform(df[[df.columns[attr]]])
    df["zscore”] = ((df[[df.columns[attr]]]) - (df[[df.columns[attr.mean()]]]))/ (df[[df.columns[attr.std(ddof=1)]]])
    if normType == "min_max":
        result = dict(zip(targ, df.minmax.values.tolist())
    else:
        result = dict(zip(targ, df.zscore.values.tolist())
    return result
I keep getting an error on the line with the zscore calculation and have been struggling to troubleshoot it. I would appreciate any help that points me in the right direction. Thanks.
Edit: Error message shown is "SyntaxError: EOL while scanning string literal"
Upvotes: 0
Views: 340
Reputation: 4510
"zscore”
alone causes that error. The problem is that the ” isn't a proper double-quote character, so the string isn't properly terminated. Not sure how it got there; maybe bad formatting in a document while pasting code around. The fix: "zscore"
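A small sketch of how this can be checked without running the whole script: the snippet below compiles a one-line stand-in for the offending statement, once with the typographic quote (U+201D) and once with it replaced by a plain ASCII quote. The `compiles` helper and the sample line are made up for illustration, and the exact wording of the SyntaxError message depends on the Python version.

```python
# One-line stand-in for the failing statement; \u201d is the curly closing quote.
bad_source = 'df["zscore\u201d] = 1'


def compiles(source):
    """Return True if the source text parses as Python, False on SyntaxError."""
    try:
        compile(source, "<snippet>", "exec")
        return True
    except SyntaxError:
        return False


# Swap typographic double quotes for the ASCII double quote.
fixed_source = bad_source.replace("\u201c", '"').replace("\u201d", '"')

print(compiles(bad_source))    # the string literal is never closed
print(compiles(fixed_source))  # parses once the quote is a real "
```

Only parsing is tested here, so `df` does not need to exist. Searching the file for the characters “ and ” in an editor achieves the same thing.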
Upvotes: 1