Cec35

Reputation: 23

Changing significance level of Mann Kendall test in Python

I need to run the Mann-Kendall test on a large hydrological dataset (maximum discharge values across 4381 sub-basins). There are 70 max values for each sub-basin. I need a significance level of 0.1 rather than the default 0.05.

Here is what my data looks like:

"sub" "max"
1  2.195
1  3.753
1  2.941
1  2.152
1  3.363
...    ...
4381  0.532
4381  1.108
4381  0.977
4381  0.483
4381  0.435

And here's my script:

import pymannkendall as mk
import pandas as pd
from tqdm import tqdm

# naming input and output
in_fname = 'noyear.csv'
out_fname = 'mannkendall3.csv'

# reading csv file
print("reading from file...")
raw = pd.read_csv(in_fname, sep=';', header=0)


# naming columns and converting to strings
sub = raw['sub']
max = raw['max']
raw['sub'] = raw['sub'].astype(str)

# creating DataFrame
out_tbl = pd.DataFrame(data={'sub': sub, 'max': max})

# applying MK
df_mk=out_tbl.groupby(sub)['max'].agg(mk.original_test(alpha =0.1)).reset_index()

# creating csv with output

df_mk.to_csv(out_fname, index=False, sep=';')

When doing this, I get the following error:

Traceback (most recent call last):
  File "/Users/user/Desktop/PyCharmProject/mann-kendall3.py", line 24, in <module>
    df_mk=out_tbl.groupby(sub)['max'].agg(mk.original_test(alpha = 0.1)).reset_index()
TypeError: original_test() missing 1 required positional argument: 'x_old'

What is x_old, and where should it go in this case? I am a beginner, so any tips would be appreciated!

Upvotes: 2

Views: 463

Answers (1)

Vitalizzare

Reputation: 7250

I assume this is about pyMannKendall, version 1.4.2.

There is a slight discrepancy between the docstring of original_test and the actual code. In the description returned by help(mk.original_test), the input parameter is named x:

    Input:
        x: a vector (list, numpy array or pandas series) data
        alpha: significance level (0.05 default)

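However, the actual function signature is def original_test(x_old, alpha = 0.05) (see the code on GitHub). Here x_old is the input vector, which the docstring calls x. This parameter is required and cannot be omitted when calling the function. In the question, mk.original_test(alpha = 0.1) calls the function immediately, without any data, instead of handing agg a function to call on each group, hence the TypeError. Wrapping the call in a lambda, so that each group's values are passed as the first argument, should fix this: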

# applying MK
df_mk = out_tbl.groupby(sub)['max'].agg(lambda x: mk.original_test(x, alpha=0.1)).reset_index()

Alternatively, functools.partial can produce a new function with alpha already fixed, which takes the vector as its only remaining argument:

from functools import partial
df_mk = out_tbl.groupby(sub)['max'].agg(partial(mk.original_test, alpha=0.1)).reset_index()

Upvotes: 1
