user15070504

Reputation: 19

Why does df.shift() not work when using modin?

In the example below I call df.shift(), which works fine in plain pandas. With Modin, the same call raises the error shown after the code. Is there any way to fix this?

import modin.pandas as pd
import ray
ray.init(runtime_env={'env_vars': {'__MODIN_AUTOIMPORT_PANDAS__': '1'}})

df = pd.read_csv('dataframe.csv')

df['test'] = 1
df['shift'] = df['test'].shift()

Running this raises:

ValueError: Length mismatch: Expected axis has 2 elements, new values have 604402 elements
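For comparison, here is a minimal sketch of the same steps in plain pandas, showing the result the question expects. The small in-memory frame stands in for dataframe.csv, whose contents are not shown above, so its column name and values are assumptions. Note that pd refers to plain pandas here, not modin.pandas.

```python
import pandas as pd

# Hypothetical stand-in for dataframe.csv (its real contents are unknown).
df = pd.DataFrame({"value": [10, 20, 30]})

df["test"] = 1
# shift() moves each value down one row and fills the first row with NaN.
df["shift"] = df["test"].shift()

print(df)
```

The new column has the same length as the frame, with NaN in the first row and the shifted values below it.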

Upvotes: 0

Views: 140

Answers (2)

Karthik Velayutham

Reputation: 21

Following up on Mahesh's comment (since I don't have enough reputation to add a comment), a fix has now been merged into main: https://github.com/modin-project/modin/pull/5823

Upvotes: 0

Mahesh Vashishtha

Reputation: 176

This is a known bug in Modin. As a workaround, convert the Modin dataframe to pandas with Modin's _to_pandas method, call shift on the pandas object, then convert the result back to Modin, e.g.:

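import modin.pandas as pd
import numpy as np

df = pd.DataFrame(np.arange(16).reshape(-1, 4), columns=list(range(4)), index=[0, 1, 2, 0])
# Convert the Modin dataframe to a plain pandas dataframe.
pandas_df = df._to_pandas()
# shift() runs in pandas here, so the Modin bug is not triggered.
pandas_result = pandas_df[0].shift(1)
# Wrap the pandas result in a Modin Series again.
modin_result = pd.Series(pandas_result)
print(modin_result)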
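If the workaround is needed in several places, the three steps can be wrapped in a small helper. This is only a sketch: shift_via_pandas and its wrap parameter are hypothetical names, not part of Modin or pandas. The helper calls _to_pandas when the object provides it (as in the example above) and otherwise assumes the object is already a pandas one, so the demo at the bottom runs with plain pandas alone.

```python
import pandas


def shift_via_pandas(series, periods=1, wrap=None):
    """Shift `series` in plain pandas; optionally wrap the result again.

    `wrap` is a hypothetical hook: pass modin.pandas.Series to get a
    Modin object back, or leave it as None to keep the pandas result.
    """
    # Modin objects expose _to_pandas(); plain pandas objects do not.
    to_pandas = getattr(series, "_to_pandas", None)
    native = to_pandas() if callable(to_pandas) else series
    shifted = native.shift(periods)
    return wrap(shifted) if wrap is not None else shifted


# Demo with a plain pandas Series, so no Modin engine is required.
result = shift_via_pandas(pandas.Series([1, 2, 3]))
print(result)
```

With Modin, the call for the question's column would look like shift_via_pandas(df['test'], wrap=pd.Series), where pd is modin.pandas. Because _to_pandas is a private method, it may change between Modin versions.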

Upvotes: 0
