Reputation: 9527
Let's suppose we have the following in Python with pandas:
import pandas as pd
df = pd.DataFrame({
    "Col1": [10, 20, 15, 30, 45],
    "Col2": [13, 23, 18, 33, 48],
    "Col3": [17, 27, 22, 37, 52]},
    index=pd.date_range("2020-01-01", "2020-01-05"))
df
Here's what we get in Jupyter:
Now let's shift Col1 by 2 and store it in Col4.
We'll also store df['Col1'] / df['Col1'].shift(2) in Col5:
df_2 = df.copy(deep=True)
df_2['Col4'] = df['Col1'].shift(2)
df_2['Col5'] = df['Col1'] / df['Col1'].shift(2)
df_2
The result:
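As a plain-text stand-in for that output, here is a rough sketch in plain Python (no pandas) of the values the two new columns should hold. `shift_down` is a made-up helper for illustration, and `None` stands in for the `NaN` that pandas displays in the empty rows.

```python
# Plain-Python illustration of the values Col4 and Col5 should contain.
# `shift_down` is a hypothetical helper, not a pandas function; None stands
# in for the NaN that pandas shows for missing values.

def shift_down(values, n):
    """Row i receives the value from row i - n; the first n rows are empty."""
    return [values[i - n] if i >= n else None for i in range(len(values))]

col1 = [10, 20, 15, 30, 45]
col4 = shift_down(col1, 2)
col5 = [None if d is None else v / d for v, d in zip(col1, col4)]

print(col4)  # [None, None, 10, 20, 15]
print(col5)  # [None, None, 1.5, 1.5, 3.0]
```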
Now let's set up a similar DataFrame in C#:
#r "nuget:Microsoft.Data.Analysis"
using Microsoft.Data.Analysis;
var df = new DataFrame(
new PrimitiveDataFrameColumn<DateTime>("DateTime",
Enumerable.Range(0, 5).Select(i => new DateTime(2020, 1, 1).Add(new TimeSpan(i, 0, 0, 0)))),
new PrimitiveDataFrameColumn<int>("Col1", new []{ 10, 20, 15, 30, 45 }),
new PrimitiveDataFrameColumn<int>("Col2", new []{ 13, 23, 18, 33, 48 }),
new PrimitiveDataFrameColumn<int>("Col3", new []{ 17, 27, 22, 37, 52 })
);
df
The result in .NET Interactive:
What's a good way to perform the equivalent of the column shifts demonstrated in the pandas version?
The above example is from the documentation for pandas.DataFrame.shift:
https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.shift.html
It does indeed look like there isn't currently a built-in shift in Microsoft.Data.Analysis. I've posted an issue for this here:
https://github.com/dotnet/machinelearning/issues/6008
Upvotes: 1
Views: 494
Reputation: 141
@dharmatech has a great answer and it should be marked as the correct answer.
I changed the function slightly to make it generic:
using Microsoft.Data.Analysis;
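// Generic version: works for any unmanaged value type (int, double, ...).
// Assumes n >= 0; Enumerable.Repeat throws for a negative count.
PrimitiveDataFrameColumn<T> ShiftColumn<T>(PrimitiveDataFrameColumn<T> col, int n, string name) where T : unmanaged
{
    return
        new PrimitiveDataFrameColumn<T>(
            name,
            Enumerable.Repeat((T?) null, n)              // n leading nulls
                .Concat(col.Select(item => (T?) item))   // followed by the original values
                .Take(col.Count()));                     // truncated to the original length
}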
Upvotes: 1
Reputation: 9527
Perform a column shift.
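// Shifts an int column down by n rows and returns it as a double column,
// so the result can be used directly as a divisor. Assumes n >= 0.
PrimitiveDataFrameColumn<double> ShiftIntColumn(PrimitiveDataFrameColumn<int> col, int n, string name)
{
    return
        new PrimitiveDataFrameColumn<double>(
            name,
            Enumerable.Repeat((double?) null, n)             // n leading nulls
                .Concat(col.Select(item => (double?) item))  // followed by the original values
                .Take(col.Count()));                         // truncated to the original length
}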
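For readers more at home in Python, the Repeat / Concat / Take pipeline above maps closely onto `itertools`. This is only a sketch of the same idea on plain lists, with `shift_column` as a made-up name. Like the C# helper, it assumes `n` is non-negative.

```python
from itertools import chain, islice, repeat

def shift_column(values, n):
    """Pad the front with n Nones, then truncate to the original length."""
    if n < 0:
        raise ValueError("this sketch only shifts forward (n >= 0)")
    padded = chain(repeat(None, n), values)   # Enumerable.Repeat(...).Concat(...)
    return list(islice(padded, len(values)))  # .Take(col.Count())

print(shift_column([10, 20, 15, 30, 45], 2))  # [None, None, 10, 20, 15]
print(shift_column([10, 20, 15], 5))          # [None, None, None]
```

Truncating after padding is what keeps the new column the same length as the frame, even when `n` exceeds the number of rows.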
Carry out division, taking care of null values in the divisor.
// Element-wise a / b; rows where the divisor is null produce null.
PrimitiveDataFrameColumn<double> DivAlt3(PrimitiveDataFrameColumn<int> a, PrimitiveDataFrameColumn<double> b, string name)
{
    return
        new PrimitiveDataFrameColumn<double>(name, a.Zip(b, (x, y) => y == null ? null : x / y));
}
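The same null-aware, pairwise division can be sketched in Python with `zip`. The name `divide_nullable` is hypothetical and `None` plays the role of a null entry. A zero divisor is out of scope for this sketch (Python would raise `ZeroDivisionError`).

```python
def divide_nullable(numerators, divisors):
    """Element-wise division that yields None wherever either operand is missing."""
    return [
        None if x is None or y is None else x / y
        for x, y in zip(numerators, divisors)
    ]

col1 = [10, 20, 15, 30, 45]
col4 = [None, None, 10.0, 20.0, 15.0]  # Col1 shifted down by 2
print(divide_nullable(col1, col4))     # [None, None, 1.5, 1.5, 3.0]
```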
Then the following:
var df = new DataFrame(
new PrimitiveDataFrameColumn<DateTime>("DateTime",
Enumerable.Range(0, 5).Select(i =>
new DateTime(2020, 1, 1).Add(new TimeSpan(i, 0, 0, 0)))),
new PrimitiveDataFrameColumn<int>("Col1", new []{ 10, 20, 15, 30, 45 }),
new PrimitiveDataFrameColumn<int>("Col2", new []{ 13, 23, 18, 33, 48 }),
new PrimitiveDataFrameColumn<int>("Col3", new []{ 17, 27, 22, 37, 52 })
);
df.Columns.Add(ShiftIntColumn((PrimitiveDataFrameColumn<int>)df["Col1"], 2, "Col4"));
df.Columns.Add(DivAlt3((PrimitiveDataFrameColumn<int>) df["Col1"], (PrimitiveDataFrameColumn<double>) df["Col4"], "Col5"));
results in:
See the following notebook for a full demonstration of the above:
https://github.com/dharmatech/dataframe-shift-example-cs/blob/003/dataframe-shift-example-cs.ipynb
It would be nice if Microsoft.Data.Analysis came with column shift functionality. I would love to see other, perhaps more idiomatic, approaches to this.
Upvotes: 1