Reputation: 370
I have workout log data in the CSV format of shape:
Just to give a little context on the data, the rows are not indexable by Date because the date column contains duplicates for each named workout. What I'm trying to do is to filter rows from the start of my training cycle (15/08/2022) to perform some analysis. How do I go about doing that in Deedle using C#?
using Deedle;
using System.Net.Sockets;
using System.Runtime.CompilerServices;
using System.Security.Cryptography.X509Certificates;
namespace Volume_Analysis_Tool
{
    public class Program
    {
        static void Main(string[] args)
        {
            Frame<int, string> workouts = Frame.ReadCsv(Path.Combine(Environment.CurrentDirectory, "./data/strong218049301181757984.csv"),
                hasHeaders: true,
                separators: ";",
                inferTypes: true);
            var startOfMeso = workouts.GroupRowsBy<DateTime>("Date");
            startOfMeso.Rows.After(new Tuple<DateTime, int>(new DateTime(day: 15, month: 8, year: 2022), 1));
            startOfMeso.Print();
        }
    }
}
This is as far as I have got. The issue is that when I try to filter with After, I have to supply an index along with the date.
Upvotes: -1
Views: 83
Reputation: 243106
I think your approach is generally good. If you use GroupRowsBy, the index will be a pair of date together with an index (of the item within the group). The index will be sorted, so the After operation will work efficiently.
You just need to supply a date together with an index. If you want all items after a given day, you can use the day together with Int32.MaxValue:
using System;
using Deedle;
var frame = Frame.FromRecords(new[] {
    new { Date=new DateTime(2022,1,1), Workout=13 },
    new { Date=new DateTime(2022,1,2), Workout=4 },
    new { Date=new DateTime(2022,1,2), Workout=19 },
    new { Date=new DateTime(2022,1,3), Workout=1 },
});
var byDate = frame.GroupRowsBy<DateTime>("Date");
// Get all items after 1/1/2022 (excluding any on 1/1/2022)
var afterJan1 =
    byDate.Rows.After(Tuple.Create(new DateTime(2022, 1, 1),
                                   Int32.MaxValue));
afterJan1.Print();
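The question asks for rows from the start date onwards, including that day. One way to adapt the same trick is to anchor the boundary on the previous day instead. This is only a sketch: it assumes the Date column holds date-only values (no time component), and the names start and fromStart are made up for illustration.

```csharp
using System;
using Deedle;

// Same sample data as in the snippet above
var frame = Frame.FromRecords(new[] {
    new { Date = new DateTime(2022, 1, 1), Workout = 13 },
    new { Date = new DateTime(2022, 1, 2), Workout = 4 },
    new { Date = new DateTime(2022, 1, 2), Workout = 19 },
    new { Date = new DateTime(2022, 1, 3), Workout = 1 },
});
var byDate = frame.GroupRowsBy<DateTime>("Date");

// Hypothetical start date; rows on this day should be kept
var start = new DateTime(2022, 1, 2);

// The key (previous day, Int32.MaxValue) sorts after every row of the
// previous day and before every row of the start day, so everything
// from the start day onwards is returned.
var fromStart =
    byDate.Rows.After(Tuple.Create(start.AddDays(-1), Int32.MaxValue));
fromStart.Print();
```

With the sample data this should keep the two rows on 2 January and the one on 3 January. For the CSV in the question, start would be new DateTime(2022, 8, 15).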
Upvotes: 0