mamcx
mamcx

Reputation: 16186

How implement the relational JOIN in pure F#?

I'm building a relational language/library that is similar to j/kdb+/tutorialD in spirit.

The idea is that the end-user will: Load data (in memory, from anywhere), Perform queries, do join(s), agregates, etc. So, is like to have LINQ everywhere, as a first-class citizen of the language.

A relation is composed of:

1- A heading with a list of pair of name * type 2- A body with a list of rows, where each row must conform with the heading

So, is like a table.

Something like this:

type ExprC =
    | BoolC of bool
    | DecC of decimal
    | Str16C of string #And others!
and
  ColumnC = {name:string; colType:ExprC}
and
  HeaderC = ColumnC array
and
  RelC = {header:array<string>; data:array<array<ExprC>>}

So, I need to implement the full relational operators. PROJECTION (select) & RESTRICTION (where) are easy but JOIN look tricky. This is because I need to traverse both columns (and ordering them?) to compute the join, then remove the duplicate column used for the join, then yield the result.

Do each thing separately is kind of easy, but I think is a lot of waste potential. I found that is claimed that merge-sort-join is the way to go, but don't know how apply to this situation where I'm interpreting.

I could change anything currently.. my only requirement is that the solution not hinder badly the implementation of the full relational model and the performance is "ok".

Upvotes: 0

Views: 194

Answers (1)

Roman Dibikhin
Roman Dibikhin

Reputation: 856

Jon Skeet has reimplemented LINQ to Objects using C#, it can inspire you. Here is a detailed description of JOIN (inner join) implementation for two sequences: Reimplementing LINQ to Objects: Part 19 – Join.

Upvotes: 1

Related Questions