Reputation: 276
Can anyone help me figure out how to do this in fp-ts?
const $ = cheerio.load('some text');
const tests = $('table tr').get()
  .map(row => $(row).find('a'))
  .map(link => link.attr('data-test') ? link.attr('data-test') : null)
  .filter(v => v != null);
I can do it all with TaskEither, but I don't know how to mix it with IO. Or maybe I shouldn't use IO at all?
This is what I came up with so far:
const selectr = (a: CheerioStatic): CheerioSelector => (s: any, c?: any, r?: any) => a(s, c, r);
const getElementText = (text: string) => {
  return pipe(
    IO.of(cheerio.load),
    IO.ap(IO.of(text)),
    IO.map(selectr),
    IO.map(x => x('table tr')),
    // ?? don't know what to do here
  );
}
To clarify, the most challenging part for me is how to change the typings from IO to an array of Either, then filter out or ignore the lefts and continue with Task or TaskEither.
The TypeScript error is: Type 'Either<Error, string[]>' is not assignable to type 'IO<unknown>'
const getAttr = (attrName: string) => (el: Cheerio): Either<Error, string> => {
  const value = el.attr(attrName);
  return value ? Either.right(value) : Either.left(new Error('Empty attribute!'));
}
const getTests = (text: string) => {
  const $ = cheerio.load(text);
  return pipe(
    $('table tbody'),
    getIO,
    // How to go from IO<string> to IOEither<unknown, string[]> or something similar?
    // What happens to the array of errors: do we keep them, or do we just change the typings?
    IO.chain(rows => A.array.traverse(E.either)(rows, flow($, attrIO('data-test')))),
  );
}
Upvotes: 1
Views: 2819
Reputation: 1380
If you want to do it "properly", you need to wrap every non-pure (non-deterministic or side-effecting) function call in IO or IOEither: IO if the call cannot fail, IOEither if it can.
So first let's define which function calls are pure and which are not. The easiest way I find to think about it is this: if a function ALWAYS gives the same output for the same input and doesn't cause any observable side effects, then it's pure.
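To make the definition concrete, here is a tiny sketch. The names double, counter and nextId are made up for illustration and have nothing to do with cheerio or fp-ts.

```typescript
// Pure: same input, same output, no observable side effects.
const double = (n: number): number => n * 2;

// Not pure: the result depends on, and changes, state outside the function.
let counter = 0;
const nextId = (): number => {
  counter += 1;
  return counter;
};

const firstDouble = double(21);
const secondDouble = double(21); // always equal to firstDouble

const firstId = nextId();
const secondId = nextId(); // differs from firstId although the input is the same
```

Calling nextId twice with the same (empty) input gives two different results, which is exactly what the IO wrapper is meant to make visible in the types.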
"Same output" doesn't mean referential equality, it means structural/behaviour equality. So if your function returns another function, this returned function might not be the same function object, but it must behave the same (for the original function to be considered pure).
So in these terms, the following is true:
cheerio.load is pure
$ is pure
.get is not pure
.find is not pure
.attr is not pure
.map is pure
.filter is pure
Now let's create wrappers for all non-pure function calls:
const getIO = selection => IO.of(selection.get())
const findIO = (...args) => selection => IO.of(selection.find(...args))
const attrIO = (...args) => element => IO.of(element.attr(...args))
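One caveat worth checking before relying on these wrappers: as far as I know, an fp-ts IO<A> is just a zero-argument function () => A, and IO.of(x) receives x already evaluated. That would mean IO.of(selection.get()) calls .get() immediately and only the type says IO. The sketch below shows the difference using a local stand-in for IO; IO, ioOf, FakeSel, getEager and getLazy are hypothetical names, not fp-ts or cheerio exports.

```typescript
// Local stand-in shaped like fp-ts's IO: a zero-argument function.
type IO<A> = () => A;
const ioOf = <A>(a: A): IO<A> => () => a;

// Hypothetical stand-in for a cheerio selection; counts how often get() runs.
class FakeSel {
  calls = 0;
  get(): string[] {
    this.calls += 1;
    return ['row1', 'row2'];
  }
}

// Eager: get() runs while the IO is being built.
const getEager = (s: FakeSel): IO<string[]> => ioOf(s.get());
// Lazy: get() runs only when the IO is executed.
const getLazy = (s: FakeSel): IO<string[]> => () => s.get();

const eagerSel = new FakeSel();
const eagerIO = getEager(eagerSel);
const callsAfterBuildingEager = eagerSel.calls; // get() has already run once

const lazySel = new FakeSel();
const lazyIO = getLazy(lazySel);
const callsAfterBuildingLazy = lazySel.calls; // get() has not run yet
const lazyRows = lazyIO(); // now it runs
```

If you want the effect deferred until the IO is run, the thunk form (selection => () => selection.get()) is the one to use; the rest of the pipeline stays the same.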
One thing to note is that here we apply a non-pure function (.attr, or attrIO in its wrapped version) to an array of elements. If we just map attrIO over the array, we get back Array<IO<result>>, which is not very useful; we want IO<Array<result>> instead. To achieve this, we need traverse instead of map (https://gcanti.github.io/fp-ts/modules/Traversable.ts.html).
So if you have an array rows and you want to apply attrIO to it, you do it like so:
import { array } from 'fp-ts/lib/Array';
import { io } from 'fp-ts/lib/IO';
const rows: Array<...> = ...;
// normal map
const mapped: Array<IO<...>> = rows.map(attrIO('data-test'));
// same result as above `mapped`, but in fp-ts way instead of native array map
const mappedFpTs: Array<IO<...>> = array.map(rows, attrIO('data-test'));
// now applying traverse instead of map to "flip" the `IO` with `Array` in the type signature
const result: IO<Array<...>> = array.traverse(io)(rows, attrIO('data-test'));
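To see what traverse does for this particular pair of types, here is a minimal sketch written without fp-ts. The IO type is a local stand-in, and traverseArrayIO and shout are hypothetical names; the real array.traverse(io) is more general, this only mimics its behaviour for Array and IO.

```typescript
// Local stand-in shaped like fp-ts's IO: a zero-argument function.
type IO<A> = () => A;

// Hypothetical helper: apply f to every item and collect the results inside one IO.
const traverseArrayIO = <A, B>(items: A[], f: (a: A) => IO<B>): IO<B[]> =>
  () => items.map(a => f(a)());

const callLog: string[] = [];
// An effectful function: records the call, then returns a value.
const shout = (s: string): IO<string> => () => {
  callLog.push(s);
  return s.toUpperCase();
};

// map gives one IO per element: Array<IO<string>>
const mappedIOs: IO<string>[] = ['a', 'b'].map(shout);
// traverse gives a single IO holding all results: IO<Array<string>>
const traversedIO: IO<string[]> = traverseArrayIO(['a', 'b'], shout);

const logBeforeRun = callLog.length; // nothing has run yet
const traversedValues = traversedIO(); // runs both effects in order
```

The point is only the shape change: with map you would still have to run each IO yourself, with traverse you get one IO to pass along the pipeline.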
Then just assemble everything together:
import { array } from 'fp-ts/lib/Array';
import * as IO from 'fp-ts/lib/IO';
import { io } from 'fp-ts/lib/IO';
import { flow } from 'fp-ts/lib/function';
import { pipe } from 'fp-ts/lib/pipeable';
// Wrappers for the non-pure cheerio calls
const getIO = selection => IO.of(selection.get())
const findIO = (...args) => selection => IO.of(selection.find(...args))
const attrIO = (...args) => element => IO.of(element.attr(...args))
const getTests = (text: string) => {
  const $ = cheerio.load(text);
  return pipe(
    $('table tr'),
    getIO,
    // for every row, find its link
    IO.chain(rows => array.traverse(io)(rows, flow($, findIO('a')))),
    // for every link, read the attribute, mapping a missing value to null
    IO.chain(links => array.traverse(io)(links, flow(
      attrIO('data-test'),
      IO.map(a => a ? a : null)
    ))),
    // drop the nulls
    IO.map(links => links.filter(v => v != null))
  );
}
Now getTests gives you back an IO of the same elements that were in your tests variable in the original code.
Disclaimer: I haven't run the code through the compiler, it might have some typos or mistakes. You probably also need to put some effort to make it all strongly typed.
EDIT:
If you want to preserve information on the errors (in this case, a missing data-test attribute on one of the a elements), you have several options. Currently getTests returns an IO<string[]>. To fit error info in there, you could use:
IO<Either<Error, string>[]> - an IO that returns an array where each element is either an error OR a value. To work with it, you still need to filter later to get rid of the errors. This is the most flexible solution as you don't lose any information, but it feels kind of useless too, because Either<Error, string> is pretty much the same in this case as string | null.
import * as Either from 'fp-ts/lib/Either';
const attrIO = (...args) => (element): IO<Either<Error, string>> => IO.of(Either.fromNullable(new Error("not found"))(element.attr(...args) ? element.attr(...args) : null));
const getTests = (text: string): IO<Either<Error, string>[]> => {
  const $ = cheerio.load(text);
  return pipe(
    $('table tr'),
    getIO,
    IO.chain(rows => array.traverse(io)(rows, flow($, findIO('a')))),
    IO.chain(links => array.traverse(io)(links, attrIO('data-test')))
  );
}
IOEither<Error, string[]> - an IO that returns either an error OR an array of values. Here the usual thing to do is to return the error for the first missing attribute, and return an array of values if none of them is missing. So this solution loses info about correct values if there are any errors AND it loses info about all errors except the first one.
import * as Either from 'fp-ts/lib/Either';
import * as IOEither from 'fp-ts/lib/IOEither';
const { ioEither } = IOEither;
const attrIOEither = (...args) => (element): IOEither<Error, string> => IOEither.fromEither(Either.fromNullable(new Error("not found"))(element.attr(...args) ? element.attr(...args) : null));
const getTests = (text: string): IOEither<Error, string[]> => {
  const $ = cheerio.load(text);
  return pipe(
    $('table tr'),
    getIO,
    IO.chain(rows => array.traverse(io)(rows, flow($, findIO('a')))),
    IOEither.rightIO, // "lift" IO to IOEither context
    IOEither.chain(links => array.traverse(ioEither)(links, attrIOEither('data-test')))
  );
}
IOEither<Error[], string[]> - an IO that returns either an array of errors OR an array of values. This one aggregates the errors if there are any, and aggregates the values if there are no errors. This solution loses info about correct values if there are any errors. This approach is rarer in practice than the ones above and is trickier to implement. One common use case is validation, and for that there is a monad transformer: https://gcanti.github.io/fp-ts/modules/ValidationT.ts.html. I don't have much experience with it, so I can't say more on this topic.
IO<{ errors: Error[], values: string[] }> - an IO that returns an object containing both errors and values. This solution also doesn't lose any info, but is slightly trickier to implement.
The canonical way of doing it is to define a monoid instance for the result object { errors: Error[], values: string[] } and then aggregate the results using foldMap:
import { Monoid } from 'fp-ts/lib/Monoid';
type Result = { errors: Error[], values: string[] };
const resultMonoid: Monoid<Result> = {
  empty: {
    errors: [],
    values: []
  },
  // combine two results by appending their errors and their values
  concat(a, b) {
    return {
      errors: [...a.errors, ...b.errors],
      values: [...a.values, ...b.values]
    };
  }
};
const attrIO = (...args) => (element): IO<Result> => {
  const value = element.attr(...args);
  // wrap the result in IO so it can be used with traverse(io) below
  return IO.of(
    value
      ? { errors: [], values: [value] }
      : { errors: [new Error('not found')], values: [] }
  );
};
const getTests = (text: string): IO<Result> => {
  const $ = cheerio.load(text);
  return pipe(
    $('table tr'),
    getIO,
    IO.chain(rows => array.traverse(io)(rows, flow($, findIO('a')))),
    IO.chain(links => array.traverse(io)(links, attrIO('data-test'))),
    // merge the per-link results into one Result
    IO.map(results => array.foldMap(resultMonoid)(results, x => x))
  );
}
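For the IO<Either<Error, string>[]> and IOEither<Error[], string[]> options above, the missing piece is how to get from an array of Either values to something usable. Here is a rough sketch of that step, written with local stand-ins for Either and IO rather than fp-ts itself; mkLeft, mkRight, partitionEithers, collectAll and readAttrs are hypothetical names. As far as I know, fp-ts's Array module has rights and separate, which cover the splitting part, but check the docs for your version.

```typescript
// Local stand-ins shaped like the fp-ts types (not imported from fp-ts).
type Either<E, A> = { _tag: 'Left'; left: E } | { _tag: 'Right'; right: A };
type IO<A> = () => A;
const mkLeft = <E = never, A = never>(e: E): Either<E, A> => ({ _tag: 'Left', left: e });
const mkRight = <E = never, A = never>(a: A): Either<E, A> => ({ _tag: 'Right', right: a });

// Hypothetical helper: split into all errors and all values, keeping order.
const partitionEithers = <E, A>(items: Either<E, A>[]): { errors: E[]; values: A[] } => {
  const errors: E[] = [];
  const values: A[] = [];
  for (const item of items) {
    if (item._tag === 'Left') errors.push(item.left);
    else values.push(item.right);
  }
  return { errors, values };
};

// Hypothetical helper: Left with every error if there are any, otherwise Right with every value.
const collectAll = <E, A>(items: Either<E, A>[]): Either<E[], A[]> => {
  const { errors, values } = partitionEithers(items);
  return errors.length > 0 ? mkLeft(errors) : mkRight(values);
};

// Stand-in for reading data-test from three links, one of which lacks the attribute.
const readAttrs: IO<Either<Error, string>[]> = () => [
  mkRight('t1'),
  mkLeft(new Error('not found')),
  mkRight('t3'),
];

const partitioned = partitionEithers(readAttrs()); // one error, two values
const allOrErrors = collectAll(readAttrs()); // a Left, because one attribute is missing
```

Using partitioned.values alone is the "ignore the lefts" behaviour from the question; collectAll is the all-errors-or-all-values behaviour. Either result can then be lifted into Task or TaskEither for the rest of the pipeline.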
Upvotes: 8