vanhooser

Reputation: 1747

How can I read and write column descriptions and typeclasses in Foundry transforms?

I want to read the column descriptions and typeclasses from my upstream datasets and then simply pass them through to my downstream datasets.

How can I do this in Python Transforms?

Upvotes: 1

Views: 495

Answers (1)

vanhooser

Reputation: 1747

If you upgrade your repository to at least 1.206.0, you'll be able to access a new feature inside the Transforms Python API: reading and writing column descriptions and typeclasses. For visibility, this question is also highly related to this one.

The column_typeclasses property returns a structured Dict<str, List<Dict<str, str>>>. For example, a column named tags will have a column_typeclasses object of {'tags': [{"name": "my_name", "kind": "my_kind"}]}. A typeclass always consists of two components, a name and a kind, and both are present in every dictionary of the list shown above. These are the only two keys you can pass in this dict, and the value for each key must be a str.
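To make that shape concrete, here is a minimal sketch of a check you could run on a typeclasses mapping before writing it. The helper name validate_column_typeclasses is hypothetical and not part of the Transforms API; it only encodes the rules stated above (str column names, a list of dicts per column, exactly the keys "name" and "kind", str values).

```python
def validate_column_typeclasses(column_typeclasses):
    """Hypothetical helper: raise ValueError if the mapping does not look like
    Dict[str, List[Dict[str, str]]] with only 'name' and 'kind' keys."""
    allowed_keys = {"name", "kind"}
    for column, typeclasses in column_typeclasses.items():
        if not isinstance(column, str):
            raise ValueError(f"column name {column!r} is not a str")
        if not isinstance(typeclasses, list):
            raise ValueError(f"typeclasses for {column!r} must be a list")
        for typeclass in typeclasses:
            # Each typeclass must be a dict with exactly the two allowed keys.
            if not isinstance(typeclass, dict) or set(typeclass) != allowed_keys:
                raise ValueError(
                    f"typeclass {typeclass!r} on {column!r} must have exactly "
                    "the keys 'name' and 'kind'"
                )
            # Both values must be strings.
            if not all(isinstance(value, str) for value in typeclass.values()):
                raise ValueError(
                    f"typeclass {typeclass!r} on {column!r} has a non-str value"
                )
    return column_typeclasses


# The example mapping from the answer passes the check unchanged.
example = {"tags": [{"name": "my_name", "kind": "my_kind"}]}
validate_column_typeclasses(example)
```

A check like this is optional; it just surfaces a malformed mapping in your own code before it reaches the write call.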

Full documentation is in the works for this feature, so stay tuned.

from transforms.api import transform, Input, Output


@transform(
    my_output=Output("ri.foundry.main.dataset.my-output-dataset"),
    my_input=Input("ri.foundry.main.dataset.my-input-dataset"),
)
def my_compute_function(my_input, my_output):
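    # Any transformation works here; limit(10) just keeps the example small.
    recent = my_input.dataframe().limit(10)

    # Read the metadata attached to the upstream dataset's columns.
    existing_typeclasses = my_input.column_typeclasses
    existing_descriptions = my_input.column_descriptions

    # Pass the metadata through unchanged when writing the output.
    my_output.write_dataframe(
        recent,
        column_descriptions=existing_descriptions,
        column_typeclasses=existing_typeclasses
    )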
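If your transform drops or renames columns, you may only want to pass through metadata for columns that still exist in the output. The following is a small sketch under that assumption; keep_metadata_for_columns is a hypothetical helper, not part of the Transforms API, and the sample dictionaries are made up for illustration.

```python
def keep_metadata_for_columns(metadata, columns):
    """Hypothetical helper: keep only the entries whose key is one of `columns`."""
    wanted = set(columns)
    return {column: value for column, value in metadata.items() if column in wanted}


# Made-up metadata standing in for my_input.column_descriptions
# and my_input.column_typeclasses.
descriptions = {"tags": "Free-form labels", "legacy_id": "Old identifier"}
typeclasses = {"tags": [{"name": "my_name", "kind": "my_kind"}]}

# Stand-in for the output dataframe's column list (e.g. recent.columns).
output_columns = ["tags"]

filtered_descriptions = keep_metadata_for_columns(descriptions, output_columns)
filtered_typeclasses = keep_metadata_for_columns(typeclasses, output_columns)
```

Inside the transform, the filtered dictionaries would then be passed as column_descriptions and column_typeclasses in the write_dataframe call shown above.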

Upvotes: 2
