Reputation: 87
I have JSON text in memory and I want to load it directly as a Hugging Face dataset.
If I save it to a file first, I can load the dataset with Hugging Face's load_dataset:
from datasets import load_dataset
dataset = load_dataset('json', data_files='my_file.json')
See also: https://huggingface.co/docs/datasets/v1.11.0/loading_datasets.html#from-local-files
Can I load the dataset directly from the in-memory text, without saving it to a file?
Upvotes: 0
Views: 164
Reputation: 3801
Parse the JSON into a dict, then build the dataset object yourself:
import json
import datasets
the_json_string = "..." # the JSON text you already hold in memory
the_dict = json.loads(the_json_string) # json.loads parses a JSON string into Python objects (here, a dict)
dataset_object = datasets.Dataset.from_dict(the_dict)
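See the documentation for datasets.Dataset.from_dict for the details. It expects a mapping from column names to lists of values, so JSON shaped as a list of records has to be converted to that layout first.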
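If the JSON text is a list of row records rather than a dict of columns, it likely needs transposing before from_dict will accept it. Below is a minimal sketch of that step, not a definitive implementation. The helper names records_to_columns and dataset_from_json_text and the sample data are hypothetical, and missing keys are assumed to become None.

```python
import json


def records_to_columns(records):
    """Transpose a list of row dicts into a dict of column lists."""
    # Collect column names in first-seen order; rows may omit some keys.
    columns = {}
    for row in records:
        for key in row:
            columns.setdefault(key, [])
    # Fill each column row by row, using None where a row lacks the key.
    for row in records:
        for key, values in columns.items():
            values.append(row.get(key))
    return columns


def dataset_from_json_text(json_text):
    """Build a datasets.Dataset from JSON text held in memory (hypothetical helper)."""
    # Imported lazily so records_to_columns works without the library installed.
    from datasets import Dataset

    parsed = json.loads(json_text)
    # Assumption: a top-level list holds row records, while a top-level
    # dict is already column-oriented.
    columns = records_to_columns(parsed) if isinstance(parsed, list) else parsed
    return Dataset.from_dict(columns)


# Hypothetical sample data, for illustration only.
sample_text = '[{"text": "hello", "label": 0}, {"text": "world", "label": 1}]'
sample_columns = records_to_columns(json.loads(sample_text))
```

With the datasets package installed, dataset_from_json_text(the_json_string) should then return a Dataset without anything being written to disk.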
Upvotes: 1