Reputation: 488
I would like to load a LightGBM model from a string or buffer rather than a file on disk.
There appears to be a method called model_from_string (documentation link), but it produces an error, which seemingly defeats the purpose of the method as I understand it.
import boto3
import lightgbm as lgb
import io

model_path = 'some/path/here'
s3_bucket = boto3.resource('s3').Bucket('some-bucket')
obj = s3_bucket.Object(model_path)
buf = io.BytesIO()
try:
    obj.download_fileobj(buf)
except Exception as e:
    raise e
else:
    model = lgb.Booster().model_from_string(buf.read().decode("UTF-8"))
This produces the following error:
TypeError: Need at least one training dataset or model file to create booster instance
Alternatively, I thought that I might be able to use the regular loading method:
lgb.Booster(model_file=buf.read().decode("UTF-8"))
But this also doesn't work:
FileNotFoundError: [Errno 2] No such file or directory: ''
Now, I realize that I can create a workaround by writing the buffer to disk, and then reading it. However, this feels very redundant and inefficient.
Thus, my question is: how can I instantiate a model to use for prediction without pointing to an actual file on disk?
Upvotes: 0
Views: 5032
Reputation: 488
It seems that there is an undocumented parameter model_str which can be used to initialize the lgb.Booster object.
model = lgb.Booster({'model_str': buf.getvalue().decode("UTF-8")})
Source: https://github.com/Microsoft/LightGBM/issues/2097#issuecomment-482332232
Credit goes to Nikita Titov aka StrikerRUS on GitHub.
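One more thing worth checking, as a rough sketch rather than a definitive fix: after download_fileobj writes into the BytesIO buffer, the stream position sits at the end, so buf.read() returns empty bytes unless the buffer is rewound first. That would explain the empty path ('') in the FileNotFoundError above. The snippet below simulates the download with a plain write (the bytes are placeholders, not a real model) and reads the contents with getvalue(). The helper names buffer_to_model_string and load_booster are made up for illustration, and passing model_str as a keyword argument to lgb.Booster is an assumption about the installed LightGBM version; on older releases the params-dict form shown above may be the one that works.

```python
import io


def buffer_to_model_string(buf: io.BytesIO) -> str:
    """Return the whole buffer as text, regardless of the stream position."""
    return buf.getvalue().decode("utf-8")


def load_booster(model_text: str):
    """Build a Booster from model text (hypothetical helper).

    Assumes the installed LightGBM accepts a model_str keyword argument.
    """
    import lightgbm as lgb  # imported lazily so the rest runs without it

    return lgb.Booster(model_str=model_text)


# Stand-in for obj.download_fileobj(buf): writing leaves the position at the end.
buf = io.BytesIO()
buf.write(b"placeholder model text\n")  # placeholder bytes, not a real model

empty_read = buf.read()                  # nothing left to read from here
model_text = buffer_to_model_string(buf)  # full contents, no rewind needed

print(repr(empty_read))
print(repr(model_text))
```

With a real model downloaded into buf, load_booster(buffer_to_model_string(buf)) would then be the call to try. Calling buf.seek(0) before buf.read() is an equivalent alternative to getvalue().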
Upvotes: 1