I am running RankLib from Python on my data (shape: 218279 rows × 1504 columns) and the process exits with return code 1 and no output (None). Is there any documentation of RankLib's error codes?
I am working in a Jupyter (IPython) notebook and launch the process with subprocess.run. In case it helps, here is my training code:
import subprocess

ranklibjar = 'RankLib-2.9.jar'  # jar path, as it appears in the error output
train_data = 'learning_to_rank_data/training.txt'
test_data = ''      # not passed to RankLib in this run
validate_data = ''  # not passed to RankLib in this run
metric2t = 'NDCG@2'
model_dest = 'learning_to_rank_data/model.txt'
try:
    # -ranker 3 selects AdaRank
    subprocess.run(['java', '-jar', ranklibjar, '-train', train_data, '-ranker', '3', '-metric2t', metric2t, '-save', model_dest], shell=True, check=True)
except subprocess.CalledProcessError as e:
    raise RuntimeError("command '{}' return with error (code {}): {}".format(e.cmd, e.returncode, e.output))
Below is the output:
RuntimeError: command '['java', '-jar', 'RankLib-2.9.jar', '-train', 'learning_to_rank_data/training.txt', '-ranker', '3', '-metric2t', 'NDCG@2', '-save', 'learning_to_rank_data/model.txt']' return with error (code 1): None
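The None at the end of that message is e.output, which subprocess only fills in when the child's output is captured. Below is a minimal sketch of capturing both streams, assuming Python 3.5 or later. run_and_capture is a hypothetical helper name, not part of RankLib or of the code above.

```python
import subprocess
import sys


def run_and_capture(cmd):
    """Run cmd (a list of arguments) and return (returncode, stdout, stderr)."""
    # shell=True is left out on purpose: a list of arguments does not need a
    # shell, and on POSIX shell=True would treat only the first list item as
    # the command to run.
    result = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,   # capture standard output
        stderr=subprocess.PIPE,   # capture standard error
        universal_newlines=True,  # decode bytes to str
    )
    return result.returncode, result.stdout, result.stderr


# Stand-in child process that writes to stderr and exits with code 1,
# so the sketch can be tried without Java installed.
code, out, err = run_and_capture(
    [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(1)"]
)
print(code, repr(out), repr(err))
```

Passing the same argument list as in the training code (['java', '-jar', ranklibjar, ...]) should surface whatever message RankLib prints before exiting, on whichever stream it uses.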
I have tried running the RankLib library (i.e. java -jar bin/RankLib.jar) in Jupyter using the same approach (subprocess.run) and it works fine (i.e. return code 0).
What is causing this return code 1? Could it be that my data is too big? Or is it because I am only training, without test and validation data?
Any help would be appreciated!
EDIT
I just tried slicing my data down to 1000 rows and still get return code 1, so the size of the data is not the issue. What exactly is causing this problem?
This problem is solved. Apparently, the minimum relevance label for the list-wise approach is 1, not 0. I had initially assumed that a label of 0 would mean the row is not relevant at all.
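A quick way to see which labels a training file contains is to count the first token of each line. This is a minimal sketch, assuming the usual RankLib/LETOR line layout of "<label> qid:<id> <feature>:<value> ... # optional comment". label_counts and the sample rows are hypothetical and only for illustration.

```python
from collections import Counter


def label_counts(lines):
    """Count relevance labels in LETOR-style lines."""
    counts = Counter()
    for line in lines:
        # Drop a trailing "# ..." comment and surrounding whitespace
        line = line.split("#", 1)[0].strip()
        if not line:
            continue  # skip blank or comment-only lines
        # The first whitespace-separated token is the relevance label
        label = float(line.split()[0])
        counts[label] += 1
    return counts


# Made-up sample rows, not taken from the real training file
sample = [
    "0 qid:1 1:0.5 2:0.1",
    "2 qid:1 1:0.9 2:0.3 # doc-b",
    "1 qid:2 1:0.2 2:0.7",
]
counts = label_counts(sample)
print("smallest label:", min(counts))
print("label counts:", dict(counts))
```

For the real data, the same function can be fed an open file, e.g. label_counts(open('learning_to_rank_data/training.txt')), to check whether any label falls below 1 before training.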