The device is an NVIDIA Jetson Orin 32GB (arm64). I used Miniconda to create a virtual environment called spark, with torch 2.1.0 and CUDA 12.2.
My PyTorch was installed from the NVIDIA forum build for JetPack 6 on aarch64, namely this wheel. On the Orin, I ran pip install torch-2.1.0-cp310-cp310-linux_aarch64.whl to install it.
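For anyone reproducing this, the versions stated above can be confirmed from inside the activated environment with a short check like the following. This is only a sketch: the helper name describe_environment is made up for illustration, and it assumes nothing beyond the standard library plus torch, if torch is importable.

```python
import platform
import sys


def describe_environment():
    """Collect interpreter, architecture, and (if installed) torch details."""
    info = {
        "python": sys.version.split()[0],   # e.g. the 3.10 interpreter of the env
        "machine": platform.machine(),      # reports the CPU architecture string
    }
    try:
        import torch
    except ImportError:
        # torch is not installed in this interpreter; report that and stop.
        info["torch"] = None
        return info
    info["torch"] = torch.__version__
    info["torch_cuda"] = torch.version.cuda          # CUDA version torch was built against
    info["cuda_available"] = torch.cuda.is_available()
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running it with the same python that launches train.py makes sure the reported torch build is the one the failing import actually sees.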
My ~/.bashrc looks like this:
# >>> conda initialize >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$('/home/lyy/miniconda3/bin/conda' 'shell.bash' 'hook' 2> /dev/null)"
if [ $? -eq 0 ]; then
    eval "$__conda_setup"
else
    if [ -f "/home/lyy/miniconda3/etc/profile.d/conda.sh" ]; then
        . "/home/lyy/miniconda3/etc/profile.d/conda.sh"
    else
        export PATH="/home/lyy/miniconda3/bin:$PATH"
    fi
fi
unset __conda_setup
export PATH=/usr/local/cuda-12.2/bin:$PATH
export LD_LIBRARY_PATH=/usr/local/cuda-12.2/lib64:$LD_LIBRARY_PATH
export C_INCLUDE_PATH=/usr/local/cuda-12.2/include${C_INCLUDE_PATH:+:${C_INCLUDE_PATH}}
export CPLUS_INCLUDE_PATH=/usr/local/cuda-12.2/include${CPLUS_INCLUDE_PATH:+:${CPLUS_INCLUDE_PATH}}
# <<< conda initialize <<<
I'm trying to run the repo ArchieGertsman/spark-sched-sim, and I followed the README to install the required packages. Using the provided model, I successfully ran the inference example with python examples.py --sched [fair|decima], but the training step, python train.py -f config/decima_tpch.yaml, fails. The YAML file was not edited after cloning from GitHub.
Here is the error message:
(base) lyy@sail-orin0:~$ conda activate spark
(spark) lyy@sail-orin0:~$ cd spark-sched-sim/
(spark) lyy@sail-orin0:~/spark-sched-sim$ python train.py -f config/decima_tpch.yaml
Traceback (most recent call last):
File "/home/lyy/spark-sched-sim/train.py", line 2, in <module>
from trainers import make_trainer
File "/home/lyy/spark-sched-sim/trainers/__init__.py", line 3, in <module>
from .vpg import VPG
File "/home/lyy/spark-sched-sim/trainers/vpg.py", line 5, in <module>
from .trainer import Trainer
File "/home/lyy/spark-sched-sim/trainers/trainer.py", line 17, in <module>
from schedulers import make_scheduler, TrainableScheduler
File "/home/lyy/spark-sched-sim/schedulers/__init__.py", line 13, in <module>
from .decima import DecimaScheduler
File "/home/lyy/spark-sched-sim/schedulers/decima/__init__.py", line 3, in <module>
from .scheduler import DecimaScheduler
File "/home/lyy/spark-sched-sim/schedulers/decima/scheduler.py", line 7, in <module>
from torch_scatter import segment_csr
File "/home/lyy/miniconda3/envs/spark/lib/python3.10/site-packages/torch_scatter/__init__.py", line 16, in <module>
torch.ops.load_library(spec.origin)
File "/home/lyy/miniconda3/envs/spark/lib/python3.10/site-packages/torch/_ops.py", line 852, in load_library
ctypes.CDLL(path)
File "/home/lyy/miniconda3/envs/spark/lib/python3.10/ctypes/__init__.py", line 374, in __init__
self._handle = _dlopen(self._name, mode)
OSError: /home/lyy/miniconda3/envs/spark/lib/python3.10/site-packages/torch_scatter/_version_cpu.so: undefined symbol: _ZN5torch3jit17parseSchemaOrNameERKSs
I noticed a similar question on Stack Overflow, but following the steps in its answer did not resolve the error.
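One way to narrow this down, sketched under the unconfirmed assumption that the failure is a C++ ABI mismatch between the torch_scatter binary and the installed torch: in the Itanium name mangling, the trailing Ss in the missing symbol stands for the pre-C++11-ABI std::string, whereas a library built with the C++11 ABI would mangle the same argument with a __cxx11 component. The helper names below are made up for illustration, and the string check is a rough heuristic, not a real demangler.

```python
OLD_ABI_STRING = "Ss"       # Itanium abbreviation for the pre-C++11-ABI std::string
NEW_ABI_MARKER = "__cxx11"  # present in mangled names using the C++11-ABI std::string


def string_abi_of_symbol(mangled):
    """Heuristically guess which std::string ABI a mangled symbol refers to."""
    if NEW_ABI_MARKER in mangled:
        return "cxx11"
    if mangled.endswith(OLD_ABI_STRING) or "RK" + OLD_ABI_STRING in mangled:
        return "pre-cxx11"
    return "unknown"


def torch_cxx11_abi():
    """Return True/False for the installed torch build, or None if torch is absent."""
    try:
        import torch
    except ImportError:
        return None
    return torch.compiled_with_cxx11_abi()


if __name__ == "__main__":
    missing = "_ZN5torch3jit17parseSchemaOrNameERKSs"  # symbol from the OSError above
    print("symbol expects:", string_abi_of_symbol(missing))
    print("torch built with C++11 ABI:", torch_cxx11_abi())
```

If the symbol expects the pre-C++11 ABI while torch reports True, the torch_scatter binary was most likely built against a different torch build than the one installed, and rebuilding torch_scatter from source inside the spark environment would be the thing to try next.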