Reputation: 31
When I try to install llama.cpp, I get the following error:
ld: warning: ignoring file '/Users/krishparikh/Projects/LLM/llama.cpp/ggml/src/ggml-metal-embed.o': found architecture 'x86_64', required architecture 'arm64'
Undefined symbols for architecture arm64:
"_ggml_metallib_end", referenced from:
_ggml_metal_init in ggml-metal.o
"_ggml_metallib_start", referenced from:
_ggml_metal_init in ggml-metal.o
ld: symbol(s) not found for architecture arm64
clang: error: linker command failed with exit code 1 (use -v to see invocation)
make: *** [llama-baby-llama] Error 1
I do not know what is causing this or whether there is something I need to do differently during installation. The steps I followed were:
I tried going into the Makefile and adding -arch arm64, but the same error occurs:
MK_CPPFLAGS = -Iggml/include -Iggml/src -Iinclude -Isrc -Icommon
MK_CFLAGS = -std=c11 -fPIC -arch arm64
MK_CXXFLAGS = -std=c++11 -fPIC -arch arm64
MK_NVCCFLAGS = -std=c++11
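For completeness, the rebuild I would expect to pick up these flags is a full clean build, so the stale x86_64 objects (such as ggml-metal-embed.o) are regenerated; a minimal sketch, assuming the Makefile-based build from the repository root:
make clean
make
file ggml/src/ggml-metal-embed.o   # should now report arm64 rather than x86_64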
Upvotes: 1
Views: 482
Reputation: 369
To run llama.cpp on a Mac M1:
Download the binaries from the releases page: https://github.com/ggerganov/llama.cpp/releases and choose the one corresponding to the Mac architecture, for example llama-b3490-bin-macos-arm64.zip.
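A hedged sketch of the download and extraction from the terminal (the asset URL follows GitHub's usual release pattern for the b3490 tag and will differ for other releases):
curl -LO https://github.com/ggerganov/llama.cpp/releases/download/b3490/llama-b3490-bin-macos-arm64.zip
unzip llama-b3490-bin-macos-arm64.zip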
Extract it, and a bin folder appears containing many files, each of them an executable. Among them:
├── llama-cli
├── llama-server
Once you have these executables, you need to download a model for the server to load and run inference on. The model must be in GGUF format for llama.cpp to load it. This page explains what the format is and how and where to download models: https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF
Download the file with this quantization: llama-2-7b-chat.Q3_K_L.gguf. If you have enough disk space, you can also download llama-2-7b-chat.Q6_K.gguf, which uses a different quantization and gives better quality. There is a table at the previous link detailing the quality vs. size trade-off of each quantization and recommending which one to use.
Move it to a ./models directory (parallel to the bin folder of llama.cpp).
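A hedged sketch of this download step (the resolve URL follows Hugging Face's usual file pattern for that repository; adjust it if the layout changes):
mkdir -p models
curl -L -o models/llama-2-7b-chat.Q3_K_L.gguf https://huggingface.co/TheBloke/Llama-2-7B-Chat-GGUF/resolve/main/llama-2-7b-chat.Q3_K_L.gguf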
Execute, from inside the bin folder:
./llama-cli -m ../models/llama-2-7b-chat.Q3_K_L.gguf -p "I believe Real Madrid will win the Champions League again this year because" -n 128
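To serve the model over HTTP instead of using the CLI, a minimal sketch (flags assumed from llama.cpp's usual options; check ./llama-server --help for your build):
./llama-server -m ../models/llama-2-7b-chat.Q3_K_L.gguf --port 8080
and then point a browser or client at http://localhost:8080.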
Upvotes: -3