Source: llama.cpp
Section: science
Priority: optional
Maintainer: Debian Deep Learning Team <debian-ai@lists.debian.org>
Uploaders: Christian Kastner <ckk@debian.org>
Standards-Version: 4.7.2
Vcs-Browser: https://salsa.debian.org/deeplearning-team/llama.cpp
Vcs-Git: https://salsa.debian.org/deeplearning-team/llama.cpp.git
Homepage: https://github.com/ggml-org/llama.cpp/
# We could B-D on libc6 (>= 2.33) to ensure support for Hardware Capabilities,
# but with our install layout, a lack of support means that the baseline
# version will be used in such a case.
Build-Depends: cmake,
               debhelper-compat (= 13),
               libcurl4-openssl-dev,
               libggml-cpu,
               pkgconf,
Rules-Requires-Root: no

Package: llama.cpp
Architecture: any
Multi-Arch: foreign
Depends: libggml-cpu | libggml-backend,
         python3,
         ${misc:Depends},
         ${shlibs:Depends},
Description: LLM inference in C/C++
 The main goal of llama.cpp is to enable LLM inference with minimal setup and
 state-of-the-art performance on a wide range of hardware - locally and in
 the cloud.
 .
  * Plain C/C++ implementation without any dependencies
  * Apple silicon is a first-class citizen - optimized via ARM NEON,
    Accelerate and Metal frameworks
  * AVX, AVX2, AVX512 and AMX support for x86 architectures
  * 1.5-bit, 2-bit, 3-bit, 4-bit, 5-bit, 6-bit, and 8-bit integer
    quantization for faster inference and reduced memory use
  * Custom CUDA kernels for running LLMs on NVIDIA GPUs (support for AMD
    GPUs via HIP and Moore Threads MTT GPUs via MUSA)
  * Vulkan and SYCL backend support
  * CPU+GPU hybrid inference to partially accelerate models larger than the
    total VRAM capacity
 .
 The compute functionality is provided by ggml. By default, ggml's CPU
 backend is installed, but there are many other backends for CPUs and GPUs.