  1. QLoRA: Efficient Finetuning of Quantized LLMs - GitHub

    Jul 18, 2023 · We present QLoRA, an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task … (a back-of-the-envelope memory check appears after this list)

  2. qlora/README.md at main · artidoro/qlora · GitHub

    Jul 18, 2023 · We present QLoRA, an efficient finetuning approach that reduces memory usage enough to finetune a 65B parameter model on a single 48GB GPU while preserving full 16-bit finetuning task …

  3. Tutorial about LLM Finetuning with QLORA on a Single GPU

    This tutorial will guide you through the process of fine-tuning a large language model (LLM) using the QLORA technique on a single GPU. We will be using the Hugging Face Transformers library, PyTorch, and … (a minimal setup sketch appears after this list)

  4. GitHub - georgesung/llm_qlora: Fine-tuning LLMs using QLoRA

    Fine-tuning LLMs using QLoRA. Contribute to georgesung/llm_qlora development by creating an account on GitHub.

  5. GitHub - yuhuixu1993/qa-lora: Official PyTorch implementation of QA …

    Official PyTorch implementation of QA-LoRA. Contribute to yuhuixu1993/qa-lora development by creating an account on GitHub.

  6. IR-QLoRA - GitHub

    However, existing methods cause the quantized LLM to degrade severely and even fail to benefit from LoRA finetuning. This paper proposes IR-QLoRA, a novel method for pushing quantized LLMs with …

  7. GitHub - AdityaSagarr/LLM-Fine-Tuning: This project showcases fine ...

    LoRA (Low-Rank Adaptation) and qLoRA (Quantized LoRA) provide efficient methods to adapt pre-trained models without overburdening system resources. This guide will walk you through the …

  8. mlx-examples/lora/README.md at main - GitHub

    This is an example of using MLX to fine-tune an LLM with low rank adaptation (LoRA) for a target task. The example also supports quantized LoRA (QLoRA). The example works with Llama and Mistral …

  9. litgpt/tutorials/finetune_lora.md at main · Lightning-AI/litgpt

    Finetuning with LoRA / QLoRA. Low-rank adaptation (LoRA) is a technique to approximate the update to the linear layers in an LLM with a low-rank matrix factorization (see the minimal LoRA layer sketch after this list).

  10. GPTQLoRA: Efficient Finetuning of Quantized LLMs with GPTQ

    Acknowledgements: This code is based on QLoRA. This repo builds on the Stanford Alpaca and LMSYS FastChat repos.
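
A quick back-of-the-envelope check of the 65B-on-48GB claim from result 1 (a minimal sketch; the ~0.5 byte/parameter figure assumes 4-bit NF4 storage and ignores quantization constants, adapters, and activations):

```python
# Rough memory estimate for the frozen 4-bit base model in QLoRA.
# Assumption: NF4 stores ~0.5 byte per parameter; double-quantization
# overhead and activation memory are ignored here.
params = 65e9                          # 65B-parameter model
weight_gib = params * 0.5 / 2**30      # bytes -> GiB
print(f"frozen 4-bit weights: ~{weight_gib:.1f} GiB")   # ~30.3 GiB
# On a 48GB GPU this leaves roughly 15-18 GB of headroom for the LoRA
# adapters, paged optimizer state, and activations.
```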
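For result 3, a minimal sketch of a typical QLoRA setup with Hugging Face Transformers, peft, and bitsandbytes. The model ID, target modules, and LoRA hyperparameters below are illustrative assumptions, not values taken from the tutorial:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Quantize the frozen base model to 4-bit NF4 (QLoRA's storage format).
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_use_double_quant=True,
    bnb_4bit_compute_dtype=torch.bfloat16,  # dequantize to bf16 for matmuls
)

model = AutoModelForCausalLM.from_pretrained(
    "huggyllama/llama-7b",                  # hypothetical model choice
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# Attach trainable low-rank adapters on top of the frozen 4-bit weights.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],    # assumed attention projections
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()          # only the adapters are trainable
```

From here the model can be passed to a standard Trainer loop; only the adapter weights receive gradients, which is where the memory savings come from.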
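Result 9 describes the core LoRA idea: the update to a frozen weight W ∈ R^{d×k} is approximated as W' = W + (α/r)·BA, where B ∈ R^{d×r} and A ∈ R^{r×k} with r ≪ min(d, k). A minimal PyTorch sketch of that factorization (the class name and initializations are illustrative, not litgpt's implementation):

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Wraps a frozen nn.Linear and adds a trainable low-rank update B @ A."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False     # freeze the pretrained weight (and bias)
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)  # r x k
        self.B = nn.Parameter(torch.zeros(base.out_features, r))        # d x r, zero init
        self.scale = alpha / r          # standard LoRA scaling factor

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = x W^T + scale * (x A^T) B^T, i.e. x (W + scale * B A)^T
        return self.base(x) + self.scale * ((x @ self.A.T) @ self.B.T)

layer = LoRALinear(nn.Linear(4096, 4096), r=8)
out = layer(torch.randn(2, 4096))       # same output shape as the base layer
```

Because B is zero-initialized, the wrapped layer starts out exactly equal to the frozen base layer; training then updates only the r·(d+k) adapter parameters instead of all d·k weights.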