  1. Qwen-VL: A Versatile Vision-Language Model for Understanding ...

    Sep 19, 2023 · In this work, we introduce the Qwen-VL series, a set of large-scale vision-language models (LVLMs) designed to perceive and understand both texts and images. …

  2. Gated Attention for Large Language Models: Non-linearity, …

    Sep 18, 2025 · The authors respond that they will add experiments on the Qwen architecture, provide the hyperparameters, and promise to open-source one of the models. Reviewer bMKL is the …

  3. Rank-1 LoRAs Encode Interpretable Reasoning Signals

    Sep 29, 2025 · Specifically, we use a rank-1 LoRA to create a minimal parameter adapter for Qwen-2.5-32B-Instruct which recovers 73-90% of reasoning-benchmark performance …

  4. LLaVA-MoD: Making LLaVA Tiny via MoE-Knowledge Distillation

    Jan 22, 2025 · Superior Performance: LLaVA-MoD surpasses larger models like Qwen-VL-Chat-7B in various benchmarks, demonstrating the effectiveness of its knowledge distillation approach.

  5. MagicDec: Breaking the Latency-Throughput Tradeoff for Long …

    Jan 22, 2025 · (a) Summary of Scientific Claims and Findings The paper presents MagicDec, a speculative decoding technique aimed at improving throughput and reducing latency for long …

  6. In this paper, we explore a way out and present the newest members of the open-sourced Qwen families: Qwen-VL series. Qwen-VLs are a series of highly performant and versatile vision …

  7. Towards Federated RLHF with Aggregated Client Preference for …

    Jan 22, 2025 · For example, our experiments demonstrate that the Qwen-2-0.5B selector provides strong performance enhancements to larger base models like Gemma-2B while ensuring …

  8. Towards Interpretable Time Series Foundation Models - OpenReview

    Jun 9, 2025 · Leveraging a synthetic dataset of mean-reverting time series with systematically varied trends and noise levels, we generate natural language annotations using a large …

  9. ADIFF: Explaining audio difference using natural language

    Jan 22, 2025 · We evaluate our model using objective metrics and human evaluation and show our model enhancements lead to significant improvements in performance over naive baseline …

  10. Towards Understanding Distilled Reasoning Models: A...

    Mar 5, 2025 · To explore this, we train a crosscoder on Qwen-series models and their fine-tuned variants. Our results suggest that the crosscoder learns features corresponding to various …