Large language models are routinely described in terms of their size, with figures like 7 billion or 70 billion parameters ...
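The parameter count translates directly into a rough memory footprint for the weights. A minimal sketch of that arithmetic (the function name and the assumption of weight-only storage, ignoring activations and KV cache, are illustrative, not from the source):

```python
# Rough memory needed just to hold the model weights, ignoring
# activations, optimizer state, and KV cache.
def weight_memory_gb(num_params: float, bytes_per_param: float) -> float:
    """Return approximate weight storage in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

# A 7-billion-parameter model at 16-bit (2-byte) precision:
print(weight_memory_gb(7e9, 2))    # 14.0
# The same model quantized to 1 byte per parameter:
print(weight_memory_gb(7e9, 1))    # 7.0
```

This is why quantization figures so heavily in the items below: halving bytes-per-parameter halves the weight footprint.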
At CES 2026, Nvidia revealed it is planning a software update for DGX Spark which will significantly extend the device's ...
A new technical paper titled “Hardware Acceleration for Neural Networks: A Comprehensive Survey” was published by researchers ...
Multimodal large language models have shown powerful abilities to understand and reason across text and images, but their ...
Ternary quantization has emerged as a powerful technique for reducing both the computational and memory footprint of large language models (LLMs), enabling efficient real-time inference deployment without ...
Abstract: The huge memory and computing costs of deep neural networks (DNNs) greatly hinder their efficient deployment on resource-constrained devices. Quantization has emerged as an ...
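For contrast with the ternary case, the most common general-purpose scheme is symmetric uniform quantization to int8. A minimal sketch under that assumption (function names are illustrative):

```python
import numpy as np

def quantize_int8(x: np.ndarray):
    """Symmetric per-tensor quantization: x ≈ q * scale, q in [-127, 127]."""
    scale = float(np.abs(x).max()) / 127.0
    if scale == 0.0:                      # all-zero tensor: any scale works
        scale = 1.0
    q = np.clip(np.round(x / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    """Recover an approximation of the original float tensor."""
    return q.astype(np.float32) * scale

x = np.array([0.5, -1.27, 0.0, 0.02], dtype=np.float32)
q, s = quantize_int8(x)
x_hat = dequantize(q, s)   # close to x, but stored in 1 byte per value
```

One int8 code per value plus a single float scale replaces 4-byte floats, a 4x reduction in weight storage before any further compression.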
2025-12-18 · Staggered Batch Scheduling: Co-optimizing Time-to-First-Token and Throughput for High-Efficiency LLM Inference · Jian Tian et al. · 2512.16134
2025-12-12 · Adaptive Soft Rolling KV Freeze ...
Abstract: Recent advancements in Large Language Models (LLMs) have ushered in opportunities to craft agents that exhibit human-like cognitive abilities, notably reasoning and planning. Leveraging the ...