BayJarvis: Blogs on tenyx

paper Characterizing Large Language Models Geometry for Toxicity Detection and Generation - 2024-03-18

Abstract: Large Language Models (LLMs) drive significant advancements in AI, yet understanding their internal workings remains a challenge. This paper introduces a novel geometric perspective to characterize LLMs, offering practical insights into their functionality. By analyzing the intrinsic dimension of Multi-Head Attention (MHA) embeddings and the affine mappings within layer feed-forward networks, we unlock new ways to manipulate and interpret LLMs. Our findings enable bypassing restrictions like RLHF in models such as Llama2, and we introduce seven interpretable spline features extracted from any LLM layer. These features, tested on models like Mistral-7B and Llama2, prove highly effective in toxicity detection, domain inference, and addressing the Jigsaw challenge, showcasing the practical utility of our geometric characterization. …

paper Scaling Laws for Forgetting When Fine-Tuning Large Language Models - 2024-03-16

When fine-tuning Large Language Models (LLMs) like GPT-3 or BERT for specific tasks, a common challenge encountered is "forgetting" – where the model loses some of its pre-trained capabilities. This phenomenon is particularly noticeable in Parameter-Efficient Fine-Tuning (PEFT) methods such as Low-Rank Adapters (LoRA). …

paper A Rank Stabilization Scaling Factor for Fine-Tuning with LoRA - 2024-03-14

Low-Rank Adapters (LoRA) have emerged as a popular parameter-efficient fine-tuning method for large language models. By adding trainable low-rank "adapters" to selected layers, LoRA enables effective fine-tuning while dramatically reducing the number of parameters that need to be trained. However, the conventional LoRA method uses a scaling factor for the adapters that divides them by the rank. A new paper by researcher Damjan Kalajdzievski shows that this rank-dependent scaling actually slows down learning and limits performance improvements when using higher-rank adapters. …