BayJarvis: Blogs on transformers

llm In Brief: Welcome Google's Gemma - New Open LLM - 2024-02-22

Google has introduced Gemma, a family of state-of-the-art open Large Language Models (LLMs) released in 7B and 2B parameter sizes. Hugging Face is supporting the launch with integration across its ecosystem. …

llm Harnessing Zephyr's Breeze: DPO Training on Mistral-7B-GPTQ for Language Model Alignment - 2023-11-09

We implement the strategies presented in "ZEPHYR: Direct Distillation of LM Alignment". The paper's approach is more than theoretical: it is a practical blueprint for language model training. Using ZEPHYR's distilled direct preference optimization (dDPO), we take the method from concept to working code. …

llm Fine-tuning Zephyr 7B GPTQ with 4-Bit Quantization for Custom Data and Inference - 2023-11-08

Fine-tuning and quantization are both central to building efficient, robust machine learning solutions. This blog post walks through fine-tuning the Zephyr 7B GPTQ model with 4-bit quantization to improve its performance when running inference on custom data. …