Large language models (LLMs) are a cornerstone technology in AI, driving advances across many fields. However, the conventional practice of re-training an LLM in full whenever a new dataset becomes available is costly and computationally inefficient. This paper presents a novel approach based on continual pre-training, which updates an LLM incrementally rather than re-training it in full, substantially reducing the computational resources required. …