Information overload in academic libraries often makes it difficult for users to discover relevant books, which may reduce reading interest. Conventional recommender systems are also prone to filter bubbles and tend to perform poorly under cold-start conditions. This study proposes a sequential recommendation system based on the Self-Attentive Sequential Recommendation (SASRec) model integrated with five semantic embedding models, namely Word2Vec, BERT Multilingual, OpenAI text-embedding-3-small, Gemini-embedding-001, and Qwen3-Embedding-0.6B, to generate accurate and serendipitous recommendations. In addition, the Serendipity-Oriented Greedy (SOG) re-ranking algorithm is applied to balance recommendation relevance and serendipity. After data cleaning, the dataset consists of 14,502 book records and 5,445 user interaction histories. Evaluation was conducted under three testing scenarios, namely the all-test set, warm test set, and cold test set, by comparing all model variants before and after re-ranking. The results show that integrating Large Language Model (LLM)-based embeddings consistently improves performance compared to the standard SASRec model and traditional embeddings. Qwen3-Embedding-0.6B achieved the best performance, improving HitRate@10 by up to 282.9% and NDCG@10 by up to 387.8%, while maintaining semantic robustness in cold-start scenarios with an UnSerendipity@K score of 0.613. SOG re-ranking reveals a direct trade-off between recommendation accuracy and diversity: light re-ranking weights provide the best balance, whereas overly aggressive weights significantly reduce relevance.
The main contribution of this study lies in integrating modern LLM embeddings into a sequential recommendation architecture to improve accuracy and cold-start robustness, while also evaluating how serendipity-oriented re-ranking strategies affect the balance between recommendation relevance and diversity. Overall, this study demonstrates that integrating LLM-based embeddings can produce a more adaptive library recommendation system that is better balanced in terms of both accuracy and serendipity.
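The trade-off between relevance and serendipity under greedy re-ranking can be illustrated with a minimal sketch. This is not the SOG scoring function used in the study. It is a simplified, generic greedy re-ranker, and the function name `greedy_rerank`, the `weight` parameter, and the cosine-based novelty term are all assumptions made for illustration. It assumes relevance scores scaled to [0, 1] and L2-normalised embeddings.

```python
import numpy as np

def greedy_rerank(relevance, item_vecs, profile_vecs, weight, k):
    """Greedily pick k candidates, trading relevance against novelty.

    relevance:    (n,) base recommender scores, assumed scaled to [0, 1]
    item_vecs:    (n, d) L2-normalised candidate embeddings
    profile_vecs: (m, d) L2-normalised embeddings of items the user has read
    weight:       0 keeps the base ranking; larger values favour novel items
    (Hypothetical interface; not the exact SOG formulation.)
    """
    relevance = np.asarray(relevance, dtype=float)
    selected = []
    remaining = list(range(len(relevance)))
    seen = np.asarray(profile_vecs, dtype=float)
    while remaining and len(selected) < k:
        cand = item_vecs[remaining]
        # Novelty: 1 minus the highest cosine similarity to any item
        # already in the user's history or already selected.
        novelty = 1.0 - (cand @ seen.T).max(axis=1)
        score = (1.0 - weight) * relevance[remaining] + weight * novelty
        best = remaining[int(np.argmax(score))]
        selected.append(best)
        remaining.remove(best)
        # Selected items count as "seen" for the next greedy step.
        seen = np.vstack([seen, item_vecs[best]])
    return selected

# Toy data: items 0 and 1 resemble the user's history, item 2 does not.
rel = np.array([0.9, 0.8, 0.1])
items = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
profile = np.array([[1.0, 0.0]])

print(greedy_rerank(rel, items, profile, weight=0.0, k=2))  # [0, 1]
print(greedy_rerank(rel, items, profile, weight=0.9, k=2))  # [2, 0]
```

With `weight=0.0` the base ranking is preserved. With a large weight, the low-relevance but dissimilar item moves to the top, which mirrors the reported loss of relevance under aggressive weighting.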