This research evaluates the performance of several models on Named Entity Recognition (NER) of medical entities, with a focus on imbalanced datasets. Six BioBERT model configurations were tested, incorporating optimization techniques such as class weighting, Conditional Random Fields (CRF), and hyperparameter tuning. Evaluation used Precision, Recall, and F1-Score, metrics that are especially informative for NER under class imbalance. The dataset used is BC5CDR, which annotates chemical and disease entities in unstructured medical texts from PubMed. The data was divided into three parts: a training set for model training, a validation set for model tuning, and a test set for performance evaluation; the split was balanced to ensure unbiased testing, yielding results that can serve as a reference for developing more efficient medical NER systems. The evaluation results indicate that BioBERT + CRF achieves an F1-Score reflecting the best balance between Precision (ranked 3rd: 0.6067 for B-Chemical, 0.5594 for B-Disease, 0.4600 for I-Disease, and 0.5083 for I-Chemical) and Recall (ranked 3rd: 0.5580 for B-Chemical, 0.4491 for B-Disease, 0.5718 for I-Disease, and 0.3840 for I-Chemical) among the models compared. This model detected medical entities more reliably without sacrificing prediction precision, and its smaller gap between Precision and Recall indicates greater stability, making it the strongest choice for NER on medical texts. The application of early stopping effectively prevented overfitting, allowing the model to learn optimally without losing generalization. With a better balance in recognizing medical entities from unstructured text, this model offers the most effective approach for NER systems in the medical domain.
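To make the reported metrics concrete, the following is a minimal sketch of how per-label Precision, Recall, and F1 can be computed for BIO tags such as B-Chemical and I-Disease. This is an illustrative token-level scorer written for this summary, not the authors' evaluation code; published NER results typically use an entity-level scorer, and the function name `per_label_prf` is our own. Scoring only the entity labels (excluding the dominant "O" tag) is what makes these metrics sensitive to class imbalance.

```python
from collections import Counter

def per_label_prf(gold, pred, labels):
    """Token-level Precision/Recall/F1 for each BIO entity label.

    gold, pred: flat, aligned lists of BIO tags (e.g. "B-Chemical", "O").
    labels: entity labels to score; the majority "O" tag is excluded,
    which is why these metrics expose performance on rare classes.
    """
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if p == g and p in labels:
            tp[p] += 1          # correct entity-tag prediction
        else:
            if p in labels:
                fp[p] += 1      # predicted this label, gold disagrees
            if g in labels:
                fn[g] += 1      # missed a gold entity tag
    scores = {}
    for lab in labels:
        prec = tp[lab] / (tp[lab] + fp[lab]) if (tp[lab] + fp[lab]) else 0.0
        rec = tp[lab] / (tp[lab] + fn[lab]) if (tp[lab] + fn[lab]) else 0.0
        f1 = 2 * prec * rec / (prec + rec) if (prec + rec) else 0.0
        scores[lab] = {"precision": prec, "recall": rec, "f1": f1}
    return scores
```

For example, a prediction that spuriously tags an extra B-Disease token lowers B-Disease Precision while leaving its Recall intact; the per-label breakdown above is what allows Precision/Recall gaps like those reported for BioBERT + CRF to be compared class by class.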