Human DNA sequence classification is a fundamental task in genomics, essential for understanding genetic variation and its implications for disease susceptibility, personalized medicine, and evolutionary biology. This study proposes a novel hybrid model that combines a Convolutional Neural Network (CNN) for feature extraction with a Random Forest classifier for final classification. The model was evaluated on a dataset of human DNA sequences and achieved an accuracy of 75.34%. Precision, recall, and F1-scores across multiple classes showed significant improvements over traditional models. The CNN component captures local dependencies and patterns within the sequences, while the Random Forest classifier handles complex decision boundaries, resulting in higher classification accuracy. In a comparative analysis, the hybrid approach outperformed the RNN-based alternatives: the CNN-LSTM model achieved only 59.47% accuracy, and the CNN-GRU and CNN-BiLSTM models performed at a similarly low level. These results suggest that hybrid models can leverage the strengths of both deep learning and traditional machine learning techniques, offering a more effective tool for DNA sequence classification. Future work will optimize the model architecture and explore larger, more diverse datasets to validate the generalizability and robustness of the approach.
Copyrights © 2024