Indonesian social media platforms, particularly X (formerly Twitter), generate short, highly informal texts that contain linguistic cues useful for demographic inference. Given the scarcity of controlled comparative studies on Indonesian gender prediction, especially with modest datasets, this research evaluates Multinomial Naïve Bayes, a Linear Support Vector Machine (SVM), and a Bidirectional Long Short-Term Memory network (BiLSTM) on a balanced corpus of 478 manually labeled Indonesian-language tweets. These three models were selected to represent classical probabilistic learning, margin-based linear classification, and neural sequence modeling, enabling a methodologically coherent comparison across distinct algorithmic paradigms. The study implemented a unified workflow consisting of manual labeling, structured preprocessing with Sastrawi stemming, RandomOverSampler for class balancing, TF-IDF features for the classical models, and sequence-based tokenization for the BiLSTM. All models were trained and evaluated on a stratified 80:20 split. Experimental results show that the Linear SVM achieved the strongest performance, reaching 0.833 accuracy and 0.832 macro-F1, surpassing Naïve Bayes (0.771 accuracy) and BiLSTM (0.740 accuracy). The SVM also produced the most stable confusion-matrix distribution and superior AUC characteristics, while the BiLSTM exhibited fluctuating validation curves, indicating sensitivity to the limited dataset size. These findings reinforce that classical models, particularly the Linear SVM, remain highly competitive for Indonesian short-text gender classification in low-resource settings and offer practical advantages where computational constraints and data scarcity are prominent. Because the dataset is topically narrow and limited in scale, future work should pursue larger corpora and transformer-based Indonesian models to further improve generalizability and downstream demographic inference.
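The classical pipeline summarized above (TF-IDF features, a Linear SVM, and a stratified 80:20 split) can be sketched roughly as follows. This is not the authors' code: the four-sentence toy corpus and its labels are hypothetical stand-ins for the 478 labeled tweets, Sastrawi stemming is assumed to have been applied upstream, and the RandomOverSampler step is omitted because the toy data is already balanced.

```python
# Rough sketch of the classical branch of the workflow: TF-IDF features
# and a Linear SVM, evaluated on a stratified 80:20 split.
# Toy corpus below is a hypothetical stand-in for the real dataset.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.svm import LinearSVC

# Hypothetical, already-stemmed tweet texts with gender labels.
texts = ["aku suka belanja online", "main bola sama teman",
         "nonton drama korea seru", "servis motor di bengkel"] * 10
labels = ["f", "m", "f", "m"] * 10

# Stratified 80:20 train/test split, as in the study.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.2, stratify=labels, random_state=42)

vectorizer = TfidfVectorizer()   # unigram TF-IDF features
clf = LinearSVC()                # margin-based linear classifier
clf.fit(vectorizer.fit_transform(X_train), y_train)

pred = clf.predict(vectorizer.transform(X_test))
macro_f1 = f1_score(y_test, pred, average="macro")
print(f"macro-F1: {macro_f1:.3f}")
```

On real data, the reported 0.832 macro-F1 would be computed the same way; the toy corpus here is trivially separable, so its score is not meaningful beyond illustrating the evaluation step.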