This study presents a browser extension that detects harmful content on both conventional web pages and TikTok using a deep learning approach. The core model is a Bidirectional Long Short-Term Memory (BiLSTM) network for multi-label classification across six categories: Toxic, Severe Toxic, Obscene, Threat, Insult, and Identity Hate. The dataset combines 13,057 labeled samples from a public Kaggle dataset (2021) with 2,884 manually labeled tweets scraped from Twitter (X) between October and November 2024. Three feature extraction methods were compared: learned embeddings, FastText, and Word2Vec. The BiLSTM architecture comprises an embedding layer, a 32-unit bidirectional LSTM, three dense layers (128, 256, and 128 units) with ReLU activation, and a six-unit sigmoid output layer. The model was trained with the Adam optimizer and binary cross-entropy loss, using early stopping with a patience of five validation checks over a maximum of 200 epochs. Although the FastText-based model performed best overall, the deployed model uses the learned-embedding configuration (Scenario 1) because of its smaller size (1.6M parameters) and near-optimal performance (Recall: 0.9786; Hamming Loss: 0.0052). The extension also integrates Whisper ASR to detect harmful speech on video-based platforms such as TikTok and supports five customizable censorship filters. User evaluation via the Customer Satisfaction Score (CSAT) indicated strong acceptance: 95.45% of respondents rated the user experience as Excellent, 84.09% confirmed the relevance of detections, and 79.55% rated system performance as Good. These results highlight the extension's effectiveness in promoting safer digital interaction across text and audiovisual content.
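The described architecture can be sketched in Keras as follows. This is a minimal illustration of the stated layer configuration (embedding, 32-unit BiLSTM, 128/256/128 ReLU dense layers, six-unit sigmoid output, Adam optimizer, binary cross-entropy, early stopping with patience 5 over at most 200 epochs); the vocabulary size, embedding dimension, and padded sequence length are assumptions, as the abstract does not specify them.

```python
# Minimal Keras sketch of the described BiLSTM classifier.
# VOCAB_SIZE, EMBED_DIM, and MAX_LEN are assumed values, not from the paper.
import tensorflow as tf
from tensorflow.keras import layers, models, callbacks

VOCAB_SIZE = 20000  # assumed vocabulary size
EMBED_DIM = 100     # assumed embedding dimension
MAX_LEN = 200       # assumed padded sequence length

model = models.Sequential([
    layers.Input(shape=(MAX_LEN,)),
    layers.Embedding(VOCAB_SIZE, EMBED_DIM),        # learned embeddings (Scenario 1)
    layers.Bidirectional(layers.LSTM(32)),          # 32-unit bidirectional LSTM
    layers.Dense(128, activation="relu"),
    layers.Dense(256, activation="relu"),
    layers.Dense(128, activation="relu"),
    layers.Dense(6, activation="sigmoid"),          # one sigmoid per label (multi-label output)
])

model.compile(
    optimizer="adam",
    loss="binary_crossentropy",
    metrics=[tf.keras.metrics.Recall()],
)

# Stop training when validation loss fails to improve for five checks.
early_stop = callbacks.EarlyStopping(
    monitor="val_loss", patience=5, restore_best_weights=True
)

# Training call (X_train, y_train, X_val, y_val are the prepared tokenized splits):
# model.fit(X_train, y_train, validation_data=(X_val, y_val),
#           epochs=200, callbacks=[early_stop])
```

With these assumed hyperparameters, the bulk of the parameter count sits in the embedding table, which is consistent with the abstract's reported model size of roughly 1.6M parameters for the learned-embedding variant.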