Augmented reality (AR) enhances user experiences by overlaying digital information on real-world objects or places, enabling highly immersive experiences in marketing, industry, education, entertainment, fashion, and healthcare. While current AR methods can identify 3D items in their environment, the recognition of tiny, complex objects remains a problem for most object detection methods. Object detection is also a key task in computer vision and AR systems; it aims to classify and localize objects in applications such as face detection, text detection, and people counting. Many natural-feature detection models have been proposed, such as YOLO, YOLO-LITE, and YOLOv4-tiny. However, detecting objects in natural images remains challenging, often compromising accuracy or requiring longer processing times. To overcome these challenges, this article proposes a novel method that combines the strengths of YOLO-LITE and YOLOv4-tiny into a hybrid model. The proposed model is named LITE-YOLOv4, which stands for “LITE-You Only Look Once Version 4”. The design uses YOLO-LITE as its backbone. LITE-YOLOv4 uses a feature pyramid network to extract feature maps of various sizes and utilizes a "shallow and narrow" convolution layer to optimize its object detection capability. The proposed model aims to balance speed and accuracy, making it suitable for AR applications on portable devices and on PCs without GPUs. LITE-YOLOv4 achieved a mean average precision (mAP) of 52.6% on the PASCAL VOC dataset and 33.3% on the COCO dataset, with a respectable speed of 20 frames per second (FPS). LITE-YOLOv4 provides better accuracy than state-of-the-art non-GPU models while maintaining reasonable computational time.
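To make the architectural idea concrete, the following is a minimal PyTorch sketch of the design described above: a YOLO-LITE-style "shallow and narrow" convolutional backbone feeding an FPN-style top-down neck that produces multi-scale feature maps for per-scale detection heads. The layer counts, channel widths, class/anchor numbers, and module names here are illustrative assumptions, not the authors' exact configuration.

```python
# Illustrative sketch of the LITE-YOLOv4 idea (assumed widths and depths).
import torch
import torch.nn as nn
import torch.nn.functional as F


def conv_bn_leaky(in_ch, out_ch, kernel=3, stride=1):
    """A lightweight conv block: few, narrow filters to keep CPU cost low."""
    return nn.Sequential(
        nn.Conv2d(in_ch, out_ch, kernel, stride, padding=kernel // 2, bias=False),
        nn.BatchNorm2d(out_ch),
        nn.LeakyReLU(0.1, inplace=True),
    )


class LiteBackbone(nn.Module):
    """Shallow, narrow backbone in the spirit of YOLO-LITE (assumed channel widths)."""

    def __init__(self):
        super().__init__()
        self.stage1 = nn.Sequential(conv_bn_leaky(3, 16), nn.MaxPool2d(2),
                                    conv_bn_leaky(16, 32), nn.MaxPool2d(2),
                                    conv_bn_leaky(32, 64), nn.MaxPool2d(2))    # stride 8
        self.stage2 = nn.Sequential(conv_bn_leaky(64, 128), nn.MaxPool2d(2))   # stride 16
        self.stage3 = nn.Sequential(conv_bn_leaky(128, 256), nn.MaxPool2d(2))  # stride 32

    def forward(self, x):
        c3 = self.stage1(x)
        c4 = self.stage2(c3)
        c5 = self.stage3(c4)
        return c3, c4, c5


class LiteYOLOv4Sketch(nn.Module):
    """FPN-style neck plus per-scale heads predicting (x, y, w, h, objectness, classes)."""

    def __init__(self, num_classes=20, num_anchors=3):
        super().__init__()
        out_ch = num_anchors * (5 + num_classes)
        self.backbone = LiteBackbone()
        # 1x1 lateral convs bring each backbone stage to a common width before fusion.
        self.lat5 = conv_bn_leaky(256, 128, kernel=1)
        self.lat4 = conv_bn_leaky(128, 128, kernel=1)
        self.lat3 = conv_bn_leaky(64, 128, kernel=1)
        self.head5 = nn.Conv2d(128, out_ch, 1)
        self.head4 = nn.Conv2d(128, out_ch, 1)
        self.head3 = nn.Conv2d(128, out_ch, 1)

    def forward(self, x):
        c3, c4, c5 = self.backbone(x)
        p5 = self.lat5(c5)
        # Top-down pathway: upsample and fuse with the next-finer backbone stage.
        p4 = self.lat4(c4) + F.interpolate(p5, scale_factor=2, mode="nearest")
        p3 = self.lat3(c3) + F.interpolate(p4, scale_factor=2, mode="nearest")
        # Three prediction maps at strides 32, 16, and 8 for large/medium/small objects.
        return self.head5(p5), self.head4(p4), self.head3(p3)


if __name__ == "__main__":
    model = LiteYOLOv4Sketch(num_classes=20)    # e.g., PASCAL VOC has 20 classes
    preds = model(torch.randn(1, 3, 416, 416))  # a common YOLO input resolution
    print([p.shape for p in preds])             # one prediction map per scale
```

In this sketch, keeping the backbone shallow and its channel counts narrow is what makes CPU-only inference plausible, while the multi-scale FPN outputs are what give the model a chance at the small objects the abstract highlights as a weakness of existing detectors.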