Recognizing offline handwritten Arabic words remains challenging because writing styles vary widely and words and characters often overlap. Building an effective recognition system has therefore been difficult, and published research in this field remains scarce. This study introduces two new models for recognizing handwritten Arabic words, both based on the Faster region-based convolutional neural network (Faster R-CNN). The models use two pre-trained networks for feature extraction: the Visual Geometry Group 16-layer network (VGG16) and the 50-layer residual network (ResNet50). To handle overlapping detections and improve localization accuracy, a soft non-maximum suppression (Soft-NMS) strategy is applied in post-processing. The models are trained and tested independently on two groups of data from the Institut für Nachrichtentechnik/Ecole Nationale d’Ingénieurs de Tunis (IFN/ENIT) dataset. The first group contains one word per image, while the second contains multiple words per image. In testing, the proposed models compared favorably with other approaches. On the first group, VGG16 and ResNet50 reached accuracy rates of 100% and 99.5%, respectively; on the second group, they reached 91.4% and 100%, respectively.
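
The Soft-NMS post-processing step mentioned above can be illustrated with a minimal sketch. The abstract does not say which decay function or parameter values the models use, so the Gaussian decay, the `sigma` and `score_threshold` defaults, and the function names `iou` and `soft_nms` are all assumptions made for illustration, not the authors' implementation. Instead of discarding every box that overlaps a higher-scoring detection, the sketch lowers each overlapping box's score in proportion to the overlap, which is what lets closely spaced or overlapping words survive suppression.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_threshold=0.001):
    """Gaussian Soft-NMS (assumed variant; parameter values are illustrative).

    Returns a list of (box_index, rescored_confidence) in selection order.
    """
    remaining = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the candidate with the highest current score.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best_pos)
        kept.append((best_idx, best_score))

        # Decay the scores of the other candidates by their overlap with it.
        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            # Drop candidates whose score has decayed below the threshold.
            if score >= score_threshold:
                decayed.append((idx, score))
        remaining = decayed
    return kept


# Hypothetical word boxes: the first two overlap heavily, the third is separate.
example_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
example_scores = [0.9, 0.8, 0.7]
example_result = soft_nms(example_boxes, example_scores)
```

In this example the second box is kept with a reduced score rather than removed outright, whereas hard NMS with a typical overlap threshold would discard it.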