This study aims to develop an evaluation instrument for Qawā’id Balāgah material, based on the Four Tier Diagnostic Test and delivered through an information and communication technology (ICT) platform, to diagnose students' understanding in depth. The study uses a research and development (R&D) method following the ten-step Borg & Gall model. Data were collected through tests and questionnaires. The test consists of 25 multiple-choice items in the Four Tier Diagnostic Test format, for a total of 100 questions across four tiers: the first tier measures cognitive ability, the second records the student's confidence in the answer, the third explores the reason behind the answer, and the fourth records confidence in that reasoning. Validation by material experts rated the instrument valid at 79.5%, although the suitability of the questions to student competencies reached only 70%; the depth of the material and the ability of the questions to measure misconceptions each scored 80%. Validation by media experts rated the tool valid with an average score of 82.5%. The item validity test showed that all items were valid, with significance values below 0.05 and r-count greater than r-table (0.361). A Cronbach's Alpha of 0.884 indicates that the instrument is highly reliable. Item analysis shows that the questions fall in the moderate difficulty category (0.3-0.73) and have good discriminatory power (0.33-0.73). The instrument is therefore declared feasible as an evaluation tool in Balāgah learning and effective in diagnosing students' misconceptions and level of understanding.
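The item statistics reported above (Cronbach's Alpha, difficulty level, discriminatory power) can be illustrated with a minimal sketch. It assumes dichotomous scoring (1 = correct, 0 = incorrect) and uses the common upper/lower 27% grouping for discrimination; the function names, the grouping fraction, and the sample score matrix are hypothetical and are not taken from the study's data or procedure.

```python
from statistics import variance  # sample variance (n - 1 denominator)


def cronbach_alpha(scores):
    """Cronbach's Alpha for a matrix of scores (rows = students, columns = items)."""
    k = len(scores[0])
    item_vars = [variance([row[i] for row in scores]) for i in range(k)]
    total_var = variance([sum(row) for row in scores])
    return k / (k - 1) * (1 - sum(item_vars) / total_var)


def difficulty_index(scores, item):
    """Proportion of students who answered the given item correctly."""
    return sum(row[item] for row in scores) / len(scores)


def discrimination_index(scores, item, fraction=0.27):
    """Difference in proportion correct between the top and bottom scoring groups.

    The 27% grouping is an assumed convention, not confirmed by the source.
    """
    ranked = sorted(scores, key=sum, reverse=True)
    n = max(1, round(len(ranked) * fraction))
    upper, lower = ranked[:n], ranked[-n:]
    return (sum(r[item] for r in upper) - sum(r[item] for r in lower)) / n


# Hypothetical data: 6 students, 4 items (illustration only).
scores = [
    [1, 1, 1, 1],
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
]

alpha = cronbach_alpha(scores)
p_item3 = difficulty_index(scores, 2)
d_item1 = discrimination_index(scores, 0)
print(f"alpha={alpha:.3f}, difficulty={p_item3:.2f}, discrimination={d_item1:.2f}")
```

Under these assumptions, a difficulty index between roughly 0.3 and 0.7 is usually read as moderate, and a higher discrimination index means the item separates stronger from weaker students more clearly.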