

Large language models-based metric for generative question answering systems Abdel Azim, Hazem; Tharwat Waheed, Mohamed; Mohammed, Ammar
IAES International Journal of Artificial Intelligence (IJ-AI) Vol 14, No 1: February 2025
Publisher : Institute of Advanced Engineering and Science

DOI: 10.11591/ijai.v14.i1.pp151-158

Abstract

In the evolving landscape of text generation, which has advanced rapidly in recent years, techniques for evaluating the performance and quality of generated text have lagged behind. Traditionally, lexical metrics such as bilingual evaluation understudy (BLEU), recall-oriented understudy for gisting evaluation (ROUGE), metric for evaluation of translation with explicit ordering (METEOR), consensus-based image description evaluation (CIDEr), and F1 have been used, relying primarily on n-gram similarity. In recent years, neural and machine-learning-based metrics such as the bidirectional encoder representations from transformers (BERT) score, key phrase question answering (KPQA), and BERT supervised training of learned evaluation metric for reading comprehension (LERC) have outperformed traditional metrics, but they generalize poorly across domains and require massive human-labeled training data. The main contribution of the current research is to investigate the use of training-free large language models (LLMs) as scoring metrics, evaluators, and judges within a question-answering context, encompassing both closed and open-QA scenarios. To validate this idea, we employ simple zero-shot prompting of Mixtral 8x7B, a popular and widely used open-source LLM, to score a variety of datasets and domains. The experimental results on ten different benchmark datasets are compared against human judgments, revealing that, on average, simple LLM-based metrics outperformed sophisticated state-of-the-art statistical and neural machine-learning-based metrics by 2-8 points on answer-pair scoring tasks and by up to 15 points on contrastive preferential tasks.
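As a rough illustration of the zero-shot scoring idea described in the abstract above, the following minimal Python sketch builds a scoring prompt for a question, a reference answer, and a candidate answer, parses an integer score from the model's reply, and measures agreement with human ratings. Everything in it is an assumption for illustration: the prompt wording, the 1-5 scale, the function names (`build_prompt`, `parse_score`, `pearson`), and the use of Pearson correlation as the agreement measure are hypothetical and are not taken from the paper. The call to the LLM itself is deliberately left out.

```python
import math
import re

# Hypothetical prompt wording; the paper's actual prompt is not given in the abstract.
PROMPT_TEMPLATE = (
    "You are grading an answer to a question.\n"
    "Question: {question}\n"
    "Reference answer: {reference}\n"
    "Candidate answer: {candidate}\n"
    "Rate the candidate from 1 (wrong) to 5 (fully correct). "
    "Reply with the number only."
)


def build_prompt(question, reference, candidate):
    """Fill the zero-shot template; no examples or fine-tuning are involved."""
    return PROMPT_TEMPLATE.format(
        question=question, reference=reference, candidate=candidate
    )


def parse_score(reply, low=1, high=5):
    """Return the first integer in the reply if it lies in [low, high], else None."""
    match = re.search(r"\d+", reply)
    if match is None:
        return None
    score = int(match.group())
    return score if low <= score <= high else None


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    prompt = build_prompt("Who wrote Hamlet?", "William Shakespeare", "Shakespeare")
    print(prompt)
    # Replies would come from the LLM; these strings are made-up placeholders.
    llm_scores = [parse_score(r) for r in ["5", "Score: 2", "4"]]
    human_scores = [5, 1, 4]  # made-up human ratings for illustration
    print(pearson(llm_scores, human_scores))
```

In practice the prompt would be sent to the chosen model and the parsed scores compared against the human judgments that accompany each benchmark dataset; replies that cannot be parsed (where `parse_score` returns `None`) would need to be handled before computing agreement.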
Cybercrimes Between Criminalization and Punishment Aziz, Turath; Mohammed, Ammar
Indonesian Journal of Law and Justice Vol. 3 No. 2 (2025): December
Publisher : Indonesian Journal Publisher

DOI: 10.47134/ijlj.v3i2.5362

Abstract

This research examines the Iraqi experience with cybercrimes, which reveals an obvious discrepancy between current criminal laws and the widespread breaches that are common today. In particular, it examines the continued reliance on the Code of Criminal Procedure and the 1980 Wireless Communications Law, even though these laws were drafted mainly for physical crimes. As a result, the legislation does not address the wide range of crimes committed over the internet, such as hacking into computer systems, manipulation of electronic data, and international fraud. The research also focuses on the problems the prosecution faces in searching for electronic documentation, an area that, as discussed, remains uncharted territory with many gaps. In short, the criminal justice system as a whole delivers only ineffectual penalties despite the gravity of the crimes. This research is a starting point for exploring the gaps in the existing Iraqi legal framework. It assesses the applicability of the laws and regulations and attempts to pinpoint where they are failing and why their enforcement is inconsistent. The authors argue that criminalization needs to shift away from the current imbalance in the laws toward explicit definitions of digital behaviors, accompanied by a corresponding redefinition of the evidentiary rules. The conclusion highlights that Iraqi legislation and regulations lack the efficacy required to deal with the full range of threats that arise in cyberspace.