Perezhohin, Yuriy
Unknown Affiliation

Published : 2 Documents
Articles

Journal : Emerging Science Journal

Zero-Shot Prompting Strategies for Table Question Answering with a Low-Resource Language
Jannuzzi, Marcelo; Perezhohin, Yuriy; Peres, Fernando; Castelli, Mauro; Popovič, Aleš
Emerging Science Journal Vol 8, No 5 (2024): October
Publisher : Ital Publication

DOI: 10.28991/ESJ-2024-08-05-020

Abstract

This work explores the application of zero-shot prompting strategies for table question answering (TQA) in Portuguese, focusing specifically on the Text2SQL task. This task involves translating questions posed in natural language into Structured Query Language (SQL) queries, which can be executed against a database to answer the original question. Given the popularity of relational databases across various domains, advancements in this field can substantially impact the accessibility and democratization of data as simpler and more intuitive interfaces for database interaction are developed. Despite this significant potential, progress in developing Portuguese TQA solutions remains limited. The proposed approach leverages Large Language Models (LLMs), specifically the GPT-3.5 and GPT-4 models, through zero-shot prompting. The primary objectives are to assess the effectiveness of such LLMs in this task and to identify the most suitable prompt styles. These are evaluated using a Portuguese translation of the popular Spider Text2SQL benchmark. Results reveal that the proposed approach can generate adequate SQL queries to answer Portuguese language questions about various databases, mainly when using GPT-4. The findings suggest that including schema information and database content in the prompts is critical for satisfactory outcomes.
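
The abstract's point about including schema information and database content in the prompt can be illustrated with a minimal sketch. The prompt layout, the function name `build_zero_shot_prompt`, and the toy `cantor` table are all assumptions made for illustration; the abstract does not specify the prompt styles the authors tested, and no LLM call is made here.

```python
import sqlite3


def build_zero_shot_prompt(conn, question, rows_per_table=3):
    """Assemble a zero-shot Text2SQL prompt from schema and sample rows.

    Hypothetical layout for illustration; not the prompt used in the paper.
    """
    cur = conn.cursor()
    # sqlite_master stores the original CREATE TABLE text for each table.
    tables = cur.execute(
        "SELECT name, sql FROM sqlite_master "
        "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
    ).fetchall()

    parts = ["### SQLite schema and sample rows"]
    for name, create_sql in tables:
        parts.append(create_sql + ";")
        # Database content: a few example rows per table, as SQL comments.
        rows = cur.execute(
            f'SELECT * FROM "{name}" LIMIT ?', (rows_per_table,)
        ).fetchall()
        for row in rows:
            parts.append(f"-- {row}")

    parts.append("### Question (Portuguese)")
    parts.append(question)
    parts.append("### SQL")
    parts.append("SELECT")  # cue the model to continue with a query
    return "\n".join(parts)


def demo_connection():
    """Create a toy in-memory database (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE cantor (id INTEGER PRIMARY KEY, nome TEXT, pais TEXT)"
    )
    conn.executemany(
        "INSERT INTO cantor (nome, pais) VALUES (?, ?)",
        [("Ana", "Portugal"), ("Rui", "Brasil")],
    )
    return conn


prompt = build_zero_shot_prompt(demo_connection(), "Quantos cantores existem?")
print(prompt)
```

The resulting string would be sent to the model as-is, with no worked examples, which is what makes the setup zero-shot.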

Retrieval-Augmented Generation Assistant for Anatomical Pathology Laboratories
Pires, Diogo; Perezhohin, Yuriy; Castelli, Mauro
Emerging Science Journal Vol. 9 No. 6 (2025): December
Publisher : Ital Publication

DOI: 10.28991/ESJ-2025-09-06-013

Abstract

Accurate and efficient access to laboratory protocols is essential in Anatomical Pathology (AP), where up to 70% of medical decisions depend on laboratory diagnoses. However, static documentation such as printed manuals or PDFs is often outdated, fragmented, and difficult to search, creating risks of workflow errors and diagnostic delays. This study proposes and evaluates a Retrieval-Augmented Generation (RAG) assistant tailored to AP laboratories, designed to provide technicians with context-grounded answers to protocol-related queries. We curated a novel corpus of 99 AP protocols from a Portuguese healthcare institution and constructed 323 question-answer pairs for systematic evaluation. Ten experiments were conducted, varying chunking strategies, retrieval methods, and embedding models. Performance was assessed using the RAGAS framework (faithfulness, answer relevance, context recall) alongside top-k retrieval metrics. Results show that recursive chunking and hybrid retrieval delivered the strongest baseline performance. Incorporating a biomedical-specific embedding model (MedEmbed) further improved answer relevance (0.74), faithfulness (0.70), and context recall (0.77), showing the importance of domain-specialized embeddings. Top-k analysis revealed that retrieving a single top-ranked chunk (k=1) maximized efficiency and accuracy, reflecting the modular structure of AP protocols. These findings highlight critical design considerations for deploying RAG systems in healthcare and demonstrate their potential to transform static documentation into dynamic, reliable knowledge assistants, thus improving laboratory workflow efficiency and supporting patient safety.
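
Two of the ideas named in the abstract, recursive chunking and top-k retrieval evaluation, can be sketched as follows. This is a rough illustration under stated assumptions: the protocol text and questions are made up, the keyword-overlap retriever is a stand-in for the hybrid and embedding-based retrieval used in the study, and hit rate at k is only one common top-k metric (the abstract does not say which ones were reported).

```python
import re


def recursive_chunks(text, max_chars=80, separators=("\n\n", "\n", ". ", " ")):
    """Split on the coarsest separator first; recurse on oversized pieces.

    Note: a separator is dropped where adjacent pieces are not merged.
    """
    text = text.strip()
    if len(text) <= max_chars:
        return [text] if text else []
    if not separators:
        # No separators left: fall back to a hard character split.
        return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

    sep, rest = separators[0], separators[1:]
    chunks, buffer = [], ""
    for piece in (p for p in text.split(sep) if p.strip()):
        candidate = piece if not buffer else buffer + sep + piece
        if len(candidate) <= max_chars:
            buffer = candidate  # keep packing small pieces together
            continue
        if buffer:
            chunks.append(buffer)
            buffer = ""
        if len(piece) <= max_chars:
            buffer = piece
        else:
            chunks.extend(recursive_chunks(piece, max_chars, rest))
    if buffer:
        chunks.append(buffer)
    return chunks


def tokens(text):
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def top_k(query, chunks, k=1):
    """Toy retriever: rank chunk indices by shared-word count (ties by index)."""
    q = tokens(query)
    ranked = sorted(range(len(chunks)),
                    key=lambda i: (-len(q & tokens(chunks[i])), i))
    return ranked[:k]


def hit_rate_at_k(qa_pairs, chunks, k):
    """Fraction of questions whose gold chunk index appears in the top k."""
    hits = sum(1 for question, gold in qa_pairs
               if gold in top_k(question, chunks, k))
    return hits / len(qa_pairs)


# Made-up protocol text and question/gold-chunk pairs, for illustration only.
PROTOCOL = (
    "Specimen reception: check the request form and label every container.\n\n"
    "Fixation: place the specimen in the fixative and record the start time.\n\n"
    "Staining: load the slides into the stainer and select the routine program."
)
QA_PAIRS = [
    ("How do I label a container at reception?", 0),
    ("When should the fixation start time be recorded?", 1),
    ("Which program do I select on the stainer?", 2),
]

chunks = recursive_chunks(PROTOCOL, max_chars=80)
print(len(chunks), hit_rate_at_k(QA_PAIRS, chunks, k=1))
```

In this toy setup each paragraph fits in one chunk, which loosely mirrors the abstract's observation that modular protocols can often be answered from a single retrieved chunk.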