
Write a PREreview

A Comparative Analysis of Multilingual and Monolingual Models for Nepali Legal Document Retrieval

Posted
Server: Preprints.org
DOI: 10.20944/preprints202606.0033.v1

While information retrieval has been studied extensively for high-resource languages, the Nepali language, and the Nepali legal domain in particular, remains underexplored. This study addresses this gap by empirically comparing the performance of multilingual and monolingual open-source language models on a Nepali legal document retrieval task. We constructed a domain-specific dataset of 10 Nepali legal documents and created 50 curated legal queries, five derived from each document. We evaluated seven multilingual models, selected for their strong performance on the Massive Text Embedding Benchmark (MTEB), alongside three Nepali-specific monolingual models trained exclusively on Nepali. The models were evaluated across varying chunk sizes using standard information retrieval metrics: Recall, Precision, and Mean Reciprocal Rank (MRR). Experimental results show that the multilingual model BAAI/bge-m3 consistently outperforms the other evaluated models across all settings, achieving 0.92 Recall@6, 0.74 Precision@1, and 0.83 MRR@4. While multilingual models show strong retrieval effectiveness, the findings indicate that existing Nepali monolingual models remain less competitive and require substantial improvement for domain-specific legal retrieval tasks.
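
For readers less familiar with the metrics named in the abstract, the following is a minimal sketch of how Recall@k, Precision@k, and MRR@k are commonly computed. The abstract does not give the preprint's exact definitions, so the formulas below are the standard textbook ones and may differ in detail from the authors' implementation. The function names, chunk identifiers, and toy data are all hypothetical and chosen only for illustration; they are not results from the study.

```python
from typing import Callable, List, Sequence, Set, Tuple

# One "run" per query: (ranked chunk ids returned by a model, set of relevant chunk ids).
# Assumption: ids within a ranked list are unique.
Run = Tuple[Sequence[str], Set[str]]


def recall_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant chunks that appear in the top k results."""
    if not relevant:
        return 0.0
    hits = sum(1 for chunk_id in ranked[:k] if chunk_id in relevant)
    return hits / len(relevant)


def precision_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the top k results that are relevant."""
    hits = sum(1 for chunk_id in ranked[:k] if chunk_id in relevant)
    return hits / k


def reciprocal_rank_at_k(ranked: Sequence[str], relevant: Set[str], k: int) -> float:
    """1 / rank of the first relevant chunk within the top k, or 0 if none is found."""
    for rank, chunk_id in enumerate(ranked[:k], start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


def mean_over_queries(
    metric: Callable[[Sequence[str], Set[str], int], float],
    runs: List[Run],
    k: int,
) -> float:
    """Average a per-query metric over all queries."""
    return sum(metric(ranked, relevant, k) for ranked, relevant in runs) / len(runs)


# Toy data (hypothetical): three queries, each with one relevant chunk.
runs: List[Run] = [
    (["c3", "c1", "c7"], {"c1"}),  # relevant chunk at rank 2
    (["c2", "c5", "c9"], {"c2"}),  # relevant chunk at rank 1
    (["c4", "c8", "c6"], {"c0"}),  # relevant chunk not retrieved
]

recall_3 = mean_over_queries(recall_at_k, runs, 3)
precision_1 = mean_over_queries(precision_at_k, runs, 1)
mrr_3 = mean_over_queries(reciprocal_rank_at_k, runs, 3)
```

Under these definitions, a figure such as 0.74 Precision@1 would mean that the top-ranked chunk was relevant for roughly three out of four queries, while MRR@4 rewards models that place the first relevant chunk as high as possible within the top four results.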

You can write a PREreview of A Comparative Analysis of Multilingual and Monolingual Models for Nepali Legal Document Retrieval. A PREreview is a review of a preprint and can vary from a few sentences to a lengthy report, similar to a journal-organized peer-review report.
