Siamese networks for multilingual classified ad matching

Date

2025

Publisher

IEEE

Abstract

This paper presents a novel approach to semantically matching "Resource Wanted" and "Resource Offering" classified ads in Sri Lanka's multilingual digital marketplace. We introduce a Siamese neural network architecture designed to process both textual content and categorical metadata in English and Sinhala. The model uses multilingual transformer models to produce semantically rich embeddings; a LaBSE-based implementation performed best, reaching a Recall@1 of 0.5813 and a Recall@10 of 0.9151. Integrating categorical features with the text embeddings yielded the best results, a 1.5% improvement in Recall@1 over the text-only approach. The methodology addresses the challenge of matching ads across linguistic boundaries in a low-resource setting and could improve transaction efficiency in Sri Lanka's diverse digital marketplace.
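
As an illustration of the Recall@k metric reported above, the following is a minimal sketch and not the authors' implementation. It assumes that categorical metadata is one-hot encoded and concatenated to the text embedding, and that candidates are ranked by cosine similarity; the abstract does not state either detail. The helper names (`fuse`, `recall_at_k`) and the toy vectors are hypothetical, standing in for real LaBSE embeddings.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def fuse(text_vec, category, categories):
    """Append a one-hot category vector to a text embedding.

    Assumption: simple concatenation; the paper may fuse features differently.
    """
    one_hot = [1.0 if c == category else 0.0 for c in categories]
    return list(text_vec) + one_hot


def recall_at_k(queries, candidates, gold, k):
    """Fraction of queries whose gold candidate index is in the top-k ranking."""
    hits = 0
    for qi, query in enumerate(queries):
        ranked = sorted(
            range(len(candidates)),
            key=lambda ci: cosine(query, candidates[ci]),
            reverse=True,
        )
        if gold[qi] in ranked[:k]:
            hits += 1
    return hits / len(queries)


# Toy data: 2-d vectors stand in for transformer embeddings (hypothetical values).
CATEGORIES = ["vehicles", "property"]
wanted = [
    fuse([1.0, 0.0], "vehicles", CATEGORIES),
    fuse([0.0, 1.0], "property", CATEGORIES),
]
offers = [
    fuse([0.9, 0.1], "vehicles", CATEGORIES),
    fuse([0.1, 0.9], "property", CATEGORIES),
    fuse([0.8, 0.2], "property", CATEGORIES),
]
gold = [0, 1]  # index of the matching offer for each wanted ad

print(recall_at_k(wanted, offers, gold, k=1))
```

In this toy setup each "Resource Wanted" ad is a query and each "Resource Offering" ad is a candidate; Recall@1 counts a hit only when the matching offer is ranked first, while Recall@10 would count it anywhere in the top ten.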
