Sri Lanka Achieves AI Milestone with Release of First Open-Source Trilingual Large Language Model

Colombo, Sri Lanka — Sri Lanka’s artificial intelligence sector has reached a significant milestone with the official open-source release of Chat2Find-Instruct-v1, the country’s first publicly available trilingual large language model (LLM) designed specifically for Sinhala, Tamil, and English.

The release marks a major step forward in the development of sovereign AI capabilities in Sri Lanka, providing developers, researchers, businesses, and public institutions with access to an advanced language model built around the linguistic and cultural realities of the country.

Many global AI models struggle with localized terminology, regional entities, and cultural context. Chat2Find-Instruct-v1, by contrast, has been engineered to better understand and process the language patterns used daily by millions of Sri Lankans. The model aims to narrow gaps in AI accessibility by delivering high-quality language understanding and generation across Sinhala, Tamil, and English.

Built upon the Chat2Find pre-trained foundation model, Chat2Find-Instruct-v1 introduces advanced reasoning capabilities and agentic AI functionality, enabling it to perform complex tasks that extend beyond conventional text generation.

One of its key features is Deep Chain-of-Thought (CoT) Reasoning, allowing the model to perform multi-step logical analysis, mathematical problem solving, and structured reasoning processes. This capability enhances the model’s ability to tackle sophisticated user queries and workflows.

The model also includes native support for agentic tool use and function calling, enabling integration with external tools, web search systems, APIs, and automated workflows. This functionality allows developers to build AI-powered applications capable of performing actions and retrieving information in Sinhala, Tamil, and English environments.
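As a rough illustration of what function calling involves on the application side, the sketch below routes a model-emitted tool call to a local Python function. The JSON shape (`name`, `arguments`), the `lookup_province` tool, and the sample data are assumptions made for this example; the article does not describe the tool-call format Chat2Find-Instruct-v1 actually uses.

```python
import json

# Hypothetical sample data standing in for a real external API.
PROVINCE_BY_DISTRICT = {"Kandy": "Central", "Jaffna": "Northern", "Galle": "Southern"}


def lookup_province(district: str) -> str:
    """Hypothetical tool: return the province for a district name."""
    return PROVINCE_BY_DISTRICT.get(district, "unknown")


# Registry mapping tool names the model may request to Python callables.
TOOLS = {"lookup_province": lookup_province}


def dispatch_tool_call(raw_call: str) -> str:
    """Parse a JSON tool call and run the matching registered tool.

    Assumed call shape: {"name": "<tool>", "arguments": {...}}.
    """
    call = json.loads(raw_call)
    name = call["name"]
    if name not in TOOLS:
        raise ValueError(f"unknown tool: {name}")
    return TOOLS[name](**call.get("arguments", {}))


if __name__ == "__main__":
    # Stand-in for text a model might emit when it decides to call a tool.
    model_output = '{"name": "lookup_province", "arguments": {"district": "Jaffna"}}'
    print(dispatch_tool_call(model_output))
```

In a real application the tool's return value would typically be passed back to the model so it can compose a final answer in the user's language.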

According to the developers, Chat2Find-Instruct-v1 was trained using one of the largest trilingual datasets assembled in Sri Lanka. The training corpus consists of approximately 1.38 gigabytes of curated trilingual text containing over 255 million words, while the instruction-tuning process utilized more than 279,000 conversational instruction-response pairs to improve alignment, responsiveness, and multilingual performance.
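For a sense of scale, the reported corpus figures can be turned into a rough bytes-per-word ratio. The calculation below assumes decimal gigabytes (10^9 bytes) and treats the "approximately" and "over" figures as exact, so the result is only a ballpark estimate.

```python
# Figures as reported in the article; treated as exact for this estimate.
CORPUS_BYTES = 1.38e9        # assumes 1 GB = 10**9 bytes
CORPUS_WORDS = 255_000_000   # reported as "over 255 million words"


def bytes_per_word(total_bytes: float, total_words: int) -> float:
    """Average number of bytes of text per word in the corpus."""
    return total_bytes / total_words


if __name__ == "__main__":
    print(f"{bytes_per_word(CORPUS_BYTES, CORPUS_WORDS):.2f} bytes per word")
```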

The release follows a growing global trend toward open-source AI development, where countries and organizations are creating localized language models tailored to their own linguistic and cultural requirements.

To encourage innovation and adoption, the developers have released both Chat2Find-CPT, the base pre-trained model, and Chat2Find-Instruct-v1 as open-source, open-weight models. Developers can download, modify, fine-tune, and integrate the models into applications using standard machine learning tools and frameworks.

The release is expected to support a wide range of use cases, including education, legal technology, government services, customer support, content creation, research, and enterprise automation. It may also help accelerate the development of AI applications that serve Sri Lanka’s multilingual population more effectively than generic international models.

Industry observers view the launch as an important development for Sri Lanka’s emerging AI ecosystem, positioning the country among a growing number of nations investing in localized language technologies and digital sovereignty.

The model and associated datasets are available through the Hugging Face platform, while additional information and technical documentation can be accessed through the Chat2Find project website.

With the launch of Chat2Find-Instruct-v1, Sri Lanka takes a significant step toward building an inclusive and locally relevant artificial intelligence ecosystem, laying the foundation for the next generation of trilingual AI applications and services.
