
Comprehensive API for Kurdish Text Processing Project

The project aims to build a comprehensive Application Programming Interface (API) for processing text written in the Kurdish language (Arabic alphabet). Its objectives are to provide file reading and writing, text segmentation, statistical analysis of text, and information retrieval for texts written in Kurdish. The project also aims to develop applications such as a Kurdish spell checker and a Kurdish thesaurus, and to build domain ontologies for web services. If you are interested in participating in or sponsoring the project, please contact me at armyunis@scs.carleton.ca.
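To make the segmentation and statistical-analysis objectives concrete, here is a minimal Python sketch. It is not the project's API: the names `WORD_RE`, `segment` and `word_frequencies` are hypothetical, and the definition of a word (a run of letters from the Unicode Arabic block) is an assumption that ignores details such as the zero-width non-joiner.

```python
import re
from collections import Counter

# Assumption: a word is a maximal run of Arabic-block letters.
# The ranges skip Arabic punctuation (U+060C, U+061B, U+061F, U+06D4)
# and the Arabic-Indic digits (U+0660 to U+0669).
WORD_RE = re.compile(r"[\u0621-\u064A\u066E-\u06D3\u06D5]+")

def segment(text):
    """Split text into a list of word tokens."""
    return WORD_RE.findall(text)

def word_frequencies(text):
    """Count how often each token occurs in text."""
    return Counter(segment(text))

# Sample built from the phrase "عەمباری ووشەی کوردی" used on this page.
sample = "ووشەی کوردی، ووشەی کوردی. عەمباری ووشەی کوردی"
tokens = segment(sample)
freqs = word_frequencies(sample)

for word, count in freqs.most_common():
    print(word, count)
```

A frequency table of this kind is also the starting point for a most-used-words list; a real implementation would need a more careful tokenizer and normalization of variant letter forms.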


Titles and Detailed Descriptions of Individual Projects

  1. Kurdish Word Bank (project description, coming soon)
  2. Kurdish Spell Checker (project description)
  3. Kurdish Thesauri (project description, coming soon)
  4. Common Words among Kurdish Dialects (project description, coming soon)
  5. Most Used Kurdish Word (project description, coming soon)


AWKurdi (Ambary Wshay Kurdi, read as "A W Kurdi", meaning "The Kurdish Word Repository"). Email me (armyunis@scs.carleton.ca) to receive the first version of AWKurdi (عەمباری ووشەی کوردی), which contains more than 600,000 unique, sorted Kurdish words.
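As a rough illustration of what a list of unique, sorted words involves, the following Python sketch removes duplicates from a list of words, sorts it, and round-trips it through a UTF-8 file. The function names and the file name `words.txt` are hypothetical, and sorting by code point is an assumption: it is not the same as Kurdish alphabetical order, and the page does not say how AWKurdi is ordered.

```python
import os
import tempfile

def build_word_list(words):
    """Return the unique words in ascending code point order."""
    return sorted(set(words))

def save_word_list(words, path):
    """Write one word per line, encoded as UTF-8."""
    with open(path, "w", encoding="utf-8") as f:
        f.write("\n".join(words) + "\n")

def load_word_list(path):
    """Read a UTF-8 word list, skipping blank lines."""
    with open(path, encoding="utf-8") as f:
        return [line.strip() for line in f if line.strip()]

# Toy input with duplicates (illustrative data only).
raw = ["کوردی", "ووشەی", "کوردی", "عەمباری", "ووشەی"]
word_list = build_word_list(raw)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "words.txt")
    save_word_list(word_list, path)
    reloaded = load_word_list(path)
```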


Well-Known Stemming Algorithms


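Letter    Decimal value
ج    1580
ت    1578
پ    1662
ب    1576
ا    1575
ئ    1574
ڕ    1685
ر    1585
د    1583
خ    1582
ح    1581
چ    1670
غ    1594
ع    1593
ش    1588
س    1587
ژ    1688
ز    1586
ل    1604
گ    1711
ك    1603
ق    1602
ڤ    1700
ف    1601
و    1608
هـ    1607
ە    1749
ن    1606
م    1605
ڵ    1717
ێ    1742
ی    1740
ڵا    (none given)
لا    (none given)
وو    (none given)
ۆ    1734

Table of Kurdish letters and their corresponding decimal Unicode code point values (as used in UTF-8 encoded text) for each letter standing by itself. The digraphs ڵا, لا and وو are sequences of two letters and are listed without a value.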
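
The decimal values in the table can be checked with a few lines of Python. This is a minimal sketch: `TABLE_SAMPLE` and `code_points` are hypothetical names, and only a handful of rows are copied from the table. It also shows that the listed numbers are Unicode code points, while the bytes actually stored in a UTF-8 file are different.

```python
# Decimal values copied from a few rows of the table above.
TABLE_SAMPLE = {
    "پ": 1662,
    "ڕ": 1685,
    "گ": 1711,
    "ۆ": 1734,
}

def code_points(text):
    """Return the decimal Unicode code point of every character in text."""
    return [ord(ch) for ch in text]

# ord() maps a letter to its code point; chr() maps it back.
for letter, expected in TABLE_SAMPLE.items():
    assert ord(letter) == expected
    assert chr(expected) == letter

# UTF-8 stores each of these letters as two bytes, so the byte values
# are not the same as the code point.
utf8_bytes = list("ڕ".encode("utf-8"))

# The digraph "وو" is two characters, so it yields two code points.
digraph_points = code_points("وو")

print(utf8_bytes, digraph_points)
```

The digraphs in the table consist of two letters each, which would explain why no single value appears next to them.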