Open Language Profiles (OLP)

Open Language Profiles (OLP) is a platform for sharing open linguistic resources for language education: in short, “Universal Dependencies for Language Education.”

Open Language Profiles

Linguistic resources (such as CEFR-aligned vocabulary lists and annotated texts) play an important role in language education. However, the research initiatives that develop, share, and use such resources are scattered and work independently, without coordinating with each other. The resources they produce follow different standards and formats, and are often distributed only under restrictive licenses.

The goal of Open Language Profiles (OLP) is to promote open, multilingual, multidisciplinary research on language education by:

  • Defining a language-independent schema and format for distributing linguistic resources for education in order to promote interoperability
  • Sharing multilingual resources, including but not limited to:
    • Vocabulary and grammar profiles (word and grammatical concept lists annotated with CEFR levels)
    • Texts annotated with CEFR levels
    • Text annotation software tools
  • Distributing such linguistic resources under permissive licenses (e.g., Creative Commons)


Currently, OLP distributes the CEFR-j English Vocabulary and Grammar Profile datasets and a Mandarin Chinese dataset from Chinese Zero to Hero. See the Datasets page for more information.

Contact us

Open Language Profiles is a joint research initiative led by Octanove Labs and RIKEN AIP.

If you are interested in providing data, partnering with us, or sponsoring us, please contact us.

