Vocab Profiler

This tool profiles any text according to word frequency. To use the tool, copy and paste a text into the “Input Text” area, or type a text directly into the area. Then click “Start” to create the profile, in which each word is categorized according to its frequency level. An explanation of the results appears in a separate pane.

The “Level” and “Frequency band” columns reflect how often a word appears in the 10-million-word CorCenCC corpus. Words in the “K1” (Top 1000) band are the 1000 most commonly used words in Welsh, according to CorCenCC. Typically, the more words a text has in the lower frequency bands (e.g. 3001-4000, 4001-5000 and 5001+), the more challenging it will be for the learner. Note that words in the 5001+ band may include misspelled words, words from other languages and words not featured in the corpus, as well as vocabulary that is used infrequently in the corpus.

In the default setting, the tool highlights words in levels K1 to K6+. To highlight words that are not in these levels instead, click on the “Highlight non-level words” option.

Level    Frequency band
K1       Top 1000 (Most frequent)
K2       1001-2000
K3       2001-3000
K4       3001-4000
K5       4001-5000
K6+      5001+
Not in corpus
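
The banding in the table above can be illustrated with a short Python sketch. This is not the tool's actual implementation: the function names `band_for_rank` and `profile`, the `rank_of` dictionary (word to 1-based frequency rank) and the simple tokenizer are all hypothetical, and a real ranked word list would have to come from CorCenCC. The sketch also assumes that words missing from the list are reported as "Not in corpus" rather than folded into the 5001+ band.

```python
import re
from collections import Counter

# Band labels for ranks 1-5000, in blocks of 1000 (K1 = ranks 1-1000, etc.).
BAND_LABELS = ["K1", "K2", "K3", "K4", "K5"]


def band_for_rank(rank):
    """Map a 1-based frequency rank to a band label.

    A rank of None means the word was not found in the ranked list
    (assumption: such words are labelled "Not in corpus").
    """
    if rank is None:
        return "Not in corpus"
    if rank <= 5000:
        return BAND_LABELS[(rank - 1) // 1000]
    return "K6+"


def profile(text, rank_of):
    """Count how many tokens of `text` fall into each frequency band.

    `rank_of` is a hypothetical dict mapping lowercase words to their rank.
    The tokenizer keeps runs of letters, allowing internal apostrophes
    and hyphens (e.g. "i'r"); it is a simplification.
    """
    tokens = re.findall(r"[^\W\d_]+(?:['’-][^\W\d_]+)*", text.lower())
    return Counter(band_for_rank(rank_of.get(tok)) for tok in tokens)


if __name__ == "__main__":
    # Toy ranks for illustration only; these are NOT real CorCenCC values.
    toy_ranks = {"y": 1, "mae": 2, "ci": 1500, "rhedeg": 5001}
    for band, count in profile("Mae y ci rhedeg xyzzy", toy_ranks).items():
        print(band, count)
```

A text whose counts are concentrated in K1 and K2 would, by the reasoning above, generally be easier for a learner than one with many tokens in K4 to K6+.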