LEXICAL DIVERSITY IN ACADEMIC AND NON-ACADEMIC TEXTS: A COMPUTATIONAL COMPARISON

Authors

  • Muhammad Jawad Nasir, MPhil Scholar, Department of English, National University of Modern Languages (NUML), Faisalabad Campus, Faisalabad, Punjab, Pakistan
  • Dr. Aftab Akram, Lecturer, Department of English, National University of Modern Languages (NUML), Faisalabad Campus, Faisalabad, Punjab, Pakistan

DOI:

https://doi.org/10.5281/zenodo.19389743

Keywords:

Academic Writing, Corpus Linguistics, Density, Lexical Diversity, Non-Academic Writing, Stylistic Analysis, Type–Token Ratio

Abstract

This study investigated lexical diversity and lexical density in English academic and non-academic texts to provide a systematic comparison of vocabulary use and stylistic characteristics across registers. Using a comparative corpus-based design and quantitative analysis grounded in Lexical Diversity Theory, the study compiled two balanced corpora: academic journal articles and non-academic texts (blogs and online news). The analysis focused on lexical density, Type–Token Ratio (TTR), and the distribution of content word categories (nouns, verbs, adjectives, and adverbs). Non-academic texts exhibited higher lexical density (63.63%) and greater lexical diversity (TTR = 0.20) than academic texts (58.61%; TTR = 0.16), reflecting broader vocabulary use in descriptive and narrative writing. Academic texts, by contrast, favored adjectives and repeated technical nouns, reflecting an analytical and informational focus, whereas non-academic texts emphasized verbs, supporting an action-oriented narrative style. These results demonstrated that register and communicative purpose significantly shaped lexical patterns, with practical implications for corpus linguistics, writing pedagogy, and register-based stylistic analysis.
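
The two headline measures in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the tokenizer, the `FUNCTION_WORDS` list, and the sample sentence are all assumptions made for illustration. In particular, content words are approximated here by excluding a small hand-written function-word list, whereas a study of this kind would more plausibly identify nouns, verbs, adjectives, and adverbs with a part-of-speech tagger.

```python
import re

# Hypothetical, deliberately small function-word list (an assumption for this
# sketch); a real analysis would likely use a part-of-speech tagger instead.
FUNCTION_WORDS = frozenset({
    "a", "an", "the", "and", "or", "but", "of", "in", "on", "at", "to",
    "for", "with", "by", "from", "is", "are", "was", "were", "be", "been",
    "it", "this", "that", "these", "those", "he", "she", "they", "we",
    "i", "you", "not", "as", "if", "than",
})


def tokenize(text):
    """Lowercase the text and return alphabetic word tokens."""
    return re.findall(r"[a-z]+(?:'[a-z]+)?", text.lower())


def type_token_ratio(tokens):
    """Distinct word forms (types) divided by running words (tokens)."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


def lexical_density(tokens):
    """Percentage of tokens that are not in the function-word list."""
    if not tokens:
        return 0.0
    content = [t for t in tokens if t not in FUNCTION_WORDS]
    return 100.0 * len(content) / len(tokens)


# Illustrative sentence, not drawn from the study's corpora.
sample = "The cat sat on the mat and the dog sat on the rug"
tokens = tokenize(sample)
ttr = type_token_ratio(tokens)
density = lexical_density(tokens)
print(f"tokens={len(tokens)} TTR={ttr:.2f} density={density:.2f}%")
```

Raw TTR tends to fall as a text gets longer, because frequent words keep recurring while new types appear more slowly, so a comparison such as 0.20 versus 0.16 is easiest to interpret when the two corpora are of similar size.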

 

Published

2026-03-31

How to Cite

Muhammad Jawad Nasir, & Dr. Aftab Akram. (2026). LEXICAL DIVERSITY IN ACADEMIC AND NON-ACADEMIC TEXTS: A COMPUTATIONAL COMPARISON. International Premier Journal of Languages & Literature, 4(3), 6–26. https://doi.org/10.5281/zenodo.19389743