Empirical approach to understanding natural language models
DOI: https://doi.org/10.47611/jsrhs.v13i1.6103
Keywords: NLP Models, Latent Dirichlet Allocation, Categorization, Performance Analysis, Model Evaluation
Abstract
In this paper, we try to understand natural language models by treating them as black boxes. We want to learn about these models without going into technical details such as network architecture, tuning parameters, training datasets, and training schedules. Instead, we take an empirical approach in which we classify the datasets into various categories. For scalability and to avoid subjective bias, we use Latent Dirichlet Allocation (LDA) to categorize language text. We then fine-tune and evaluate natural language models on our tasks, comparing the performance of the same model across multiple categories and of the same category across multiple models. This helps not only in choosing models for the desired categories but also in understanding the model attributes that explain performance variation. We report the observations from this empirical study along with our hypotheses. We find that models do not perform uniformly across all categories, which could be due to uneven representation of these categories in their training datasets. Models specialized or fine-tuned for specific tasks show higher variance in performance across categories than generic models. Some categories show consistently high performance across all models, while others show high variance. The code for this research paper is available at: https://github.com/bhuvishi/llm_understanding
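As a concrete illustration of the categorize-then-evaluate pipeline the abstract describes, the following is a minimal sketch assuming gensim's online LDA (the algorithm of Hoffman, 2010, cited below) and a caller-supplied scoring function. All names here (NUM_TOPICS, categorize, per_category_scores, score_fn) are illustrative and are not taken from the paper's repository.

# Minimal sketch: bucket texts by dominant LDA topic, then score a model
# per bucket. gensim and all helper names are assumptions, not the
# authors' actual code.
from collections import defaultdict

from gensim import corpora, models
from gensim.utils import simple_preprocess

NUM_TOPICS = 10  # hypothetical number of categories


def categorize(texts, num_topics=NUM_TOPICS):
    """Assign each text to its dominant LDA topic, used as its category."""
    tokens = [simple_preprocess(t) for t in texts]
    dictionary = corpora.Dictionary(tokens)
    bow = [dictionary.doc2bow(doc) for doc in tokens]
    # gensim's LdaModel implements the online variational Bayes
    # algorithm of Hoffman (2010).
    lda = models.LdaModel(bow, num_topics=num_topics, id2word=dictionary)
    return [max(lda.get_document_topics(doc, minimum_probability=0.0),
                key=lambda tp: tp[1])[0]
            for doc in bow]


def per_category_scores(texts, labels, categories, score_fn):
    """Evaluate one fine-tuned model separately on each category."""
    buckets = defaultdict(list)
    for text, label, cat in zip(texts, labels, categories):
        buckets[cat].append((text, label))
    # score_fn is any task metric (e.g., accuracy or F1) computed over
    # (text, label) pairs for a given model.
    return {cat: score_fn(items) for cat, items in buckets.items()}

Running per_category_scores once per model fills a model-by-category score matrix, from which both comparisons in the abstract (the same model across categories, and the same category across models) can be read off directly.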
References or Bibliography
Alsentzer, E. (2019, April 6). Publicly available clinical BERT embeddings. arXiv.org. https://arxiv.org/abs/1904.03323
Balestriero, R. (2021, October 18). Learning in high dimension always amounts to extrapolation. arXiv.org. https://arxiv.org/abs/2110.09485
Caselli, T., Basile, V., Mitrović, J., & Granitzer, M. (2021). HateBERT: Retraining BERT for Abusive Language Detection in English. Association for Computational Linguistics, 2021, 17–25. https://doi.org/10.18653/v1/2021.woah-1.3
Chalkidis, I., Fergadiotis, M., Malakasiotis, P., Aletras, N., & Androutsopoulos, I. (2020). LEGAL-BERT: The Muppets straight out of Law School. Association for Computational Linguistics, 2020, 2898–2904. https://doi.org/10.18653/v1/2020.findings-emnlp.261
Devlin, J. (2018, October 11). BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. arXiv.org. https://arxiv.org/abs/1810.04805
Grezes, F. (2021, December 1). Building astroBERT, a language model for Astronomy & Astrophysics. arXiv.org. https://arxiv.org/abs/2112.00590
Hazourli, A. R. (2022). FinancialBERT - A Pretrained Language Model for Financial Text Mining. ResearchGate. https://doi.org/10.13140/RG.2.2.34032.12803
Hoffman, M. (2010). Online learning for latent Dirichlet allocation. Advances in Neural Information Processing Systems, 23. https://papers.nips.cc/paper_files/paper/2010/hash/71f6278d140af599e06ad9bf1ba03cb0-Abstract.html
Liu, Y. (2019, July 26). RoBERTa: A robustly optimized BERT pretraining approach. arXiv.org. https://arxiv.org/abs/1907.11692
Loukas, L., Fergadiotis, M., Chalkidis, I., Spyropoulou, E., Malakasiotis, P., Androutsopoulos, I., & Paliouras, G. (2022). FiNER: Financial Numeric Entity Recognition for XBRL Tagging. Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers). https://doi.org/10.18653/v1/2022.acl-long.303
Lu, Y. (2022). Imitation Is Not Enough: Robustifying Imitation with Reinforcement Learning for Challenging Driving Scenarios. arXiv.org. https://arxiv.org/abs/2212.11419
Rajpurkar, P. (2018, June 11). Know what you don’t know: Unanswerable questions for SQuAD. arXiv.org. https://arxiv.org/abs/1806.03822
Rajpurkar, P. (2016, June 16). SQuAD: 100,000+ questions for machine comprehension of text. arXiv.org. https://arxiv.org/abs/1606.05250
Sanh, V. (2019, October 2). DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter. arXiv.org. https://arxiv.org/abs/1910.01108
Copyright (c) 2024 Bhuvishi Bansal
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Copyright holder(s) granted JSR a perpetual, non-exclusive license to distribute & display this article.