CUET DIGITAL REPOSITORY

Identification of cyberbullying Bangla Linguistics Texts using deep learning and transformer based approaches.

dc.contributor.author Saifullah, Md Khalid
dc.date.accessioned 2025-09-23T05:28:55Z
dc.date.available 2025-09-23T05:28:55Z
dc.date.issued 2024-07-24
dc.identifier.uri http://103.99.128.19:8080/xmlui/handle/123456789/521
dc.description Thesis in CSE en_US
dc.description.abstract In today's digital era, social media platforms such as Facebook, Twitter, and YouTube play crucial roles in facilitating the expression of ideas and interpersonal connections. However, alongside increased connectivity, these platforms have inadvertently facilitated negative behaviors, notably cyberbullying. While extensive research has examined cyberbullying in high-resource languages like English, there remains a significant dearth of resources for low-resource languages such as Bengali, Arabic, Tamil, and others, particularly concerning language modeling. This study aims to bridge this gap by developing a cyberbullying text identification system, named BullyFilterNeT, tailored specifically for social media texts, with Bengali serving as a test case. The BullyFilterNeT system tackles the out-of-vocabulary (OOV) problem inherent in non-contextual embeddings and addresses the limitations of context-aware feature representations. To provide a comprehensive analysis, three non-contextual embedding models (GloVe, FastText, and Word2Vec) are developed for feature extraction in Bengali. These embedding models are integrated into classification models employing both statistical methods (SVM, SGD, Libsvm) and deep learning architectures (CNN, VDCNN, LSTM, GRU). Furthermore, the study utilizes six transformer-based language models (mBERT, bELECTRA, IndicBERT, XLM-RoBERTa, DistilBERT, and BanglaBERT) to overcome shortcomings observed in the earlier models. Notably, the BanglaBERT-based BullyFilterNeT achieves the highest accuracy, 88.04% on our test set, demonstrating its efficacy in identifying cyberbullying text in the Bengali language. en_US
dc.language.iso en en_US
dc.publisher CUET en_US
dc.relation.ispartofseries TCD-64;T-352
dc.subject Cyberbullying; large language modelling; deep learning; transformer models; natural language processing (NLP); fine-tuning; OOV; harmful messages en_US
dc.title Identification of cyberbullying Bangla Linguistics Texts using deep learning and transformer based approaches. en_US
dc.type Thesis en_US

