Class of 2021, AI-011
Performance Analysis of YOLOv11 in Rice Sack Detection and Classification
This research aims to develop a deep learning-based object detection system using the YOLOv11 architecture to identify three types of rice sacks: “Udang Premium,” “Udang Kuning,” and “Udang Hitam.” The innovation of this study lies in exploring the impact of two annotation methods, polygon and bounding box, on detection accuracy. The dataset comprised 169 images with 325 annotations, initially exhibiting significant class imbalance, particularly for the “Udang Kuning” class. The model was trained and evaluated using mean Average Precision (mAP), Precision, and Recall metrics. Experimental results demonstrate excellent performance, with mAP@50 reaching 94% on the validation data and 97% on the test data. Per-class detection showed an Average Precision (AP) of 100% for “Udang Kuning” and “Udang Premium” on the test data, and 91% for “Udang Hitam.” Confusion matrix analysis confirmed a high number of True Positives and minimal False Negatives, although some False Positives and one misclassification case were observed for the “Udang Hitam” class. A comparison with other object detection methods, RF-DETR (Base) and YOLOv12 (Fast), revealed that YOLOv11 consistently outperformed both in mAP@50 (93.8% vs. 92.2% and 91.6%), Precision (95.8% vs. 95.1% and 93.5%), and Recall (92.6% vs. 81.0% and 89.7%). These findings affirm that YOLOv11 is a highly effective solution for rice sack detection, with the potential to enhance automation and efficiency in identification and sorting processes within related industries.
Keywords: Object Detection, YOLOv11, Rice Sacks, Deep Learning, Computer Vision.
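The mAP@50 figures above reduce, per class, to an IoU-thresholded match between detections and ground truth followed by the area under the precision-recall curve. A minimal sketch (not the authors' code, and simplified to all-points interpolation rather than any specific COCO/VOC protocol):

```python
# Sketch of per-class AP@50: greedy IoU matching at 0.5, then the area
# under the precision-recall curve. Boxes are (x1, y1, x2, y2) tuples.

def iou(a, b):
    """Intersection over Union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def ap50(detections, gt_boxes):
    """detections: list of (confidence, box); gt_boxes: list of boxes.
    Each ground-truth box may be matched at most once."""
    detections = sorted(detections, key=lambda d: -d[0])
    matched, tps = set(), []
    for conf, box in detections:
        best, best_iou = None, 0.5          # IoU threshold of 0.5
        for i, gt in enumerate(gt_boxes):
            if i not in matched and iou(box, gt) >= best_iou:
                best, best_iou = i, iou(box, gt)
        if best is not None:
            matched.add(best)
            tps.append(1)
        else:
            tps.append(0)
    ap, tp, recall_prev = 0.0, 0, 0.0
    for k, t in enumerate(tps, start=1):
        tp += t
        recall = tp / len(gt_boxes)
        ap += (recall - recall_prev) * (tp / k)   # precision at this cut-off
        recall_prev = recall
    return ap
```

mAP@50 is then the mean of `ap50` over the three sack classes.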
Class of 2021, AI-010
Hybrid Transformer-RNN Model for Classification of Indonesian Regional Language
Transformer-based language models have been widely adopted in natural language processing tasks, yet their ability to capture sequential dependencies remains limited, particularly in Indonesia’s regional languages, which are low-resource and morphologically rich. This study proposes a hybrid architecture that integrates NusaBERT with BiLSTM and BiGRU layers on top of Transformer representations, while exploring various pooling strategies (CLS token, last hidden state, mean, and max). Experiments were conducted on three benchmark datasets, NusaParagraph (Emotion, Rhetoric, Topic), NusaTranslation (Emotion, Sentiment), and NusaX (Sentiment), for multi-class and cross-lingual classification tasks. Results show that the hybrid models consistently outperform baselines, with the best variant (NusaBERT-Large, BiGRU, mean pooling, batch size 8) achieving macro F1 scores of 77.09% (Emotion), 53.38% (Rhetoric), and 88.81% (Topic) on NusaParagraph; 71.03% (Emotion) and 88.71% (Sentiment) on NusaTranslation; and 83.26% on NusaX (average across languages). Additional evaluation of catastrophic forgetting shows that the hybrid model maintains more stable performance when sequential fine-tuning is applied across languages. These findings demonstrate that combining contextual representations from Transformers with the sequential modeling capabilities of RNNs can improve both performance and robustness in multilingual NLP scenarios, while supporting the development of more inclusive and adaptive language technologies for regional language preservation in Indonesia.
Keywords: Hybrid Model, Transformer, Classification, Catastrophic Forgetting, Local Languages, Multilingual Text.
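The four pooling strategies the study compares can be illustrated on a toy sequence of token vectors; in the real model these would be NusaBERT hidden states feeding the BiLSTM/BiGRU head. A hedged sketch, not the thesis code:

```python
# Pooling strategies over a sequence of token hidden states, here plain
# Python lists of floats (one list per token) instead of tensors.

def cls_pool(hidden):
    """Representation of the first ([CLS]) token."""
    return hidden[0]

def last_pool(hidden):
    """Representation of the final token (last hidden state)."""
    return hidden[-1]

def mean_pool(hidden):
    """Element-wise mean over all tokens."""
    n = len(hidden)
    return [sum(tok[d] for tok in hidden) / n for d in range(len(hidden[0]))]

def max_pool(hidden):
    """Element-wise max over all tokens."""
    return [max(tok[d] for tok in hidden) for d in range(len(hidden[0]))]

hidden = [[1.0, 4.0], [3.0, 0.0], [2.0, 2.0]]   # 3 tokens, hidden size 2
```

The study found mean pooling the strongest choice for its best variant; in a framework like PyTorch the same operations would be masked reductions over the Transformer's `last_hidden_state`.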
Class of 2021, AI-009
Hybrid SMOTE-AdaBoost-Evolutionary Algorithm to Overcome Class Imbalance and Noisy Attributes in Well Log Data for Lithology Prediction with K-Nearest Neighbor
Lithology prediction using well log data is a critical process in the oil and gas industry to accurately determine geological formations. Various machine learning methods, including K-Nearest Neighbor (KNN), Random Forest, and Support Vector Machine, have been applied to this task. While these methods offer advantages such as handling multi-class classification, they face significant challenges related to class imbalance and noisy attributes in the dataset. Class imbalance can cause bias toward majority classes, while noisy attributes reduce the model’s ability to detect relevant patterns, thereby lowering prediction accuracy. Conventional approaches such as SMOTE (Synthetic Minority Oversampling Technique) to address class imbalance often generate synthetic data that amplifies noisy attributes, increasing the risk of overfitting. Furthermore, boosting methods like AdaBoost strengthen predictions by combining weak models but remain vulnerable to noise in the data. To address these limitations, this study proposes the integration of a hybrid of SMOTE, AdaBoost, and an Evolutionary Algorithm. SMOTE is used to balance class distributions by generating more meaningful synthetic data. The Evolutionary Algorithm is applied for feature selection to minimize noisy attributes, while AdaBoost enhances the model’s robustness against overfitting. The proposed approach was tested on two public datasets, FORCE 2020 and KAGGLE. Experimental results show that this integration significantly improves prediction accuracy, achieving an accuracy of 96.75%, precision of 88.83%, recall of 84.55%, and F1-Score of 86.46% on the FORCE 2020 dataset, and accuracies of 76.37% and 70.11%, with precision of 57.06%, recall of 55.05%, and F1-Score of 55.86% on the KAGGLE dataset, outperforming conventional methods. This study aims to provide an innovative and robust solution to address geological data challenges, improve lithology prediction accuracy, and make substantial contributions to applications in the oil and gas industry.
Keywords: Lithology Prediction, Well Log Data, Class Imbalance, Noisy Attributes, SMOTE, AdaBoost, Evolutionary Algorithm, K-Nearest Neighbor (KNN), Geological Data Analysis.
Class of 2021, AI-008
Integration of IndoBERT and Machine Learning Features to Improve the Performance of Indonesian Recognizing Textual Entailment
This research aims to develop a Recognizing Textual Entailment (RTE) model in the Indonesian language, named Hybrid-IndoBERT-RTE. The model is designed to address challenges in recognizing textual entailment, which is a critical task in Natural Language Processing (NLP). The architecture of Hybrid-IndoBERT-RTE is built by modifying IndoBERT-large-p1, a language model that has proven effective in various NLP tasks in the Indonesian language. In this modification, the output vectors generated by IndoBERT-large-p1 are combined with machine learning features from a feature-rich classifier, enabling the model to capture richer and deeper information. The classification head of this model consists of 1 input layer, 3 hidden layers, 1 dropout layer, and 1 output layer, designed to enhance the model’s predictive performance. To test the model’s performance, this research uses the Wiki Revisions Edits Textual Entailment (WRETE) dataset, which consists of 450 data samples, with 300 used for training, 50 for validation, and 100 for testing. Experimental results show that Hybrid-IndoBERT-RTE achieved an F1-score of 85%, indicating that the model has a strong capability in recognizing textual entailment in Indonesian. In addition to good performance, the Hybrid-IndoBERT-RTE model also demonstrates efficiency in computational resource usage. During the training process, this model utilized Graphics Processing Unit Video Random Access Memory (GPU VRAM) 42 times more efficiently on average compared to the IndoBERT-large-p1 used in previous IndoNLU research. Moreover, the training time of this model is 44.44 times faster, allowing for quicker experimentation and more iterations. This efficiency is crucial in the context of RTE model development, where saving computational resources and training time can accelerate innovation and further applications.
Keywords: Hybrid-IndoBERT-RTE, Wiki Revisions Edits Textual Entailment (WRETE), IndoBERT-large-p1, machine learning features, computational resource efficiency.
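The fusion step described above, combining the IndoBERT output vector with hand-crafted machine-learning features before the classification head, amounts to a concatenation. A hypothetical sketch with toy dimensions (the real model's vector sizes and feature set are not given here):

```python
# Illustrative fusion of a pooled Transformer vector with engineered
# features; the resulting vector feeds the classification head
# (1 input layer, 3 hidden layers, dropout, 1 output layer).

def fuse(bert_vector, ml_features):
    """Concatenate the contextual vector with engineered features."""
    return list(bert_vector) + list(ml_features)

bert_vector = [0.12, -0.30, 0.88]   # stand-in for a pooled IndoBERT output
ml_features = [0.5, 1.0]            # stand-in for feature-rich classifier inputs

fused = fuse(bert_vector, ml_features)
```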
Class of 2021, AI-007
Implementation of Dimensionality Reduction on Word Embeddings Vector Generated by Bidirectional Encoder Representation From Transformers (BERT)
This research focuses on improving the efficiency of complex artificial intelligence models, such as BERT, by applying dimension reduction techniques. The BERT model has millions of parameters, resulting in high computational and memory requirements during training. The approach taken utilizes BERT as a foundation, applying whitening (sphering) techniques as a dimension reduction method at the feature extraction stage. Two scenarios are evaluated: a standard benchmark using the original BERT, and a modified scenario involving BERT feature extraction, whitening techniques (PCA, ZCA, BERT Whitening), and classification using Bi-LSTM or MLP. The AG News dataset, containing news headlines and descriptions with four topic classes, is the main focus of the research. Results show that the model in the modified scenario, which combines BERT features, J. Su whitening, and a Bi-LSTM classifier, provides the best performance in terms of accuracy, F1 score, training time, and Graphics Processing Unit (GPU) memory usage. These findings indicate that whitening dimension reduction can improve text classification efficiency without sacrificing accuracy. This research is expected to expand AI applications in resource-constrained environments by improving the efficiency of complex models like BERT through parameter optimization and dimension reduction.
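PCA whitening, one of the three whitening variants evaluated, centers the features, rotates them onto the covariance eigenvectors, and rescales each direction to unit variance. An illustrative sketch for two-dimensional vectors, where the eigendecomposition has a closed form (the thesis applies the same idea to high-dimensional BERT embeddings):

```python
import math

def pca_whiten_2d(points):
    """PCA-whiten a list of (x, y) points: after the transform the data has
    zero mean, unit variance in each direction, and zero covariance."""
    n = len(points)
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    centered = [(x - mx, y - my) for x, y in points]
    # covariance matrix [[a, b], [b, c]]
    a = sum(x * x for x, _ in centered) / n
    c = sum(y * y for _, y in centered) / n
    b = sum(x * y for x, y in centered) / n
    # eigenvalues of a symmetric 2x2 matrix (closed form)
    tr, det = a + c, a * c - b * b
    l1 = tr / 2 + math.sqrt(tr * tr / 4 - det)
    l2 = tr / 2 - math.sqrt(tr * tr / 4 - det)
    # eigenvectors: (lambda - c, b) solves the second row of (M - lambda*I)v = 0
    if abs(b) > 1e-12:
        v1, v2 = (l1 - c, b), (l2 - c, b)
    elif a >= c:
        v1, v2 = (1.0, 0.0), (0.0, 1.0)
    else:
        v1, v2 = (0.0, 1.0), (1.0, 0.0)
    def norm(v):
        s = math.hypot(*v)
        return (v[0] / s, v[1] / s)
    v1, v2 = norm(v1), norm(v2)
    # project onto the eigenvectors and rescale to unit variance
    return [((x * v1[0] + y * v1[1]) / math.sqrt(l1),
             (x * v2[0] + y * v2[1]) / math.sqrt(l2)) for x, y in centered]
```

ZCA whitening additionally rotates back into the original basis, and BERT-whitening applies the same transform to sentence embeddings while keeping only the top directions; this sketch shows the shared core.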
Link of Publication
Under Review
Class of 2021, AI-006
Development of Large Language Model to Answer Academic Related Questions at Syiah Kuala University using Fine-Tuning and Retrieval-Augmented Generation Methods
Currently, academic information at Universitas Syiah Kuala (USK) is distributed across websites or summarized in the form of Frequently Asked Questions (FAQ). Information in these forms is not interactive: specific information must be searched for on the web or in the FAQ list. Therefore, a more interactive way to obtain information, using a chatbot, is needed. Chatbots can be built using a Large Language Model (LLM) such as Mistral 7B, a large language model that can be applied to answer questions about academic information using data collected from the university. Fine-tuning with the QLoRA technique and Retrieval-Augmented Generation (RAG) can be used to train the model and retrieve relevant information from external document sources. The results are then evaluated using the ROUGE score. The USK Mistral 7B model achieved scores of >0.5 on 15 out of 56 questions using the RAG method, and the fine-tuned model, tested on 20 questions, also produced scores of >0.5. Testing was also conducted with differently worded questions that had the same meaning, obtaining ROUGE scores of 0.4-0.5. Using the USK Mistral 7B model in a chatbot, academic information at USK can be shared interactively.
Keywords: Large Language Model, Fine-tuning, RAG
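The ROUGE scores above measure n-gram overlap between a generated answer and a reference answer. A sketch of ROUGE-1 F-measure (the abstract does not state which ROUGE variant was used, so treat the choice of unigrams here as an assumption):

```python
from collections import Counter

def rouge1_f(candidate, reference):
    """ROUGE-1 F-measure: harmonic mean of unigram precision and recall,
    with clipped (multiset) overlap counting."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    overlap = sum((cand & ref).values())     # clipped unigram matches
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

On this metric, a perfect answer scores 1.0 and an answer sharing no words with the reference scores 0.0, which is the scale behind the ">0.5" and "0.4-0.5" results reported above.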
Class of 2022, AI-005
Hyperparameter Tuning Automation in YOLO (Case Study: Corn Leaf Disease)
Corn cultivation is pivotal in Southeast Asia, significantly contributing to regional food security and economies. However, leaf diseases pose a threat, leading to substantial losses in production and harvest quality. To tackle this issue, artificial intelligence (AI) technology is leveraged for early detection of corn leaf diseases. One effective approach is the use of YOLO (You Only Look Once) based object detection models. This study aims to automate the hyperparameter tuning process in YOLO models for corn leaf disease detection, focusing on improving model performance. Through meticulous evaluation utilizing precision, recall, mAP50, and mAP50-95 metrics, the study identifies YOLOv8m and YOLO-NAS-L as top-performing models. YOLOv8m excels in mAP50 (98.5%) and mAP50-95 (67.8%), while YOLO-NAS-L demonstrates superior detection capabilities with mAP50 (70.3%) and mAP50-95 (38.9%). These findings underscore the potential of advanced AI-driven detection systems in revolutionizing crop management, facilitating early disease identification, and enabling prompt preventive measures. By leveraging sophisticated object detection models, farmers can enhance crop yields, mitigate losses due to plant diseases, and boost agricultural productivity. The research lays a solid foundation for developing integrated, scalable disease detection systems, offering crucial support for global food security and farmer welfare.
Keywords: Object detection, YOLOv8, YOLO-NAS, corn leaf disease, hyperparameter tuning
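Automated hyperparameter tuning of the kind described above can be sketched as a search loop over a parameter space; this is a generic random-search illustration, not the thesis's tuner, and `train_and_score` is a stand-in for training a YOLO variant and returning its validation mAP50:

```python
import random

# Hypothetical search space; names mirror common YOLO training knobs but
# the ranges here are illustrative assumptions.
SEARCH_SPACE = {
    "lr0":        (1e-4, 1e-1),   # initial learning rate (continuous range)
    "momentum":   (0.6, 0.98),
    "batch_size": [8, 16, 32],    # discrete choices
}

def sample(space, rng):
    """Draw one configuration from the space."""
    cfg = {}
    for name, spec in space.items():
        if isinstance(spec, tuple):
            lo, hi = spec
            cfg[name] = rng.uniform(lo, hi)
        else:
            cfg[name] = rng.choice(spec)
    return cfg

def random_search(train_and_score, space, trials=20, seed=0):
    """Evaluate `trials` random configurations, keep the best scorer."""
    rng = random.Random(seed)
    best_cfg, best_score = None, float("-inf")
    for _ in range(trials):
        cfg = sample(space, rng)
        score = train_and_score(cfg)
        if score > best_score:
            best_cfg, best_score = cfg, score
    return best_cfg, best_score
```

More sample-efficient tuners (evolutionary search, Bayesian optimization) replace the independent sampling with a strategy informed by past trials, but the evaluate-and-keep-best loop is the same.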
Class of 2022, AI-004
Performance Analysis of Deep Convolutional Neural Networks Architecture for Classification of Severity Score for Atopic Dermatitis Skin Disease
Class of 2022, AI-003
Enhancing the Prediction of Inhibitor Activity Against Hepatitis C Virus NS5B Through the Utilization of LightGBM and Bayesian Optimization
This study focuses on developing a predictive model for hepatitis C virus (HCV) NS5B inhibitor activity using the Light Gradient Boosting Machine (LightGBM) algorithm. The primary goal is to enhance the accuracy of inhibitor activity predictions, a crucial step in drug discovery for HCV. The study utilizes a molecular dataset comprising 3011 samples collected from the ChEMBL database. This dataset is divided into 90% training data and 10% test data, resulting in 1503 compounds in the training process and 168 compounds for testing. The process of selecting molecular descriptors involved several stages, including selection based on variance values, multicollinearity, and Recursive Feature Elimination (RFE), resulting in the 50 most relevant molecular descriptors. The constructed LightGBM model employs Bayesian Optimization for hyperparameter tuning. Efforts to improve the model’s predictive performance involved combining several LightGBM models using a voting approach, with evaluation using the coefficient of determination (R²) and root mean squared error (RMSE). The resulting model is used to predict the test data. The results show an increase in predictive performance when the three LightGBM models are combined, as proven by evaluation on test data, which obtained the highest R² value of 0.760 and the lowest RMSE of 0.637. Model validation was conducted through Y-Scrambling techniques, demonstrating that the model’s performance in predicting HCV NS5B inhibitor activity is based on real relationships and not coincidence. SHAP (SHapley Additive exPlanations) analysis was implemented to understand the contribution of each molecular descriptor to the model’s predictions. This analysis helped identify the most influential molecular descriptors, such as MDEC-33 and SpMax1_Bh(e), providing insights into the molecular characteristics that play a role in inhibiting HCV NS5B.
Utilizing SHAP visualizations, including bar charts and bee swarm plots, offered a deeper understanding of the influence of each descriptor on the model’s predictions. The study concludes that the combined approach of multiple LightGBM models, coupled with SHAP analysis, represents a significant advancement in predicting the activity of HCV NS5B inhibitors.
Keywords: QSAR, machine learning, drug discovery, model interpretation
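The voting combination of regressors and the R²/RMSE evaluation described above can be sketched as follows; this is an illustrative sketch, not the authors' code, and the "models" here are placeholder prediction lists rather than trained LightGBM boosters:

```python
def vote(predictions_per_model):
    """Average the predictions of several regressors, sample by sample."""
    n_models = len(predictions_per_model)
    return [sum(preds) / n_models for preds in zip(*predictions_per_model)]

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot
```

Averaging tends to cancel the uncorrelated errors of the individual models, which is the mechanism behind the reported gain when the three LightGBM models are combined.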
Class of 2022, AI-002
The Deep Learning-Based Intelligent System for Extracting Information from E-KTP
The purpose of this research is to develop a system that is capable of detecting and recognizing text contained in Indonesian ID Card (e-KTP) images accurately and efficiently. The first stage of this research involves selecting ideal images from the e-KTP dataset. Next, a pre-processing stage crops the edges of each image and combines it with a background image, both to create a more varied dataset and to generate an e-KTP image mask as training data. After selecting the ideal images, the training data totals 144 images. The U-Net architecture was chosen as the deep learning method for the image segmentation process. Meanwhile, for text detection and recognition, the Character-Region Awareness For Text detection (CRAFT) and TRBA (TPS-ResNet-BiLSTM-Attention) frameworks are used. Testing is assessed based on accuracy, Dice coefficient, and IoU score for the segmentation process, with the U-Net model obtaining an accuracy of 99.52%. Meanwhile, text detection and recognition are assessed based on the confidence score.
Keyword: e-KTP; U-Net network; CRAFT network; TRBA network; Optical character recognition
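The segmentation metrics reported for the U-Net stage, Dice coefficient and IoU, compare a predicted binary mask against the ground-truth mask. A minimal sketch on flattened masks of 0s and 1s (not the thesis code):

```python
def dice(mask_a, mask_b):
    """Dice coefficient: 2*|A ∩ B| / (|A| + |B|) for binary masks."""
    inter = sum(a & b for a, b in zip(mask_a, mask_b))
    total = sum(mask_a) + sum(mask_b)
    return 2 * inter / total if total else 1.0

def iou(mask_a, mask_b):
    """Intersection over Union: |A ∩ B| / |A ∪ B| for binary masks."""
    inter = sum(a & b for a, b in zip(mask_a, mask_b))
    union = sum(a | b for a, b in zip(mask_a, mask_b))
    return inter / union if union else 1.0
```

Unlike raw pixel accuracy, which is dominated by background pixels on ID-card images, both metrics score only the overlap of the foreground (card) region, which is why they are reported alongside the 99.52% accuracy.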
Class of 2021, AI-001
Face Synthesis Using Modified Hyperstyle Architecture
Face recognition is the most stable and robust biometric technique for identifying and authenticating human faces. However, training face recognition models using a deep learning architecture requires a large number of training images. In addition, labeling face image collections manually is a time-consuming and costly process. Augmenting face images using the HyperStyle architecture can solve these issues. Experimental results obtained using the ResNet-50 architecture trained on a dataset of 44,000 face images (the original FaceScrub dataset combined with the HyperStyle Age and HyperStyle Smile augmented images) demonstrate that the model achieved an F1-score of 79%, outperforming the model trained using the original FaceScrub dataset without HyperStyle augmentation (F1-score of 63%). ResNet-50, with an F1-score of 82%, outperformed other CNN models, i.e., VGGNet-16 (60%), MobileNet V3 Small (65%), and SEResNet18 (81%). HyperStyle modifications in the pre-processing and post-processing stages achieved promising results on model performance: the ResNet-50 model trained using a combination of the original FaceScrub dataset and the modified HyperStyle synthesized dataset obtained an F1-score of 83%.
Keywords: Deep Learning, Face Recognition, Data Augmentation, HyperStyle, ResNet-50.
Department of Informatics, FMIPA Building, Block A, 3rd Floor
Syech Abdurrauf No. 3, Kopelma Darussalam, Banda Aceh, 23111, Indonesia
Copyright © 2022 Master Program in Artificial Intelligence, Universitas Syiah Kuala. All Rights Reserved.