Search results

1 – 10 of over 4000
Article
Publication date: 7 November 2016

Mohammadali Abedini, Farzaneh Ahmadzadeh and Rassoul Noorossana

Abstract

Purpose

A crucial decision in financial services is how to classify credit or loan applicants into good and bad applicants. The purpose of this paper is to propose a four-stage hybrid data mining approach to support the decision-making process.

Design/methodology/approach

The approach is inspired by the bagging ensemble learning method and proposes a new voting method, namely two-level majority voting in the last stage. First some training subsets are generated. Then some different base classifiers are tuned and afterward some ensemble methods are applied to strengthen tuned classifiers. Finally, two-level majority voting schemes help the approach to achieve more accuracy.
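The two-level voting idea can be illustrated with a minimal sketch. It assumes that the first level takes a majority vote inside each group of classifiers and the second level takes a majority vote over the group decisions; the abstract does not spell out the two schemes, so the grouping, the function names and the sample labels below are hypothetical.

```python
from collections import Counter


def majority_vote(labels):
    """Return the most frequent label; ties go to the label seen first."""
    return Counter(labels).most_common(1)[0][0]


def two_level_vote(grouped_predictions):
    """Vote inside each group (level one), then across groups (level two)."""
    level_one = [majority_vote(group) for group in grouped_predictions]
    return majority_vote(level_one)


# Hypothetical predictions for one applicant from three groups of classifiers.
groups = [
    ["good", "good", "bad"],
    ["bad", "bad", "good"],
    ["good", "bad", "good"],
]
decision = two_level_vote(groups)
```

Here the group decisions are "good", "bad" and "good", so the final decision is "good".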

Findings

A comparison of results shows the proposed model outperforms powerful single classifiers such as multilayer perceptron (MLP), support vector machine and logistic regression (LR). In addition, it is more accurate than ensemble learning methods such as bagging-LR or rotation forest (RF)-MLP. The model outperforms single classifiers in terms of type I and II errors; it is close to some ensemble approaches such as bagging-LR and RF-MLP but fails to outperform them in terms of type I and II errors. Moreover, majority voting in the final stage provides more reliable results.

Practical implications

The study concludes the approach would be beneficial for banks, credit card companies and other credit provider organisations.

Originality/value

A novel four-stage hybrid approach inspired by the bagging ensemble method is proposed. Moreover, the two-level majority voting in two different schemes in the last stage provides more accuracy. An integrated evaluation criterion for classification errors provides an enhanced insight for error comparisons.

Details

Kybernetes, vol. 45 no. 10
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 6 June 2024

Özge H. Namlı, Seda Yanık, Aslan Erdoğan and Anke Schmeink

Abstract

Purpose

Coronary artery disease is one of the most common cardiovascular disorders in the world, and it can be deadly. Traditional diagnostic approaches are based on angiography, which is an interventional procedure with side effects such as contrast nephropathy or radiation exposure, as well as significant expense. The purpose of this paper is to propose a novel artificial intelligence (AI) approach for the diagnosis of coronary artery disease as an effective alternative to traditional diagnostic methods.

Design/methodology/approach

In this study, a novel ensemble AI approach based on optimization and classification is proposed. The proposed ensemble structure consists of three stages: feature selection, classification and combining. In the first stage, important features for each classification method are identified using the binary particle swarm optimization algorithm (BPSO). In the second stage, individual classification methods are used. In the final stage, the prediction results obtained from the individual methods are combined in an optimized way using the particle swarm optimization (PSO) algorithm to achieve better predictions.

Findings

The proposed method has been tested using an up-to-date real dataset collected at Basaksehir Çam and Sakura City Hospital. The disease prediction data are unbalanced. Hence, the proposed ensemble approach mainly improves the F-measure and ROC area, which are the more informative measures in unbalanced classification. The comparison shows that the proposed approach improves the F-measure and ROC area results of the individual classification methods by around 14.5% on average and diagnoses with an accuracy rate of 96%.

Originality/value

This study presents a low-cost and low-risk AI-based approach for diagnosing heart disease compared to traditional diagnostic methods. Most existing research studies focus on base classification methods. In this study, we mainly investigate an effective ensemble method that uses optimization approaches in the feature selection and combining stages for the medical diagnostic domain. Furthermore, the approaches in the literature are commonly tested on open-access datasets for heart disease diagnosis, whereas we apply our approach to a real and up-to-date dataset.

Details

International Journal of Intelligent Computing and Cybernetics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 7 March 2016

Stephan Körner and Frank Holzäpfel

Abstract

Purpose

Wake vortices that are generated by an aircraft as a consequence of lift constitute a potential danger to the following aircraft. To predict and avoid dangerous situations, wake vortex transport and decay models have been developed. Being based on different model physics, they can complement each other with their individual strengths. This paper investigates the skill of a Multi-Model Ensemble (MME) approach to improve prediction performance. Therefore, this paper aims to use wake vortex models developed by NASA (APA3.2, APA3.4, TDP2.1) and by DLR (P2P). Furthermore, this paper analyzes the possibility to use the ensemble spread to compute uncertainty envelopes.

Design/methodology/approach

An MME approach called Reliability Ensemble Averaging (REA) is adapted and applied to the wake vortex predictions. To train the ensemble, a set of wake vortex measurements carried out at the airports of Frankfurt (WakeFRA) and Munich (WakeMUC) and at the special airport Oberpfaffenhofen was used.
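The general idea of weighting ensemble members by their reliability can be sketched as follows. This is not the REA formulation used in the paper: it is a simplified stand-in that assumes each model's weight is proportional to the inverse of its root-mean-square error on training measurements, and that the weighted spread of the member forecasts serves as a rough uncertainty envelope. All function names and numbers are hypothetical.

```python
import math


def rmse(predictions, observations):
    """Root-mean-square error of one model against the measurements."""
    squared = [(p - o) ** 2 for p, o in zip(predictions, observations)]
    return math.sqrt(sum(squared) / len(squared))


def reliability_weights(training_predictions, observations):
    """Weight each model by the inverse of its training RMSE (assumption)."""
    inverse = [1.0 / max(rmse(p, observations), 1e-12)
               for p in training_predictions]
    total = sum(inverse)
    return [value / total for value in inverse]


def ensemble_forecast(member_forecasts, weights):
    """Return the weighted mean and the weighted spread of the members."""
    mean = sum(w * f for w, f in zip(weights, member_forecasts))
    variance = sum(w * (f - mean) ** 2
                   for w, f in zip(weights, member_forecasts))
    return mean, math.sqrt(variance)


# Hypothetical training data: two models compared with three measurements.
observed = [1.0, 2.0, 3.0]
model_a = [1.1, 2.1, 3.1]   # small error, so a large weight
model_b = [1.5, 2.5, 3.5]   # larger error, so a small weight
weights = reliability_weights([model_a, model_b], observed)

# Hypothetical new forecasts from the two models for one quantity.
mean, spread = ensemble_forecast([4.0, 4.6], weights)
```

The envelope could then be taken as the mean plus or minus a multiple of the spread; the multiple is a design choice that the abstract does not specify.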

Findings

The REA approach can outperform the best member of the ensemble, on average, regarding the root-mean-square error. Moreover, the ensemble delivers reasonable uncertainty envelopes.

Practical implications

Reliable wake vortex predictions may be applicable for both tactical optimization of aircraft separation at airports and airborne wake vortex prediction and avoidance.

Originality/value

Ensemble approaches are widely used in weather forecasting, but they have never been applied to wake vortex predictions. Until today, the uncertainty envelopes for wake vortex forecasts have been computed among others from perturbed initial conditions or perturbed physics as well as from uncertainties from environmental conditions or from safety margins but not from the spread of structurally independent model forecasts.

Details

Aircraft Engineering and Aerospace Technology: An International Journal, vol. 88 no. 2
Type: Research Article
ISSN: 1748-8842

Article
Publication date: 10 March 2022

Jayaram Boga and Dhilip Kumar V.

Abstract

Purpose

To achieve a profitable human activity recognition (HAR) method, this paper solves the HAR problem under a wireless body area network (WBAN) using a developed ensemble learning approach. Three data sets are used for HAR in WBAN, namely, human activity recognition using smartphones, wireless sensor data mining and Kaggle. The proposed model undergoes four phases, namely, “pre-processing, feature extraction, feature selection and classification.” Here, the data are preprocessed by artifact removal and median filtering techniques. Then, the features are extracted by techniques such as “t-Distributed Stochastic Neighbor Embedding”, “Short-time Fourier transform” and statistical approaches. The next step is weighted optimal feature selection, which selects the important features based on the data variance of each class. This new feature selection is achieved by the hybrid coyote Jaya optimization (HCJO). Finally, the meta-heuristic-based ensemble learning approach is used as a new recognition approach with three classifiers, namely, “support vector machine (SVM), deep neural network (DNN) and fuzzy classifiers.” Experimental analysis is performed.

Design/methodology/approach

The proposed HCJO algorithm was developed to optimize the membership function of the fuzzy classifier, the iteration limit of the SVM and the hidden neuron count of the DNN, in order to obtain superior classification outcomes and enhance the performance of ensemble classification.

Findings

The accuracy of the enhanced HAR model was higher than that of conventional models: 6.66% higher than fuzzy, 4.34% higher than DNN, 4.34% higher than SVM, 7.86% higher than the ensemble and 6.66% higher than the Improved Sealion optimization algorithm-Attention Pyramid-Convolutional Neural Network (AP-CNN).

Originality/value

The suggested HAR model with WBAN using HCJO algorithm is accurate and improves the effectiveness of the recognition.

Details

International Journal of Pervasive Computing and Communications, vol. 19 no. 4
Type: Research Article
ISSN: 1742-7371

Article
Publication date: 18 October 2022

Hasnae Zerouaoui, Ali Idri and Omar El Alaoui

Abstract

Purpose

Hundreds of thousands of deaths each year in the world are caused by breast cancer (BC). An early-stage diagnosis of this disease can positively reduce the morbidity and mortality rate by helping to select the most appropriate treatment options, especially by using histological BC images for the diagnosis.

Design/methodology/approach

The present study proposes and evaluates a novel approach which consists of 24 deep hybrid heterogenous ensembles that combine the strength of seven deep learning techniques (DenseNet 201, Inception V3, VGG16, VGG19, Inception-ResNet-V3, MobileNet V2 and ResNet 50) for feature extraction and four well-known classifiers (multi-layer perceptron, support vector machines, K-nearest neighbors and decision tree) by means of hard and weighted voting combination methods for histological classification of BC medical images. Furthermore, the best deep hybrid heterogenous ensembles were compared to the deep stacked ensembles to determine the best strategy to design the deep ensemble methods. The empirical evaluations used four classification performance criteria (accuracy, sensitivity (recall), precision and F1-score) and fivefold cross-validation over the public BreakHis histological dataset with four magnification factors (40×, 100×, 200× and 400×). The Scott–Knott (SK) statistical test and the Borda count voting method were used to cluster the designed techniques and to rank the techniques belonging to the best SK cluster, respectively.
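Two of the combination tools named above, weighted voting and Borda count ranking, can be illustrated with a minimal sketch. The weights, labels and technique names are hypothetical, and the abstract does not state how the study derives its voting weights; the Borda variant shown awards n-1 points to the first of n candidates and 0 to the last.

```python
from collections import defaultdict


def weighted_vote(predictions, weights):
    """Sum the weight behind each predicted label and return the heaviest."""
    scores = defaultdict(float)
    for label, weight in zip(predictions, weights):
        scores[label] += weight
    return max(scores, key=scores.get)


def borda_rank(rankings):
    """Rank candidates from several best-to-worst orderings by Borda points."""
    points = defaultdict(int)
    for ranking in rankings:
        n = len(ranking)
        for position, candidate in enumerate(ranking):
            points[candidate] += n - 1 - position
    # Sort by points (descending), then by name to break ties predictably.
    return sorted(points, key=lambda c: (-points[c], c))


# Hypothetical classifier outputs for one image, with hypothetical weights.
labels = ["benign", "malignant", "malignant"]
label = weighted_vote(labels, [0.9, 0.3, 0.4])

# Hypothetical orderings of three techniques under three criteria.
order = borda_rank([["A", "B", "C"], ["B", "A", "C"], ["A", "C", "B"]])
```

With these numbers a hard (unweighted) vote would pick "malignant", while the weighted vote picks "benign" because the single confident classifier outweighs the other two; the Borda totals are A = 5, B = 3 and C = 1.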

Findings

Results showed that the deep hybrid heterogenous ensembles outperformed both their singles and the deep stacked ensembles and reached the accuracy values of 96.3, 95.6, 96.3 and 94 per cent across the four magnification factors 40×, 100×, 200× and 400×, respectively.

Originality/value

The proposed deep hybrid heterogenous ensembles can be applied for the BC diagnosis to assist pathologists in reducing the missed diagnoses and proposing adequate treatments for the patients.

Details

Data Technologies and Applications, vol. 57 no. 2
Type: Research Article
ISSN: 2514-9288

Article
Publication date: 29 July 2014

Chih-Fong Tsai and Chihli Hung

Abstract

Purpose

Credit scoring is important for financial institutions in order to accurately predict the likelihood of business failure. Related studies have shown that machine learning techniques, such as neural networks, outperform many statistical approaches to solving this type of problem, and advanced machine learning techniques, such as classifier ensembles and hybrid classifiers, provide better prediction performance than single machine learning based classification techniques. However, it is not known which type of advanced classification technique performs better in terms of financial distress prediction. The paper aims to discuss these issues.

Design/methodology/approach

This paper compares neural network ensembles and hybrid neural networks over three benchmarking credit scoring related data sets, which are Australian, German, and Japanese data sets.

Findings

The experimental results show that hybrid neural networks and neural network ensembles outperform the single neural network. Although hybrid neural networks perform slightly better than neural network ensembles in terms of prediction accuracy and errors with two of the data sets, there is no significant difference between the two types of prediction models.

Originality/value

The originality of this paper is in comparing two types of advanced classification techniques, i.e. hybrid and ensemble learning techniques, in terms of financial distress prediction.

Details

Kybernetes, vol. 43 no. 7
Type: Research Article
ISSN: 0368-492X

Article
Publication date: 23 March 2021

Mostafa El Habib Daho, Nesma Settouti, Mohammed El Amine Bechar, Amina Boublenza and Mohammed Amine Chikh

Abstract

Purpose

Ensemble methods have been widely used in the field of pattern recognition due to the difficulty of finding a single classifier that performs well on a wide variety of problems. Despite the effectiveness of these techniques, studies have shown that ensemble methods generate a large number of hypotheses that, in most cases, contain redundant classifiers. Several state-of-the-art works attempt to reduce the set of hypotheses without affecting performance.

Design/methodology/approach

In this work, the authors propose a pruning method that takes into consideration the correlation between each classifier and the classes, and between each classifier and the rest of the set. The authors used the random forest algorithm as the tree-based ensemble classifier, and the pruning was performed by a technique inspired by the CFS (correlation feature selection) algorithm.

Findings

The proposed method CES (correlation-based Ensemble Selection) was evaluated on ten datasets from the UCI machine learning repository, and the performances were compared to six ensemble pruning techniques. The results showed that our proposed pruning method selects a small ensemble in a smaller amount of time while improving classification rates compared to the state-of-the-art methods.

Originality/value

CES is a new ordering-based method that uses the CFS algorithm. CES selects, in a short time, a small sub-ensemble that outperforms results obtained from the whole forest and the other state-of-the-art techniques used in this study.

Details

International Journal of Intelligent Computing and Cybernetics, vol. 14 no. 2
Type: Research Article
ISSN: 1756-378X

Article
Publication date: 7 July 2020

Jiaming Liu, Liuan Wang, Linan Zhang, Zeming Zhang and Sicheng Zhang

Abstract

Purpose

The primary objective of this study was to recognize critical indicators in predicting blood glucose (BG) through data-driven methods and to compare the prediction performance of four tree-based ensemble models, i.e. bagging with tree regressors (bagging-decision tree [Bagging-DT]), AdaBoost with tree regressors (Adaboost-DT), random forest (RF) and gradient boosting decision tree (GBDT).

Design/methodology/approach

This study proposed a majority voting feature selection method by combining lasso regression with the Akaike information criterion (AIC) (LR-AIC), lasso regression with the Bayesian information criterion (BIC) (LR-BIC) and RF to select indicators with excellent predictive performance from initial 38 indicators in 5,642 samples. The selected features were deployed to build the tree-based ensemble models. The 10-fold cross-validation (CV) method was used to evaluate the performance of each ensemble model.
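A minimal sketch of majority voting over feature selectors follows. It assumes a feature is kept when more than half of the selectors choose it; the abstract does not give the exact voting rule, and the three selections below are hypothetical placeholders rather than the study's results.

```python
from collections import Counter


def majority_vote_features(selections, min_votes=None):
    """Keep features chosen by at least min_votes selectors.

    By default min_votes is a strict majority of the selectors.
    """
    if min_votes is None:
        min_votes = len(selections) // 2 + 1
    counts = Counter(feature for chosen in selections for feature in set(chosen))
    return sorted(f for f, votes in counts.items() if votes >= min_votes)


# Hypothetical outputs of the three selectors (LR-AIC, LR-BIC and RF).
lr_aic = ["age", "chc", "rbcvdw"]
lr_bic = ["age", "chc"]
rf = ["age", "rbcvdw", "leucocyte_count"]

selected = majority_vote_features([lr_aic, lr_bic, rf])
```

Here "age" receives three votes, "chc" and "rbcvdw" two each and "leucocyte_count" one, so the first three are kept.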

Findings

The results of feature selection indicated that age, corpuscular hemoglobin concentration (CHC), red blood cell volume distribution width (RBCVDW), red blood cell volume and leucocyte count are the five most important clinical/physical indicators in BG prediction. Furthermore, this study also found that the GBDT ensemble model combined with the proposed majority voting feature selection method is better than the other three models with respect to prediction performance and stability.

Practical implications

This study proposed a novel BG prediction framework for better predictive analytics in health care.

Social implications

This study incorporated medical background and machine learning technology to reduce diabetes morbidity and formulate precise medical schemes.

Originality/value

The majority voting feature selection method combined with the GBDT ensemble model provides an effective decision-making tool for predicting BG and detecting diabetes risk in advance.

Open Access
Article
Publication date: 13 August 2020

Mariam AlKandari and Imtiaz Ahmad

Abstract

Solar power forecasting will have a significant impact on the future of large-scale renewable energy plants. Predicting photovoltaic power generation depends heavily on climate conditions, which fluctuate over time. In this research, we propose a hybrid model that combines machine-learning methods with the Theta statistical method for more accurate prediction of future solar power generation from renewable energy plants. The machine learning models include long short-term memory (LSTM), gated recurrent unit (GRU), AutoEncoder LSTM (Auto-LSTM) and a newly proposed Auto-GRU. To enhance the accuracy of the proposed Machine learning and Statistical Hybrid Model (MLSHM), we employ two diversity techniques, i.e. structural diversity and data diversity. To combine the predictions of the ensemble members in the proposed MLSHM, we exploit four combining methods: simple averaging, weighted averaging using a linear approach, weighted averaging using a non-linear approach, and combination through variance using an inverse approach. The proposed MLSHM scheme was validated on two real time-series datasets, namely Shagaya in Kuwait and Cocoa in the USA. The experiments show that the proposed MLSHM, using all the combination methods, achieved higher accuracy compared to the prediction of the traditional individual models. Results demonstrate that a hybrid model combining machine-learning methods with a statistical method outperformed a hybrid model that only combines machine-learning models without a statistical method.
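Two of the combining methods can be sketched briefly. The sketch assumes that "combination through variance using an inverse approach" means weighting each member by the inverse of its error variance, which is one common reading and is not confirmed by the abstract. The forecasts and variances are hypothetical.

```python
def simple_average(predictions):
    """Give every ensemble member the same weight."""
    return sum(predictions) / len(predictions)


def inverse_variance_weights(error_variances):
    """Weight each member by 1/variance, normalised to sum to one."""
    inverse = [1.0 / v for v in error_variances]
    total = sum(inverse)
    return [value / total for value in inverse]


def weighted_average(predictions, weights):
    """Combine member predictions with the given weights."""
    return sum(w * p for w, p in zip(weights, predictions))


# Hypothetical power forecasts from three members and their error variances.
forecasts = [10.0, 12.0, 20.0]
variances = [1.0, 1.0, 4.0]

plain = simple_average(forecasts)
weights = inverse_variance_weights(variances)
combined = weighted_average(forecasts, weights)
```

With these numbers the simple average is 14.0, while the inverse-variance combination is 12.0 because the noisier third member is down-weighted.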

Details

Applied Computing and Informatics, vol. ahead-of-print no. ahead-of-print
Type: Research Article
ISSN: 2634-1964

Article
Publication date: 6 February 2017

Aytug Onan

Abstract

Purpose

The immense quantity of available unstructured text documents serves as one of the largest sources of information. Text classification can be an essential task for many purposes in information retrieval, such as document organization, text filtering and sentiment analysis. Ensemble learning has been extensively studied to construct efficient text classification schemes with higher predictive performance and generalization ability. The purpose of this paper is to provide diversity among the classification algorithms of the ensemble, which is a key issue in ensemble design.

Design/methodology/approach

An ensemble scheme based on hybrid supervised clustering is presented for text classification. In the presented scheme, supervised hybrid clustering, which is based on cuckoo search algorithm and k-means, is introduced to partition the data samples of each class into clusters so that training subsets with higher diversities can be provided. Each classifier is trained on the diversified training subsets and the predictions of individual classifiers are combined by the majority voting rule. The predictive performance of the proposed classifier ensemble is compared to conventional classification algorithms (such as Naïve Bayes, logistic regression, support vector machines and C4.5 algorithm) and ensemble learning methods (such as AdaBoost, bagging and random subspace) using 11 text benchmarks.

Findings

The experimental results indicate that the presented classifier ensemble outperforms the conventional classification algorithms and ensemble learning methods for text classification.

Originality/value

The presented ensemble scheme is the first to use supervised clustering to obtain a diverse ensemble for text classification.

Details

Kybernetes, vol. 46 no. 2
Type: Research Article
ISSN: 0368-492X
