Bayesian Computation Statistics
Rashin Nimaei; Farzad Eskandari
Abstract
Recent advancements in technology have led to a sharp increase in the growth rate of data. Given the volume of data generated, effective analysis using traditional approaches becomes very complicated. One method for managing and analyzing big data is classification. In this paper, a feature weighting technique to improve Bayesian classification algorithms for big data is developed, based on the correlative Naive Bayes classifier and the MapReduce model. Correlated Naive Bayes classification generalizes the Naive Bayes model by accounting for dependence between features; this paper uses feature weighting and Laplace calibration to improve it. The performance of all described methods is evaluated in terms of accuracy, sensitivity, and specificity.
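The core idea, scaling each feature's log-likelihood contribution by a weight in a Naive Bayes classifier, can be sketched in a few lines. This is a minimal illustration on discrete features with Laplace smoothing; the function names and toy data are ours, and it is not the paper's MapReduce-based correlative classifier.

```python
import math
from collections import defaultdict

def train_nb(X, y, alpha=1.0):
    """Fit per-class feature-value counts with Laplace (add-alpha) smoothing."""
    classes = sorted(set(y))
    n_features = len(X[0])
    counts = {c: [defaultdict(int) for _ in range(n_features)] for c in classes}
    class_counts = defaultdict(int)
    values = [set() for _ in range(n_features)]
    for xi, yi in zip(X, y):
        class_counts[yi] += 1
        for j, v in enumerate(xi):
            counts[yi][j][v] += 1
            values[j].add(v)
    return classes, counts, class_counts, values, alpha, len(y)

def predict_nb(model, x, weights=None):
    """Log-space prediction; `weights` scale each feature's log-likelihood,
    so a low weight mutes an unreliable feature."""
    classes, counts, class_counts, values, alpha, n = model
    weights = weights or [1.0] * len(x)
    best, best_lp = None, -math.inf
    for c in classes:
        lp = math.log(class_counts[c] / n)
        for j, v in enumerate(x):
            num = counts[c][j][v] + alpha
            den = class_counts[c] + alpha * len(values[j])
            lp += weights[j] * math.log(num / den)
        if lp > best_lp:
            best, best_lp = c, lp
    return best

# Toy data: feature 0 is informative, feature 1 is noise.
X = [(0, 0), (0, 1), (1, 0), (1, 1)]
y = ["a", "a", "b", "b"]
model = train_nb(X, y)
```

Downweighting the noisy feature (e.g. `weights=[1.0, 0.1]`) leaves the decision to the informative one, which is the effect feature weighting is after.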
Bayesian Computation Statistics
Ehsan Ormoz; Farzad Eskandari
Abstract
This paper introduces a novel semiparametric Bayesian approach for bivariate meta-regression. The method extends traditional binomial models to trinomial distributions, accounting for positive, neutral, and negative treatment effects. Using a conditional Dirichlet process, we develop a model to compare treatment and control groups across multiple clinical centers. This approach addresses the challenges posed by confounding factors in such studies. The primary objective is to assess treatment efficacy by modeling response outcomes as trinomial distributions. We employ Gibbs sampling and the Metropolis-Hastings algorithm for posterior computation. These methods generate estimates of treatment effects while incorporating auxiliary variables that may influence outcomes. Simulations across various scenarios demonstrate the model’s effectiveness. We also establish credible intervals to evaluate hypotheses related to treatment effects. Furthermore, we apply the methodology to real-world data on economic activity in Iran from 2009 to 2021. This application highlights the practical utility of our approach in meta-analytic contexts. Our research contributes to the growing body of literature on Bayesian methods in meta-analysis. It provides valuable insights for improving clinical study evaluations.
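The trinomial response model rests on a simple conjugate building block: with a Dirichlet prior on the three response probabilities (positive, neutral, negative), the posterior given observed counts is again Dirichlet. A sketch of drawing from that posterior, one ingredient only, not the paper's conditional Dirichlet process or the full Gibbs/Metropolis-Hastings machinery; the counts below are invented.

```python
import random

def trinomial_posterior_draws(counts, alpha=(1.0, 1.0, 1.0), n_draws=2000, seed=0):
    """Draw p = (p_pos, p_neutral, p_neg) from Dirichlet(alpha + counts),
    the conjugate posterior for trinomial data. Each Dirichlet draw is
    built from Gamma variates and normalised."""
    rng = random.Random(seed)
    post = [a + c for a, c in zip(alpha, counts)]
    draws = []
    for _ in range(n_draws):
        g = [rng.gammavariate(a, 1.0) for a in post]
        s = sum(g)
        draws.append([gi / s for gi in g])
    return draws

# Hypothetical treatment arm: 60 positive, 25 neutral, 15 negative responses.
draws = trinomial_posterior_draws((60, 25, 15))
mean_pos = sum(d[0] for d in draws) / len(draws)
```

Credible intervals for treatment effects follow by taking quantiles of such draws; in the paper this update would sit inside a larger sampler over centers and auxiliary variables.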
Machine Learning
Sahar Abbasi; Radmin Sadeghian; Maryam Hamedi
Abstract
Multi-label classification assigns multiple labels to each instance, which is crucial for tasks like cancer detection in images and text categorization. However, machine learning methods often struggle with the complexity of real-life datasets. To improve efficiency, researchers have developed feature selection methods to identify the most relevant features. Traditional methods, which require all features upfront, fail in dynamic environments such as media platforms with continuous data streams. To address this, novel online methods have been created, yet they often neglect optimizing conflicting objectives. This study introduces an objective search approach using mutual information, feature interaction, and the NSGA-II algorithm to select relevant features from streaming data. The strategy aims to minimize feature overlap, maximize relevance to labels, and optimize online feature interaction analysis. By applying a modified NSGA-II algorithm, a set of non-dominated solutions is identified. Experiments on eleven datasets show that the proposed approach outperforms advanced online feature selection techniques in predictive accuracy, statistical analysis, and stability assessment.
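The two conflicting objectives at the heart of such a search can be made concrete with mutual information: relevance of a feature subset to the labels should be maximized while redundancy among the selected features is minimized. A minimal sketch of the objective computation only (NSGA-II itself, the streaming updates, and the feature-interaction term are not reproduced; names and data are ours):

```python
import math
from collections import Counter

def mutual_information(xs, ys):
    """Mutual information (in nats) between two discrete sequences,
    estimated from empirical frequencies."""
    n = len(xs)
    px, py, pxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    return sum((c / n) * math.log(c * n / (px[x] * py[y]))
               for (x, y), c in pxy.items())

def objectives(subset, features, labels):
    """The two objectives a Pareto search trades off: maximise relevance of
    the subset to the labels, minimise redundancy within the subset."""
    relevance = sum(mutual_information(features[j], labels) for j in subset)
    redundancy = sum(mutual_information(features[i], features[j])
                     for i in subset for j in subset if i < j)
    return relevance, redundancy

# Toy stream snapshot: feature 0 copies the label; feature 1 duplicates it.
labels = [0, 0, 1, 1, 0, 1, 0, 1]
features = {0: list(labels), 1: list(labels)}
rel_a, red_a = objectives([0], features, labels)     # one feature: no redundancy
rel_b, red_b = objectives([0, 1], features, labels)  # duplicate adds redundancy
```

Adding the duplicate raises relevance but also redundancy, so neither subset dominates the other; that tension is exactly what the non-dominated front of NSGA-II captures.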
Machine Learning
Mohammad Zahaby; Iman Makhdoom
Abstract
Breast cancer (BC) is one of the leading causes of death in women worldwide. Early diagnosis of this disease can save many women's lives. The Breast Imaging Reporting and Data System (BIRADS) is a standard method developed by the American College of Radiology (ACR). However, physicians have often been inconsistent in assigning BIRADS values, and the methods used so far have not considered all aspects of the patient in diagnosing this disease. In this article, a novel decision support system (DSS) is presented. In the proposed DSS, c-means clustering was first used to determine the molecular subtype for patients missing this value, by processing mammography reports together with hospital information system (HIS) records obtained from their electronic files. Several classifiers, including convolutional neural networks (CNN), decision tree (DT), multi-level fuzzy min-max neural network (MLF), multi-class support vector machine (SVM), and XGBoost, were then trained to determine the BIRADS value. Finally, the outputs of these classifiers were combined using weighted ensemble learning with a majority voting algorithm to obtain the appropriate BIRADS value, which helps physicians in the early diagnosis of BC. The results were evaluated in terms of accuracy, specificity, sensitivity, positive predictive value (PPV), negative predictive value (NPV), and F1-measure via the confusion matrix; the obtained values were 97.94%, 98.79%, 92.08%, 92.34%, 98.80%, and 92.19%, respectively.
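The final combination step, weighted majority voting over classifier outputs, is simple to sketch. The weights and votes below are illustrative placeholders, not the paper's trained models or their actual validation scores.

```python
from collections import defaultdict

def weighted_majority_vote(predictions, weights):
    """Combine classifier outputs: each model's vote counts with its weight;
    the label accumulating the largest total weight wins."""
    totals = defaultdict(float)
    for label, w in zip(predictions, weights):
        totals[label] += w
    return max(totals, key=totals.get)

# Hypothetical outputs of five trained models for one patient, with weights
# proportional to each model's validation accuracy (invented numbers).
votes = ["BIRADS-4", "BIRADS-3", "BIRADS-4", "BIRADS-4", "BIRADS-3"]
weights = [0.95, 0.80, 0.90, 0.85, 0.75]
combined = weighted_majority_vote(votes, weights)
```

With equal weights this reduces to plain majority voting; unequal weights let a strong classifier outvote two weak ones.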
Bayesian Computation Statistics
Mahdieh Bayati
Abstract
This study generalizes the joint empirical likelihood (JEL) into what we call the joint penalized empirical likelihood (JPEL) and presents a comparative analysis of two innovative empirical likelihood methods: the restricted penalized empirical likelihood (RPEL) and the joint penalized empirical likelihood. These methods extend traditional empirical likelihood approaches by integrating criteria based on the minimum variance and unbiasedness of the estimating equations. In RPEL, estimators are obtained under these two criteria, while JPEL facilitates the joint application of the estimating equations used in RPEL, allowing for broader applicability. We evaluate the performance of RPEL and JPEL in regression models through simulation studies, focusing on parameter accuracy, model selection (as measured by the Empirical Bayesian Information Criterion), predictive accuracy (mean square error), and robustness to outliers. Results indicate that RPEL consistently outperforms JPEL across all criteria, yielding simpler models and more reliable estimates, particularly as sample sizes increase. These findings suggest that RPEL provides greater stability and interpretability for regression models, making it a superior choice over JPEL for the scenarios tested in this study.
Neural Network
Mohammad Hossein Zolfagharnasab; Latifeh PourMohammadBagher; Mohammad Bahrani
Abstract
This study introduces a tailored recommendation system aimed at enriching Iran's tourism sector. Using a hybrid model that combines neural collaborative filtering (NCF) with matrix factorization (MF), our approach leverages both demographic and contextual data of combined tourist-landmark pairs (4177 samples) to provide personalized touristic recommendations. Empirical evaluations of the implemented methods show that the hybrid model outperforms factorization techniques, achieving a test F1 score of 0.84, accuracy of 0.90, and a test error reduction from 0.83 to 0.37. Feature vector integration further improved test recall by 17%, underscoring the model's robustness in capturing user-item relationships. Further analysis using t-SNE, along with visual analyses of embedding structures, confirms the system's ability to generalize patterns in latent space, thereby mitigating the cold-start problem for new tourists or unvisited landmarks. This study also contributes a structured dataset of Iranian landmarks, user ratings, and supplementary contextual data to foster future research in culturally specific intelligent recommender systems. For implementation details, refer to the GitHub repository at https://github.com/MsainZn/Collaborative_Filtering_Tourism_Landmarks.
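The matrix factorization half of such a hybrid can be sketched with plain SGD: learn user and landmark vectors whose dot product approximates observed ratings, so unvisited landmarks get predicted scores from the latent space. A toy sketch only (no NCF network, no demographic features; the ratings are invented):

```python
import random

def train_mf(ratings, n_users, n_items, k=2, lr=0.05, reg=0.02,
             epochs=500, seed=0):
    """Plain matrix factorization: learn user vectors P and item vectors Q
    so that dot(P[u], Q[i]) approximates each observed rating r,
    with L2 regularisation to curb overfitting."""
    rng = random.Random(seed)
    P = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_users)]
    Q = [[rng.gauss(0, 0.1) for _ in range(k)] for _ in range(n_items)]
    for _ in range(epochs):
        for u, i, r in ratings:
            err = r - sum(P[u][f] * Q[i][f] for f in range(k))
            for f in range(k):
                pu, qi = P[u][f], Q[i][f]
                P[u][f] += lr * (err * qi - reg * pu)
                Q[i][f] += lr * (err * pu - reg * qi)
    return P, Q

def predict(P, Q, u, i):
    """Predicted affinity of user u for item i."""
    return sum(pf * qf for pf, qf in zip(P[u], Q[i]))

# Tiny tourist-landmark triples (user, landmark, rating scaled to [0, 1]).
ratings = [(0, 0, 1.0), (0, 1, 0.0), (1, 0, 1.0), (1, 2, 1.0), (2, 1, 1.0)]
P, Q = train_mf(ratings, n_users=3, n_items=3)
```

In the hybrid described above, these latent vectors would be concatenated with demographic/contextual feature vectors and fed to a neural scoring head rather than scored by a bare dot product.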
Statistical Simulation
Shaghayegh Molaei; Kianoush Fathi Vajargah; Hamid Mottaghi Golshan
Abstract
This article examines the probability structure and dependency structure of a new family of Archimedean copula functions that are generated with two generators; this family is known as a generalization of the Archimedean copula functions and provides richer tail dependence properties than the Archimedean family, making it more widely applicable. Using simulations, we compare a member of this family with various existing copula functions to highlight similarities and differences; if the scatter plot of a given copula resembles that of the generalized Archimedean copula in terms of tail dependence, the generalized Archimedean copula function can be fitted to it. Applications of this copula in the financial domain are demonstrated, improving the study of dependence between indicators and exploiting the copula's advantageous characteristics. These theoretical concepts are validated by the numerical example provided at the end of the paper.
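The baseline the paper generalizes is the one-generator Archimedean construction C(u, v) = phi^{-1}(phi(u) + phi(v)). A small sketch using the Clayton generator as the example (our choice; the paper's two-generator family is not reproduced here):

```python
def clayton_generator(t, theta):
    """Clayton generator phi(t) = (t**(-theta) - 1) / theta, theta > 0."""
    return (t ** (-theta) - 1.0) / theta

def clayton_inverse(s, theta):
    """Inverse generator phi^{-1}(s) = (1 + theta*s)**(-1/theta)."""
    return (1.0 + theta * s) ** (-1.0 / theta)

def archimedean_copula(u, v, generator, inverse, **kw):
    """C(u, v) = phi^{-1}(phi(u) + phi(v)): the one-generator Archimedean
    construction that the two-generator family extends."""
    return inverse(generator(u, **kw) + generator(v, **kw), **kw)

theta = 2.0
c = archimedean_copula(0.3, 0.6, clayton_generator, clayton_inverse, theta=theta)
# The generator route must agree with Clayton's closed form.
closed_form = (0.3 ** -theta + 0.6 ** -theta - 1.0) ** (-1.0 / theta)
```

Clayton exhibits lower-tail dependence (coefficient 2**(-1/theta)) but no upper-tail dependence; the appeal of a two-generator family is precisely that it can produce tail behaviour a single generator cannot.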
Neural Network
Najmeh Jabbari Diziche
Abstract
Parkinson's disease (PD) is a common neurological disorder that has a significant impact on the elderly population worldwide. This study investigates the use of deep learning models, including VGG16, ResNet50, and a simple CNN, in classifying MRI images to distinguish between Parkinson's patients and normal subjects. The dataset includes 610 normal subjects and 221 Parkinson's patients. Using ensemble learning techniques with a support vector machine (SVM) as a sub-trainer, our model achieved 96% classification accuracy. Applying various hybrid methods such as majority vote, weighted average, and weighted majority vote to the outputs of the base learning models yielded markedly improved performance and reduced variability in the classification results. These findings promise progress in the accurate diagnosis of Parkinson's disease using deep learning methods in medical imaging. To confirm the practicality of the proposed diagnostic approach, further multicenter studies with larger patient groups are recommended.
Bayesian Network
Vahid Rezaei Tabar; Mohaddeseh Safakish
Abstract
In the modern era, detecting credit card fraud has become a crucial concern from both financial and security standpoints. Given the rarity of fraudulent activities, the issue is reframed as a binary classification challenge that tackles the complexities of imbalanced datasets. To address this, we advocate using Bayesian networks due to their theoretical robustness and capacity to model intricate scenarios while maintaining interpretability under class-skewed distributions. A pivotal component of this meta-learning framework is the cost matrix, so we explore various techniques for its calculation. Employing our meta-learning framework with data from Iran's banking system, we demonstrate a method for determining the cost matrix and then develop the corresponding cost-augmented Bayesian network classifiers, called CABNCs. The outcomes highlight the potential of CATAN to diminish financial loss and the effectiveness of CAGHC-K2 in predicting labels for forthcoming transactions in the context of class imbalance.
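Once a cost matrix is in hand, cost-augmented classification replaces "pick the most probable class" with "pick the class of minimum expected cost". A minimal sketch of that decision rule; the posterior and cost numbers are invented for illustration, and the Bayesian network that would produce the posterior is not shown.

```python
def min_expected_cost_label(posterior, cost):
    """Cost-sensitive decision: given P(class | transaction) and a cost
    matrix cost[predicted][actual], choose the label with the smallest
    expected cost rather than the highest posterior probability."""
    best, best_cost = None, float("inf")
    for pred in posterior:
        expected = sum(cost[pred][actual] * p for actual, p in posterior.items())
        if expected < best_cost:
            best, best_cost = pred, expected
    return best

# Illustrative numbers: fraud is rare, but missing it is far costlier than
# a false alarm, so even a 5% fraud posterior triggers the fraud label.
posterior = {"legit": 0.95, "fraud": 0.05}
cost = {
    "legit": {"legit": 0.0, "fraud": 100.0},  # missed fraud is expensive
    "fraud": {"legit": 1.0, "fraud": 0.0},    # investigating is cheap
}
decision = min_expected_cost_label(posterior, cost)
```

This is why the cost matrix is pivotal under class imbalance: it shifts the decision threshold away from 0.5 in exact proportion to the asymmetry of the losses.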
Mathematical Computing
Zahra Behdani; Majid Darehmiraki
Abstract
The Fuzzy K-Nearest Neighbour (FKNN) method is a classification approach that integrates fuzzy theory with the K-Nearest Neighbour classifier. The algorithm computes the degree of membership of a given instance in each class and then assigns the class with the highest degree of membership as the classification outcome. The algorithm also has applications in regression problems: when the mathematical model of the data is not known, it can be used to estimate and approximate the value of the response variable. This paper introduces a method that incorporates a parameter distance measure, empowering decision-makers to make precise selections across several levels. Furthermore, we provide an analysis of the algorithm's strengths and shortcomings, as well as a comprehensive explanation of the distinctions between the nearest neighbour approach in classification and in regression tasks. Finally, to further elucidate the principles, we present a range of examples that demonstrate the application of nearest neighbour algorithms in the classification and regression of fuzzy numbers.
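The membership computation at the heart of FKNN can be sketched for crisp feature vectors: each of the k nearest neighbours contributes to each class in proportion to an inverse-distance weight (the Keller-style weighting 1/d**(2/(m-1))). This is a minimal sketch with crisp class memberships and invented toy data, not the paper's parameter-distance variant for fuzzy numbers.

```python
def fknn_memberships(train, labels, x, k=3, m=2.0):
    """Fuzzy KNN: membership of x in each class is a distance-weighted
    average over the k nearest neighbours; memberships sum to 1."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(t, x)) ** 0.5, lab)
        for t, lab in zip(train, labels)
    )[:k]
    memberships = {}
    for c in sorted(set(labels)):
        num = den = 0.0
        for d, lab in dists:
            w = 1.0 / max(d, 1e-12) ** (2.0 / (m - 1.0))  # avoid div by zero
            num += w * (1.0 if lab == c else 0.0)
            den += w
        memberships[c] = num / den
    return memberships

train = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.1, 1.0)]
labels = ["low", "low", "high", "high"]
mem = fknn_memberships(train, labels, (0.05, 0.1), k=3)
winner = max(mem, key=mem.get)
```

For regression, the same weighted average is taken over neighbour response values instead of class indicators, which is the classification/regression distinction the paper elaborates.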
Bayesian Computation Statistics
Iman Makhdoom; Shahram Yaghoobzadeh Shahrastani; Ghazalnaz Sharifonnasabi
Abstract
This study focuses on estimating the parameters of the Lindley distribution under a Type-II censoring scheme using Bayesian inference. Three estimation approaches (E-Bayesian, hierarchical Bayesian, and Bayesian methods) are employed, with a focus on vague prior data. The accuracy of the estimates is evaluated using the entropy loss function and the squared error loss function (SELF). We assess the efficiency of the proposed methods through Monte Carlo simulations, utilizing the Lindley approximation and the Markov Chain Monte Carlo (MCMC) technique. To demonstrate its practical applicability, we apply the methodology to a real-world dataset to analyze the performance of the methods in detail. Comparative results from the simulations and data analysis reveal the robustness and accuracy of the proposed approaches. This comprehensive evaluation underscores the advantages of Bayesian methods in parameter estimation under censoring schemes, providing valuable insights for applications in reliability analysis and related fields. The study concludes with a summary of key findings, offering a foundation for further exploration of Bayesian techniques in censored data analysis.
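The MCMC ingredient can be sketched concretely: under Type-II censoring, the r smallest of n failure times are observed, so the Lindley likelihood combines the density f(x) = theta^2/(1+theta) * (1+x) * exp(-theta*x) for the observed times with the survival function S(x) = (1+theta+theta*x)/(1+theta) * exp(-theta*x) at the largest observed time for the n-r censored units. Below is a minimal random-walk Metropolis sampler under a flat (vague) prior on theta > 0; the data are invented, and this is not the paper's E-Bayesian or hierarchical setup.

```python
import math
import random

def lindley_loglik(theta, obs, n):
    """Type-II censored Lindley log-likelihood: `obs` holds the r smallest
    of n failure times; the remaining n - r units survive past max(obs)."""
    if theta <= 0:
        return -math.inf
    r, xr = len(obs), max(obs)
    ll = r * (2 * math.log(theta) - math.log(1 + theta))        # density constants
    ll += sum(math.log(1 + x) - theta * x for x in obs)          # density kernels
    ll += (n - r) * (math.log(1 + theta + theta * xr)            # survival term
                     - math.log(1 + theta) - theta * xr)
    return ll

def mh_lindley(obs, n, n_iter=5000, step=0.3, seed=1):
    """Random-walk Metropolis for theta with a flat vague prior on theta > 0."""
    rng = random.Random(seed)
    theta, ll = 1.0, lindley_loglik(1.0, obs, n)
    draws = []
    for _ in range(n_iter):
        prop = theta + rng.gauss(0, step)
        pll = lindley_loglik(prop, obs, n)
        if math.log(rng.random()) < pll - ll:  # accept with prob min(1, ratio)
            theta, ll = prop, pll
        draws.append(theta)
    return draws

# Hypothetical Type-II sample: the 6 smallest of n = 10 failure times.
obs = [0.12, 0.35, 0.48, 0.90, 1.10, 1.60]
draws = mh_lindley(obs, n=10)
post_mean = sum(draws[1000:]) / len(draws[1000:])  # discard burn-in
```

Posterior summaries under SELF (the posterior mean) or the entropy loss then follow directly from the retained draws.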