Description
C. H. C. Leung¹, PhD, Y. X. Li², PhD and L. J. Hao³
¹School of Science and Engineering and Guangdong Provincial Key Laboratory of Future Networks of Intelligence, Chinese University of Hong Kong, Shenzhen, China
²Department of Computer Science, Hong Kong Baptist University, Hong Kong, China
³School of Data Science, Chinese University of Hong Kong, Shenzhen, China

Part of the book: Advances in Business and Management. Volume 20
Chapter DOI: https://doi.org/10.52305/IRSN6620

Abstract
Many decision-making scenarios can be viewed as classification problems. Classification decisions are pervasive and occur in many business situations. In many applications, classification problems do not occur individually but in groups, where several classification problems need to be solved. Examples include student admissions at colleges, whether or not to extend job offers to applicants, the effectiveness of advertising channels, and determining whether COVID patients should be hospitalized. With any form of classification, however, there are unavoidable inaccuracies arising in different forms, especially when multiple classification tasks need to be performed. Since typical classifiers are not free from errors, classification errors tend to accumulate, and frequent misclassifications are often unacceptable. Moreover, in unsupervised learning situations, there are typically no pre-determined ground truth classes; in such a situation the ground truth class is determined by the view of the majority of classifiers. In this chapter, we examine the situation of multiple classifications within the Naïve Bayes framework, where the ground truth is determined by the decision of most classifiers, and where there are finite resources requiring decisions to be made within a limited budget. Here, we represent the classification tasks as a one-dimensional random walk process and perform a probabilistic analysis of the situation.
We find that by raising the budget, the probability of classification error can be controlled, and the extent of the reduction can be quantified. These results can be beneficially deployed in a variety of business decision-making situations to measure and enhance the quality of decisions.

Keywords: binary classification, naïve Bayes classifier, multiple classification, random walk, unsupervised learning
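
The abstract's claim that a larger budget lowers the probability of a wrong majority decision can be illustrated with a small Python sketch. It is not the chapter's own analysis. It assumes, for illustration only, that each item has a latent true label, that every vote is independent and correct with the same probability p, and that the budget is an odd number of votes so that ties cannot occur. The function names and the value p = 0.7 are hypothetical. Each vote moves a one-dimensional walk by +1 if it is correct and by -1 if it is wrong, and the majority decision is wrong when the walk ends below zero.

```python
import random
from math import comb


def majority_error_exact(budget: int, p: float) -> float:
    """Probability that the majority of `budget` independent votes is wrong.

    Assumption (not from the chapter): every vote is correct with the same
    probability p, and the budget is odd so that ties cannot occur.
    """
    if budget % 2 == 0:
        raise ValueError("use an odd budget so that ties cannot occur")
    # The majority is wrong when at most (budget - 1) // 2 votes are correct.
    max_correct = (budget - 1) // 2
    return sum(
        comb(budget, k) * p**k * (1 - p) ** (budget - k)
        for k in range(max_correct + 1)
    )


def majority_error_simulated(
    budget: int, p: float, trials: int = 20000, seed: int = 0
) -> float:
    """Estimate the same probability by simulating the random walk."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(trials):
        position = 0
        for _ in range(budget):
            # A correct vote is a step of +1, an incorrect vote a step of -1.
            position += 1 if rng.random() < p else -1
        if position < 0:  # the walk ended on the wrong side of zero
            wrong += 1
    return wrong / trials


if __name__ == "__main__":
    accuracy = 0.7  # hypothetical per-vote accuracy
    for budget in (1, 5, 11, 21):
        exact = majority_error_exact(budget, accuracy)
        estimate = majority_error_simulated(budget, accuracy)
        print(f"budget={budget:2d}  exact={exact:.4f}  simulated={estimate:.4f}")
```

Under these assumptions the exact error is a binomial tail, so it shrinks as the budget grows whenever p exceeds 0.5. The simulation is included only as a check on the closed-form sum.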






