14 April 2022

Discussion of a higher diploma thesis in the College of Computer Science and Mathematics – Department of Statistics and Informatics

A higher diploma thesis entitled "Detection of extreme values in the linear regression model with application to well water pollution data on the outskirts of Mosul" was discussed on Thursday, 14 April 2022, in the College of Computer Science and Mathematics, Department of Statistics and Informatics, at the University of Mosul. The thesis was submitted by the student Saja Marwan Ismail Al-Arkoub under the supervision of Lecturer Dr. Safwan Nazim Rashid.

The study examined the effect of outliers on the parameter estimates of the multiple linear regression model. Outliers were detected and diagnosed whether they occurred in the independent variables or in the dependent variable, since either case affects the estimation of the model's parameters. The types of outliers and the methods for treating them were identified in order to obtain a more efficient model or to reduce the influence of these values on it. The mean squared error (MSE) was adopted as the criterion for comparing the treatment methods, which were applied to real data obtained from the Dams and Water Resources Research Center, University of Mosul.

The results favored the box-plot method for detecting outliers and, among the treatment methods, the weighted robust M-estimator, which performed best of the methods used.

The thesis aims to present the multiple linear regression model, the outliers to which data may be exposed, the methods for detecting them and their types, and their impact on the model's parameters; to treat them with different methods and compare the results of the analysis; and to estimate a multiple linear regression model on real well water pollution data that can be used to predict the values of the dependent variable as accurately as possible.

The discussion committee was chaired by Assistant Professor Dr. Haifa Abdel-Gawad Saeed, with Assistant Professor Dr. Muzahim Muhammad Yahya as a member and Lecturer Dr. Safwan Nazim Rashid as member and supervisor.
