DECEMBER 2014
A ROBUST MULTIPLE IMPUTATION MODEL UNDER
NON-RANDOMLY MISSING DATA WITH OUTLIERS
By
Department of Mathematics
University of Lagos
December 2014
CERTIFICATION
This is to certify that the thesis:
Submitted to the
By
(899008084)
To God Almighty I give all the glory and thanks for making it possible for me to successfully
complete this program. I will forever remain grateful for everything, Lord.
I would like to express my profound gratitude to my supervisor, Professor Ray Okafor, for his
guidance, thoroughness, constructive criticism, patience and support, which made this work
possible. I sincerely appreciate the opportunity he gave me to work under him.
My profound gratitude also goes to my second supervisor, Dr Hamadu Dallah, who took an
interest in this research work and devoted much of his time to its success. I appreciate
the in-depth suggestions and comments that helped greatly in bringing the quality of this
work to this standard. I must sincerely say that it has really been a rewarding experience
I acknowledge with thanks the immense contributions made by all staff of the Department of
Mathematics, University of Lagos, for creating a friendly and conducive environment that made
my stay at the University of Lagos worthwhile. I cherish and will always keep the bond of
I wish to offer my sincere thanks and gratitude to my wife, Maryam Nicholas, and my children,
Daniel, Musa and Ibrahim, for helping to carve out the time necessary and for their understanding
and sacrifice throughout the period of this program. I appreciate you all for everything.
To all my outstanding friends, Dr Johnson Olumuyiwa Agunsoye, Joshua Ali, Maria
Lorena Aguilera, Baba Salihu, Bukar Shuwa, Kama Mohammed, Fatoki Olayode,
B. A. Nkemnole, Akarawak and Ehige: it has really been more than friendship. I will forever
remain grateful for meeting such wonderful people in life.
I also owe a significant debt of gratitude to all those who helped to review the work. Your
comments, suggestions and constructive criticism greatly helped to improve the quality of the
List of Tables
1 CHAPTER ONE
2 CHAPTER TWO
LITERATURE REVIEW
2.5 Outliers
3 CHAPTER THREE
METHODOLOGY
3.0 Introduction
3.8.4 Consistency of
3.13 Software
4 CHAPTER FOUR
4.1 Introduction
5 CHAPTER FIVE
6 References
In all fields of knowledge, decision making and problem solving are essential, and these critical
functions need accurate information obtained from reliable data. Missing values and outliers
are of great concern, because either ignoring them or handling them inappropriately may
cause forecasting models to describe neither the bulk of the data nor the outliers, leading
to incorrect decisions. The properties of multiple imputation in contaminated multivariate data
under a non-ignorable missingness mechanism were investigated. The generalised S-estimator
(GSE) algorithm is used for the robust estimation of location and scale parameters.
The behaviour of the proposed method was studied under different design conditions, and
the method was shown theoretically to be unbiased and consistent under elliptical
models. Monte Carlo simulation studies with four factors (response rate, contamination
level, missingness mechanism and sample size), yielding 135 different design
conditions, were evaluated, and the findings support the theoretical results empirically. A high
rate of missing values, sample size and contamination level were found to have a significant
influence on the performance of the estimators. Shorter confidence intervals and robust
standard errors were achieved, with the process converging rapidly. The simulation results
show that, in all sample sizes, the proposed method performed better under the missing not at
random (MNAR) mechanism than when values were missing at random (MAR) or missing
completely at random (MCAR). Finally, the proposed method was applied to two sets of incomplete,
contaminated multivariate real-life data, on poverty and inequality and on Masakwa seedling
survival rate, and the results obtained support the simulation findings.
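The three missingness mechanisms compared in the simulation study can be illustrated with a minimal sketch. This is not the thesis's R code and it does not implement the GSE or the proposed imputation model; it only shows how the probability that a value goes missing is defined under each mechanism. The bivariate normal data, the correlation of 0.6, the logistic coefficients, the sample size and all function names are illustrative assumptions.

```python
import math
import random


def simulate(n, seed=0):
    """Draw n pairs (x, y) from a standard bivariate normal (assumed correlation 0.6)."""
    rng = random.Random(seed)
    rho = 0.6
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        y = rho * x + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0)
        data.append((x, y))
    return data


def logistic(t):
    return 1.0 / (1.0 + math.exp(-t))


def make_missing(data, mechanism, seed=1):
    """Return the y values, with entries replaced by None according to the mechanism."""
    rng = random.Random(seed)
    out = []
    for x, y in data:
        if mechanism == "MCAR":
            p = 0.3                       # constant: unrelated to any data value
        elif mechanism == "MAR":
            p = logistic(-1.0 + 1.5 * x)  # depends only on the always-observed x
        elif mechanism == "MNAR":
            p = logistic(-1.0 + 1.5 * y)  # depends on the value that goes missing
        else:
            raise ValueError(mechanism)
        out.append(None if rng.random() < p else y)
    return out


def observed_mean(values):
    """Mean of the entries that were not deleted."""
    obs = [v for v in values if v is not None]
    return sum(obs) / len(obs)


data = simulate(20000)
means = {m: observed_mean(make_missing(data, m)) for m in ("MCAR", "MAR", "MNAR")}
for name, value in means.items():
    print(f"{name}: mean of observed y = {value:.3f}")
```

The true mean of y is zero. In this sketch the mean of the observed values stays close to zero under MCAR and is pulled downward under MAR and, more strongly, under MNAR. Under MAR the distortion can be corrected by conditioning on the observed x, whereas under MNAR the missingness depends on the unobserved value itself, which is why the mechanism is called non-ignorable.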
UNIVERSITY OF LAGOS
SCHOOL OF POSTGRADUATE STUDIES
DEPARTMENT OF MATHEMATICS
Department: Mathematics
Qualifications: B.Sc. (Hons) Statistics (Second Class Lower), A.B.U., Zaria, 1987
M.Sc. (Statistics), University of Lagos, 1990
1st SUPERVISOR
2nd SUPERVISOR
Name: Dr. Hamadu Dallah
Designation: Associate Professor
Department: Actuarial Science
Proposed Title of Thesis: Robust Multiple Imputation under Non-randomly Missing Data
with Outliers
Recommendations
The Departmental Postgraduate Committee at its meeting held on August 28, 2014
considered the above application and recommends same to the Board of the School of
Postgraduate Studies for approval.
______________________________ _______________________
The candidate's work is original. The candidate worked on a new Robust Multiple Imputation
method that is consistent and efficient in handling contaminated multivariate data under
non-ignorable missingness.
The candidate has demonstrated a good understanding and a high level of competence in
handling the joint problem of outliers and missing values under non-ignorable missingness.
He has also demonstrated good knowledge and understanding of the development of R code
for simulation and application to practical problems.
The candidate has developed a Robust Multiple Imputation model that performs very well on
contaminated data under non-ignorable missingness. The proposed method has been shown to be
robust, consistent and efficient, and can be generalized to other more complex multivariate
models such as mixed models and generalized linear models.
Potential worth of the content of the research material for purpose of publication:
The content of the work done is publishable in both local and international journals. At least
three papers have been prepared and are under the process of publication.
The research work makes three major contributions to knowledge, which include: the proposed
Robust Multivariate Multiple Imputation method, which is useful in enlarging the theory and
application of multiple imputation in handling missing data and outliers; and the developed
simulation R code, which can easily be generalized and applied to handle more complex
multivariate models such as generalised linear models and mixed linear models.
SECTION D
An assessment of progress in the research during the period, including any serious
delay or very rapid progress in the student's work:
During the period of the candidate's study, he has shown a high level of competence and
exhibited great commitment and hard work in his research. At the same tempo, he
should be able to finish the programme within a few months from now.
Section E: Particulars of Supervisors
1st SUPERVISOR
2nd SUPERVISOR
ATTESTATION
It is hereby confirmed that on August 28, 2014, DIBAL, Nicholas Pindar, with matriculation
number 899008084, successfully defended before the Department's Postgraduate Committee
the Ph.D. proposal titled "Robust Multiple Imputation under Non-randomly Missing Data
with Outliers".
The Departmental Postgraduate Committee at its meeting of August 28, 2014 considered the
above application and recommends same to the Board of the School of Postgraduate Studies
for approval.
_____________________________ __________________________
Professor S. O. Ajala Dr. O. J. Fenuga
Head of Department Departmental PG Coordinator
_____________________________ ___________________________
Professor R. Okafor Professor S. A. Okunuga
____________________________ ___________________________
Professor J. O. Olaleru Professor S. S. Okoya
____________________________ ___________________________
Dr. I. O. Abiala Mr A. Adeniyan
_____________________________ ___________________________
Dr M. O. Adamu Dr H. Akewe
LIST OF CORRECTIONS MADE
Page
1 Recast Abstract
Yakubu Bukar (blessed memory), Danjuma Bako Mshelia, Simon Malgwi, Prof.
Bello Mshelia, Mr Isah Jonah, Bitrus Salihu Hena, Timothy A. Mbaya, Charity
Haruna, Dolapo Aramide, Ifeoma Avemaria Obasi, Diana,
_______________________ __________________________
Prof. S. O. Ajala Dr O. J. Fenuga
Head, Department of Mathematics Chairman, Departmental Postgraduate Committee
UNIVERSITY OF LAGOS
SCHOOL OF POSTGRADUATE STUDIES
DEPARTMENT OF MATHEMATICS
RECOMMENDATION FOR APPROVAL OF PANEL OF EXAMINERS
Section A: PARTICULARS OF CANDIDATE
Name: DIBAL Nicholas Pindar
MATRIC NO: 899008084
QUALIFICATIONS: B.Sc. (Hons) Statistics (Second Class Lower), A.B.U., Zaria, 1987
M.Sc. Statistics (UNILAG), 1990
DEGREE IN VIEW: Ph.D. Statistics
DATE OF FIRST REGISTRATION: 15th January, 2010
STATUS: Full Time
APPROVED TITLE OF THESIS: A Robust Multiple Imputation Model Under
Non-Randomly Missing Data with Outliers
FIELD OF STUDY: Sample Survey
DATE OF APPROVAL: 24th December, 2014
SUPERVISORS: 1) Prof. R. Okafor
2) Dr H. Dallah (Associate Professor)
Section B: PROPOSED PANEL OF EXAMINERS:
1 EXTERNAL EXAMINER: Prof.
Department of Mathematics
University of
Tel: 080
e-mail:
2 INTERNAL EXAMINER 1: Prof.
Department of Mathematics
University of
Tel:
e-mail:
3 INTERNAL EXAMINER 2: Prof.
Department of Mathematics
University of
Tel:
e-mail:
POSTGRADUATE REPRESENTATIVE: Prof
Department of
University of Lagos, Akoka
Tel: 080
e-mail:
RECOMMENDATION
The Departmental Postgraduate Committee at its meeting held on December, 2014 considered the above
application and recommends same to the Board of Postgraduate Studies for approval.
_________________________ _____________________________________
Prof S. O. Ajala Dr O. J. Fenuga
Head, Department of Mathematics Chairman, Departmental Postgraduate Committee
UNIVERSITY OF LAGOS
SCHOOL OF POSTGRADUATE STUDIES
DEPARTMENT OF MATHEMATICS
INTERNAL : 1) Prof.
Department of
University of Lagos, Akoka
2) Prof
Department of
University of Lagos, Akoka
RECOMMENDATION
We recommend that the Thesis be accepted and the degree of Doctor of Philosophy (Ph.D.) in Statistics be
awarded to the candidate DIBAL Nicholas Pindar subject to item (vi) above.
SIGNATURES
Prof. Dr Adeleke Ismail
INTERNAL EXAMINER INTERNAL EXAMINER
Prof.
EXTERNAL EXAMINER
01/01/2015 Dr Dr
Date P.G. Representative Chairman of Panel
Certification
We certify that the thesis has been corrected in accordance with the comments of the examiners to our
satisfaction. We therefore recommend that the degree of Doctor of Philosophy (Ph.D.) in Statistics be awarded
to the candidate DIBAL Nicholas Pindar.
Signatures
Prof. Dr.
INTERNAL EXAMINER INTERNAL EXAMINER
07/01/2015
DEPARTMENTAL RECOMMENDATION:
The Departmental Postgraduate Committee, at its meeting of 10/01/2015, considered the above
Ph.D. Examiners' Report and recommends it to the Board of the School of Postgraduate Studies for approval.