statistics - Last observation carried forward hot deck imputation
I am working on a college assignment on data mining and knowledge discovery.
The question is as follows:
Fill in the holes using "last observation carried forward hot deck imputation".
temp   spots  age   diagnosis
103.0  yes    6     measles
94.7   yes    1     not measles
100.1  yes    -2    measles
102.0  no     20    not measles
96.5   no     ?     not measles
97.2   yes    30    not measles
104.5  no     2400  not measles
101.9  ?      7     measles
?      yes    8     measles
99.8   yes    4     measles
I searched on the internet and found that LOCF and hot deck imputation are two different methods for dealing with missing data. The question is asking for a combination of both.
Is there a special case that fills in data using both methods?
This is what I found on Wikipedia:
One form of hot-deck imputation is called "last observation carried forward", which involves sorting a dataset according to any of a number of variables, thus creating an ordered dataset. The technique then finds the first missing value and uses the cell value immediately prior to the data that are missing to impute the missing value. The process is repeated for the next cell with a missing value until all missing values have been imputed. In the common scenario in which the cases are repeated measurements of a variable for a person or other entity, this represents the belief that if a measurement is missing, the best guess is that it hasn't changed from the last time it was measured.
I did not understand much of it. How does it work for the spots attribute, which has the value yes or no?
Basically, the method goes like this: when an attribute is missing, sort the rows, and carry forward the value from the last row before it that has the attribute. So, if we're missing the age in
96.5 no ? not measles
then we sort (possibly using arbitrary decisions, e.g., that temperature is more important than spots, and that "no" < "yes"), and the row just before it is
94.7 yes 1 not measles
(note that a different decision on the ordering might yield a different result). So we fill in the age as 1.
The other missing values are filled in the same way.
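The steps above can be sketched in plain Python. This is only an illustration under some assumptions that are not part of the question: rows are sorted by temperature first and spots second, a missing sort value is placed after all observed ones, and a missing cell with nothing before it stays missing. The names `ROWS`, `sort_key` and `locf_hot_deck` are made up for this example.

```python
# Rows from the question; None marks a missing value ("?").
ROWS = [
    {"temp": 103.0, "spots": "yes", "age": 6,    "diagnosis": "measles"},
    {"temp": 94.7,  "spots": "yes", "age": 1,    "diagnosis": "not measles"},
    {"temp": 100.1, "spots": "yes", "age": -2,   "diagnosis": "measles"},
    {"temp": 102.0, "spots": "no",  "age": 20,   "diagnosis": "not measles"},
    {"temp": 96.5,  "spots": "no",  "age": None, "diagnosis": "not measles"},
    {"temp": 97.2,  "spots": "yes", "age": 30,   "diagnosis": "not measles"},
    {"temp": 104.5, "spots": "no",  "age": 2400, "diagnosis": "not measles"},
    {"temp": 101.9, "spots": None,  "age": 7,    "diagnosis": "measles"},
    {"temp": None,  "spots": "yes", "age": 8,    "diagnosis": "measles"},
    {"temp": 99.8,  "spots": "yes", "age": 4,    "diagnosis": "measles"},
]


def sort_key(row, columns):
    # Assumption: missing values sort after observed ones in ascending order.
    # The (is_missing, value) pair avoids comparing None with numbers or strings.
    return tuple(
        (row[c] is None, row[c] if row[c] is not None else 0) for c in columns
    )


def locf_hot_deck(rows, order, reverse=False):
    """Sort the rows by `order`, then fill each missing cell with the last
    value seen for that column earlier in the sorted order.
    Returns new rows in the original row order; the input is not modified."""
    indexed = sorted(
        enumerate(rows),
        key=lambda pair: sort_key(pair[1], order),
        reverse=reverse,
    )
    filled = [None] * len(rows)
    last_seen = {}  # column -> most recent value in sorted order
    for original_index, row in indexed:
        new_row = dict(row)
        for column, value in row.items():
            if value is None:
                # Stays None when no earlier row has a value for this column.
                new_row[column] = last_seen.get(column)
            else:
                last_seen[column] = value
        filled[original_index] = new_row
    return filled


by_temp = locf_hot_deck(ROWS, order=("temp", "spots"))
print(by_temp[4]["age"])  # 1, donated by the "94.7 yes 1" row

by_temp_desc = locf_hot_deck(ROWS, order=("temp",), reverse=True)
print(by_temp_desc[4]["age"])  # 30, donated by the "97.2 yes 30" row
```

Sorting by ascending temperature reproduces the age of 1 from the worked example, while sorting the same data in descending order gives 30 for the same cell, which is the dependence on ordering noted above.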