statistics - Last observation carried forward hot deck imputation


I am working on a college assignment on data mining and knowledge discovery.

The question is as follows:

Fill in the holes using "last observation carried forward hot deck imputation".

temp     spots   age    diagnosis
103.0    yes     6      measles
94.7     yes     1      not measles
100.1    yes     -2     measles
102.0    no      20     not measles
96.5     no      ?      not measles
97.2     yes     30     not measles
104.5    no      2400   not measles
101.9    ?       7      measles
?        yes     8      measles
99.8     yes     4      measles

I searched the internet and found that LOCF and hot deck imputation are two different methods for dealing with missing data. The question seems to be asking for a combination of both.

Is there a special case in which data is filled in using both methods?

This is what I found on Wikipedia:

One form of hot-deck imputation is called "last observation carried forward", which involves sorting a dataset according to any of a number of variables, thus creating an ordered dataset. The technique then finds the first missing value and uses the cell value immediately prior to the data that are missing to impute the missing value. The process is repeated for the next cell with a missing value until all missing values have been imputed. In the common scenario in which the cases are repeated measurements of a variable for a person or other entity, this represents the belief that if a measurement is missing, the best guess is that it hasn't changed from the last time it was measured.

I did not understand much of it. How does it work for the spots attribute, which has the value yes or no?

Basically, the method goes like this: when an attribute is missing, sort the rows, and carry forward the value from the last row before it that has the attribute. So, if we're missing the age in

96.5 no ? not measles

then we sort (possibly using arbitrary decisions, e.g., that temperature is more important than spots, and that "no" < "yes"), and the row that ends up just before it is

94.7 yes 1 not measles

(note that a different decision on the ordering can yield a different result). So we fill in the age as 1.
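The sort-then-carry-forward step for the age example can be sketched in Python roughly as follows. This is only an illustration under stated assumptions: the names `locf_hot_deck`, `by_temp` and `age_of` are made up for this sketch, `None` stands for the `?` cells, and putting the row with the missing temperature last in the ordering is an arbitrary choice.

```python
# Rows from the question; None marks a missing value ("?").
rows = [
    {"temp": 103.0, "spots": "yes", "age": 6,    "diagnosis": "measles"},
    {"temp": 94.7,  "spots": "yes", "age": 1,    "diagnosis": "not measles"},
    {"temp": 100.1, "spots": "yes", "age": -2,   "diagnosis": "measles"},
    {"temp": 102.0, "spots": "no",  "age": 20,   "diagnosis": "not measles"},
    {"temp": 96.5,  "spots": "no",  "age": None, "diagnosis": "not measles"},
    {"temp": 97.2,  "spots": "yes", "age": 30,   "diagnosis": "not measles"},
    {"temp": 104.5, "spots": "no",  "age": 2400, "diagnosis": "not measles"},
    {"temp": 101.9, "spots": None,  "age": 7,    "diagnosis": "measles"},
    {"temp": None,  "spots": "yes", "age": 8,    "diagnosis": "measles"},
    {"temp": 99.8,  "spots": "yes", "age": 4,    "diagnosis": "measles"},
]


def locf_hot_deck(rows, column, sort_key=None):
    """Return copies of the rows, ordered by sort_key, with gaps in
    `column` filled from the closest earlier row that has a value."""
    ordered = sorted(rows, key=sort_key) if sort_key else list(rows)
    filled = []
    last_seen = None  # nothing to carry forward yet
    for row in ordered:
        row = dict(row)  # copy, so the input rows are left untouched
        if row[column] is None:
            row[column] = last_seen  # stays None if no earlier value exists
        else:
            last_seen = row[column]
        filled.append(row)
    return filled


def by_temp(row):
    # Assumption: a row whose temperature is missing sorts last.
    return (row["temp"] is None, row["temp"] or 0.0)


def age_of(filled_rows, temp):
    """Look up the (possibly imputed) age of the row with this temperature."""
    return next(r["age"] for r in filled_rows if r["temp"] == temp)


by_temperature = locf_hot_deck(rows, "age", sort_key=by_temp)
in_file_order = locf_hot_deck(rows, "age")  # no sorting at all

print(age_of(by_temperature, 96.5))  # 1, carried from the 94.7 row
print(age_of(in_file_order, 96.5))   # 20, carried from the 102.0 row
```

The two printed values differ only because of the ordering, which is the point made above: the imputed value depends on how the rows are sorted.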

And so on for the other missing values.
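The yes/no spots attribute needs no special handling, because carrying a value forward only copies it and never does arithmetic on it. A minimal sketch, assuming the spots column has already been put in ascending temperature order with the missing-temperature row last (one possible ordering among many); `carry_forward` is a name invented for this illustration:

```python
def carry_forward(values):
    """Replace each None with the most recent non-missing value before it."""
    filled = []
    last_seen = None  # nothing observed yet, so a leading gap stays None
    for value in values:
        if value is not None:
            last_seen = value
        filled.append(last_seen)
    return filled


# spots column of the table, ordered by ascending temp:
# 94.7, 96.5, 97.2, 99.8, 100.1, 101.9 (missing spots), 102.0, 103.0, 104.5, ?
spots_by_temp = ["yes", "no", "yes", "yes", "yes", None, "no", "yes", "no", "yes"]

filled_spots = carry_forward(spots_by_temp)
print(filled_spots[5])  # "yes", copied from the 100.1 row just before it
```

A gap in the very first position has nothing before it to copy, so this sketch leaves it as `None`; how to treat that case is a further decision the method itself does not settle.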

