python - Approach where features are a combination of text (labels) and numerical data


I'm trying to figure out an approach for a data set that includes text (really more like labels) and numeric data. For example, in my data set I have city, state, and lat/lon, and I want to classify. This is supervised, so I have labels (y) for the data.

So in this case, the text is not a bag of words or anything like that. It is a label, more like 0, 1, ... However, I don't think I want to give the algorithm the idea that these are real values. I have tried a couple of different algorithms, including svm.SVC, LinearSVC, and a decision tree. For the SVMs, I converted city and state to numeric values using a couple of different methods, including LabelEncoder. That doesn't seem right intuitively, and I'm not satisfied with the score.
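To make the concern concrete, here is a minimal sketch of what the integer encoding looks like. The city names are made up for illustration and are not from the actual data set.

```python
from sklearn.preprocessing import LabelEncoder

# Hypothetical city values, invented for this illustration
cities = ["Austin", "Boston", "Chicago", "Boston"]

encoder = LabelEncoder()
city_codes = encoder.fit_transform(cities)

# Classes are sorted, then numbered 0..n-1
print(encoder.classes_.tolist())  # ['Austin', 'Boston', 'Chicago']
print(city_codes.tolist())        # [0, 1, 2, 1]
```

A model that treats these codes as real values would see Chicago (2) as "twice" Boston (1) and farther from Austin (0) than Boston is, even though the numbering only reflects alphabetical order.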

Any thoughts or input would be appreciated.

It looks like you are looking for OneHotEncoder. For an explanation, take a look at the "Encoding categorical features" section of the docs. The idea is to make a column for each city, with 0/1 values indicating whether the sample belongs to that city. You might also be interested in DictVectorizer.
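As a rough sketch of both suggestions, assuming a tiny made-up data set with city, state, lat and lon (the values and variable names below are illustrative, not from the question): OneHotEncoder turns the categorical columns into 0/1 columns, which are then stacked next to the numeric ones, while DictVectorizer does the same from a list of dicts, one-hot encoding string values and leaving numeric values unchanged.

```python
import numpy as np
from sklearn.feature_extraction import DictVectorizer
from sklearn.preprocessing import OneHotEncoder

# Hypothetical samples: categorical columns (city, state) ...
categorical = np.array([
    ["Austin", "TX"],
    ["Boston", "MA"],
    ["Dallas", "TX"],
])
# ... and numeric columns (lat, lon) for the same rows
numeric = np.array([
    [30.27, -97.74],
    [42.36, -71.06],
    [32.78, -96.80],
])

# One 0/1 column per distinct city and per distinct state
encoder = OneHotEncoder(handle_unknown="ignore")
one_hot = encoder.fit_transform(categorical).toarray()

# Put the one-hot columns next to the numeric features
X = np.hstack([one_hot, numeric])
print(X.shape)  # (3, 7): 3 city columns + 2 state columns + lat + lon

# Same idea with DictVectorizer: strings are one-hot encoded,
# numeric values are passed through as they are
records = [
    {"city": "Austin", "state": "TX", "lat": 30.27, "lon": -97.74},
    {"city": "Boston", "state": "MA", "lat": 42.36, "lon": -71.06},
    {"city": "Dallas", "state": "TX", "lat": 32.78, "lon": -96.80},
]
vectorizer = DictVectorizer(sparse=False)
X_dict = vectorizer.fit_transform(records)
print(X_dict.shape)  # (3, 7)
```

Either matrix can then be passed to a classifier such as LinearSVC together with the labels y. With many distinct cities the one-hot matrix gets wide, so keeping it sparse (skipping `.toarray()` and `sparse=False`) may be preferable.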

