python - sklearn.linear_model.RandomizedLogisticRegression: Handle Categorical Value


I want to use RandomizedLogisticRegression to select variables in my data set. The problem is that one of the features in the data set is gender, and its values are 'f' or 'm' instead of numerical values. As a result, I am getting the following error:

Traceback (most recent call last):
  File "main.py", line 84, in customer_acquisition_binary_logistics
    self.randomized_logistic_regression()
  File "main.py", line 92, in randomized_logistic_regression
    randomized_logistic.fit(x,y)
  File "c:\python27\lib\site-packages\sklearn\linear_model\randomized_l1.py", line 91, in fit
    x = as_float_array(x, copy=False)
  File "c:\python27\lib\site-packages\sklearn\utils\validation.py", line 112, in as_float_array
    return x.astype(np.float32 if x.dtype == np.int32 else np.float64)
ValueError: could not convert string to float: f

How can I handle a categorical value that is not numeric? Thank you.

You have to encode the str values as numeric values. Use LabelEncoder:

In [33]:
    from sklearn import preprocessing
    le = preprocessing.LabelEncoder()
    print(le.fit(["paris", "paris", "tokyo", "amsterdam"]))
    print(list(le.classes_))
    print(le.transform(["tokyo", "tokyo", "paris"]))
    print(list(le.inverse_transform([2, 2, 1])))

    LabelEncoder()
    ['amsterdam', 'paris', 'tokyo']
    [2 2 1]
    ['tokyo', 'tokyo', 'paris']
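Applied to the gender column from the question, the same idea might look like the sketch below. The sample values and the `genders`, `ages` and `x` names are made up for illustration; only `LabelEncoder` comes from the answer above. The point is that once the strings are replaced by integer codes, the feature matrix converts to float without the ValueError.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical sample data: a string-valued gender column and a numeric column.
genders = ["f", "m", "m", "f"]
ages = [34, 51, 27, 45]

# Fit the encoder on the gender values and replace them with integer codes.
# Classes are stored in sorted order, so 'f' maps to 0 and 'm' maps to 1.
le = LabelEncoder()
gender_codes = le.fit_transform(genders)

# Build an all-numeric feature matrix; this conversion is the step that
# failed in the traceback when the column still held strings.
x = np.column_stack([gender_codes, ages]).astype(np.float64)

print(list(le.classes_))
print(x)
```

With only two categories the integer codes are harmless. For a feature with three or more categories the codes would imply an ordering that may not exist, so one-hot encoding (for example `sklearn.preprocessing.OneHotEncoder`) is probably the safer choice there.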
