python - sklearn.linear_model.RandomizedLogisticRegression: Handle Categorical Value
I want to use RandomizedLogisticRegression to select variables from a data set. The problem is that one of the features in the data set is gender, and its values are 'f' or 'm' instead of numerical values. As a result, I am getting the following error:
    Traceback (most recent call last):
      File "main.py", line 84, in customer_acquisition_binary_logistics
        self.randomized_logistic_regression()
      File "main.py", line 92, in randomized_logistic_regression
        randomized_logistic.fit(x,y)
      File "c:\python27\lib\site-packages\sklearn\linear_model\randomized_l1.py", line 91, in fit
        x = as_float_array(x, copy=False)
      File "c:\python27\lib\site-packages\sklearn\utils\validation.py", line 112, in as_float_array
        return x.astype(np.float32 if x.dtype == np.int32 else np.float64)
    ValueError: could not convert string to float: f
How can I handle a categorical value that is not numeric? Thank you.
You have to encode the string values as numeric values. Use LabelEncoder:
    In [33]: from sklearn import preprocessing
             le = preprocessing.LabelEncoder()
             print(le.fit(["paris", "paris", "tokyo", "amsterdam"]))
             print(list(le.classes_))
             print(le.transform(["tokyo", "tokyo", "paris"]))
             print(list(le.inverse_transform([2, 2, 1])))

    LabelEncoder()
    ['amsterdam', 'paris', 'tokyo']
    [2 2 1]
    ['tokyo', 'tokyo', 'paris']
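Applied to the question, the same idea might look like the sketch below. It is only an illustration: the toy rows, the age column, and the names rows, genders, gender_encoder and x are made up here, since the original data set is not shown. It encodes the 'f'/'m' column and builds a fully numeric array of the kind that fit() can convert to float.

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder

# Hypothetical toy data (not from the question): column 0 is age,
# column 1 is gender stored as the strings 'f' or 'm'.
rows = [
    [25, "f"],
    [32, "m"],
    [47, "m"],
    [51, "f"],
]

ages = [row[0] for row in rows]
genders = [row[1] for row in rows]

# Fit the encoder on the gender column only. The classes are stored
# in sorted order, so 'f' becomes 0 and 'm' becomes 1.
gender_encoder = LabelEncoder()
gender_codes = gender_encoder.fit_transform(genders)

# Rebuild the feature matrix with the numeric codes in place of the
# strings, so that converting it to float no longer fails.
x = np.column_stack([ages, gender_codes]).astype(np.float64)

print(list(gender_encoder.classes_))
print(x)
```

For a two-valued feature such as gender, the 0/1 codes are harmless. For a feature with more than two categories, integer codes suggest an ordering that may not exist, so a one-hot encoding is often the safer choice.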