python - Show label probability/confidence in NLTK
I'm using the MaxEnt classifier from the Python NLTK library. My dataset has many possible labels and, as expected, MaxEnt returns just one label. I have trained on my dataset and get 80% accuracy. I've also tested my model on unknown data items, and the results are good. However, for any given unknown input, I want to be able to print/display a ranking of all the possible labels based on whatever internal criteria MaxEnt used to select one, such as confidence/probability. For example, suppose I had a, b, c as possible labels, I use maxent.classify(input), and I get one label, let's say c. However, I want to be able to view something like a (0.9), b (0.7), c (0.92), so I can see why c was selected, and possibly choose multiple labels based on those parameters. Apologies for my fuzzy terminology, I'm new to NLP and machine learning.
Solution
Based on the accepted answer, here's a skeleton code example to demonstrate what I wanted and how it can be achieved. There are more classifier examples on the NLTK website.
import nltk

# read_data() and feature_sets() are user-defined functions
contents = read_data('mydataset.csv')
data_set = [(feature_sets(input), label) for (label, input) in contents]
train_set, test_set = data_set[:1000], data_set[1000:]
# collect each distinct label once
labels = sorted(set(label for (input, label) in train_set))

maxent = nltk.MaxentClassifier.train(train_set)
maxent.classify(feature_sets(new_input))  # returns one label

# returns a DictionaryProbDist object
multi_label = maxent.prob_classify(feature_sets(new_input))
for label in labels:
    print(label, multi_label.prob(label))
Try prob_classify(input).
It returns a dictionary-like probability distribution giving the probability of each label; see the docs.
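To make the ranking idea concrete, here is a minimal sketch of how the distribution returned by prob_classify() might be turned into a sorted list of labels, and how several labels could be selected by a cutoff. The helper names rank_labels, labels_above and demo, the toy training data, and the threshold are all invented for illustration; they are not part of NLTK. Note that the probabilities over all labels sum to 1, so values like the ones in the question (0.9, 0.7, 0.92) would not appear together.

```python
def rank_labels(prob_dist):
    """Return (label, probability) pairs sorted from most to least likely.

    Works with any object exposing samples() and prob(label), which is the
    interface of the distribution returned by NLTK's prob_classify().
    """
    pairs = [(label, prob_dist.prob(label)) for label in prob_dist.samples()]
    return sorted(pairs, key=lambda pair: pair[1], reverse=True)


def labels_above(prob_dist, threshold):
    """Return every label whose probability is at least `threshold`."""
    return [label for label, p in rank_labels(prob_dist) if p >= threshold]


def demo():
    """Train a tiny MaxEnt model on made-up data and print the ranking."""
    import nltk  # imported here so the helpers above need no NLTK install

    # Toy data, hypothetical and far too small for a real model.
    train_set = [
        ({"word": "goal"}, "sports"),
        ({"word": "match"}, "sports"),
        ({"word": "vote"}, "politics"),
        ({"word": "senate"}, "politics"),
        ({"word": "stock"}, "finance"),
        ({"word": "market"}, "finance"),
    ]
    maxent = nltk.MaxentClassifier.train(
        train_set, algorithm="iis", trace=0, max_iter=10
    )
    dist = maxent.prob_classify({"word": "goal"})
    for label, p in rank_labels(dist):
        print(f"{label}: {p:.3f}")
    # Hypothetical cutoff for picking more than one label.
    print(labels_above(dist, 0.3))
```

Calling demo() prints one line per label, highest probability first, followed by the labels that pass the cutoff. The right threshold depends on the data and the number of labels, so it would need tuning.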