python - Fuzzy-match List of People
I am trying to see if a movie is the same between two pages, and I compare the actors as one of the criteria. However, the actors are listed differently on different pages. For example:
On one page, https://play.google.com/store/movies/details?id=csdcb2koh74, the actors are listed as "mikhail galustyan, danny trejo, guillermo díaz, oleg taktarov, kym whitley, christopher robin miller, robert bear, vladimir yaglych, josh mclerran"
On another page, http://www.imdb.com/title/tt2167970/, the actors are "ivan stebunov, ingrid olerinskaya, vladimir yaglych"
Previously, I was doing a rough match on:
if actors_from_site_1[0] == actors_from_site_2[0]:
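But, as you can see in the above case, this isn't a good technique, because the first actor listed differs between the two pages. What would be a better technique to see if the actors from one film match the others?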
You could check the length of the set intersection of the two sets of actors. A non-empty intersection means at least one actor appears on both pages.
if len(set(actors_from_site_1).intersection(set(actors_from_site_2))):
Or something like:
if any(actor in actors_from_site_1 for actor in actors_from_site_2):
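To make the set-intersection idea concrete, here is a minimal sketch run against the two actor lists from the question. The helper names (normalize, shared_actors, likely_same_movie) and the min_shared threshold are hypothetical, not from any library. The normalization step is an added assumption: it guards against differences in case, spacing, and accents (for example "díaz" vs "diaz") that would otherwise defeat an exact comparison.

```python
import unicodedata


def normalize(name):
    """Lowercase, collapse whitespace, and strip accents from a name."""
    decomposed = unicodedata.normalize("NFKD", name)
    # Drop combining marks so that "díaz" and "diaz" compare equal.
    without_accents = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    return " ".join(without_accents.lower().split())


def shared_actors(actors_a, actors_b):
    """Return the set of normalized names that appear in both lists."""
    return {normalize(a) for a in actors_a} & {normalize(b) for b in actors_b}


def likely_same_movie(actors_a, actors_b, min_shared=1):
    """Hypothetical criterion: at least min_shared actors in common."""
    return len(shared_actors(actors_a, actors_b)) >= min_shared


# The two comma-separated strings quoted in the question.
actors_from_site_1 = (
    "mikhail galustyan, danny trejo, guillermo díaz, oleg taktarov, "
    "kym whitley, christopher robin miller, robert bear, "
    "vladimir yaglych, josh mclerran"
).split(",")
actors_from_site_2 = "ivan stebunov, ingrid olerinskaya, vladimir yaglych".split(",")

print(shared_actors(actors_from_site_1, actors_from_site_2))
print(likely_same_movie(actors_from_site_1, actors_from_site_2))
```

With these lists the only shared name is "vladimir yaglych", so a threshold of one shared actor accepts the pair. Raising min_shared makes the check stricter, at the cost of rejecting pages that list only a few cast members.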