Stemming algorithm that produces real words


I need to take a paragraph of text and extract a list of "tags" from it. Most of this is quite straightforward. However, I need some help stemming the resulting word list to avoid duplicates. Example: community / communities



I've used an implementation of the Porter stemmer algorithm (I'm writing in PHP, by the way):





This works, up to a point, but it doesn't return "real" words. The example above is stemmed to "commun".



I've also tried "Snowball" (suggested in another Stack Overflow thread).





For my example (community / communities), Snowball stems to "communiti".



Question



Are there any other stemming algorithms that will do this? Has anyone else solved this problem?



My current thinking is that I could use a stemming algorithm only to detect duplicates, and then pick the shortest word I encounter in each group as the actual word to display.
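
To make that idea concrete, here is a minimal sketch, written in Python rather than PHP purely for brevity. The names `toy_stem` and `dedupe_tags` are hypothetical, and `toy_stem` is only a crude stand-in for a real Porter or Snowball implementation; the point is the grouping step, which treats the stem as a key and never displays it.

```python
from collections import defaultdict


def toy_stem(word):
    """Hypothetical stand-in for a real stemmer (Porter, Snowball, ...).

    It strips a handful of suffixes, which is enough to demonstrate the
    grouping idea but is not a real stemming algorithm.
    """
    word = word.lower()
    for suffix in ("ities", "ity", "ies", "es", "s", "y"):
        # Keep at least three characters so short words are left alone.
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def dedupe_tags(words, stem=toy_stem):
    """Group words by stem and keep the shortest surface form per group."""
    groups = defaultdict(list)
    for word in words:
        # The stem is only used as a grouping key; it is never shown.
        groups[stem(word)].append(word.lower())
    # Shortest form wins; ties are broken alphabetically.
    return sorted(min(forms, key=lambda w: (len(w), w)) for forms in groups.values())


tags = dedupe_tags(["Community", "communities", "tags", "tag", "stemming"])
print(tags)  # ['community', 'stemming', 'tag']
```

Because the stemmer is passed in as a function, a real implementation can be swapped in without changing the grouping logic. The shortest form is not always the best display word, so this is a heuristic rather than a guarantee of a dictionary word.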


