Stemming algorithm that produces real words


I need to take a piece of text and extract a list of "tags" from it. Most of this is quite straightforward. However, I need some help stemming the resulting word list to avoid duplicates. Example: community / communities



I've used an implementation of the Porter stemmer algorithm (I'm writing in PHP, by the way).





This works, up to a point, but it doesn't return "real" words. The example above is stemmed to "commun".



I've also tried "Snowball" (suggested in another Stack Overflow thread).





For my example (community / communities), Snowball stems both to "communiti".



Question



Are there any other stemming algorithms that will do this? Has anyone else solved this problem?



My current thinking is that I could use a stemming algorithm only to detect duplicates, and then pick the shortest word I encounter in each group as the actual word to display.
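To make that idea concrete, here is a rough sketch in Python (not PHP, just to keep it short; the same logic maps onto a PHP associative array). Note that `crude_stem` is a hypothetical placeholder I made up for illustration, not Porter or Snowball. In practice you would pass in whichever real stemmer you use, since the stem only serves as a grouping key and is never displayed.

```python
def crude_stem(word):
    """Hypothetical stand-in for a real stemmer (Porter, Snowball, ...).

    It only strips a few English suffixes so that related forms share a key.
    """
    w = word.lower()
    for suffix in ("ies", "es", "s", "y"):
        # Keep at least 3 characters and leave words like "glass" alone.
        if w.endswith(suffix) and not w.endswith("ss") and len(w) - len(suffix) >= 3:
            return w[: -len(suffix)]
    return w


def dedupe_tags(words, stem=crude_stem):
    """Group words by stem and keep the shortest original word per group."""
    shortest = {}
    for word in words:
        key = stem(word)  # the stem is used only as a grouping key
        best = shortest.get(key)
        if best is None or len(word) < len(best):
            shortest[key] = word
    # Dicts preserve insertion order, so groups come out in first-seen order.
    return list(shortest.values())


tags = dedupe_tags(["communities", "community", "tags", "tag", "stemming"])
print(tags)
```

With this placeholder stemmer, "communities" and "community" fall into one group and "community" is kept because it is shorter. The shortest form is not guaranteed to be the dictionary form of a word, so this is a heuristic rather than a proper solution.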


