Therefore, we changed the traditional vector space model to one without stemming and without stopword elimination.

In this particular case, the lexical units considered as encoding elements are the words inside a noun phrase, excluding stopwords.

The given documents are pre-processed by removing stopwords.

We then tokenize the text, lower-casing it and normalizing characters to an ASCII representation, filter out stopwords, and weight the terms using TF-IDF.

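The pipeline above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the stopword list is an illustrative subset, the tokenizer is a simple regular expression, and the weighting uses raw term frequency with IDF = log(N/df); the actual system may differ on all of these points.

```python
import math
import re
import unicodedata

# Illustrative subset of a stopword list; a real system would use a fuller one.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "are"}

def tokenize(text):
    """Lower-case, normalize characters onto ASCII, split into words, drop stopwords."""
    text = unicodedata.normalize("NFKD", text.lower())
    text = text.encode("ascii", "ignore").decode("ascii")  # strip accents etc.
    return [t for t in re.findall(r"[a-z]+", text) if t not in STOPWORDS]

def tf_idf(docs):
    """Return one {term: weight} dict per document, using raw TF and IDF = log(N/df)."""
    tokenized = [tokenize(d) for d in docs]
    n = len(tokenized)
    df = {}  # document frequency of each term
    for toks in tokenized:
        for t in set(toks):
            df[t] = df.get(t, 0) + 1
    weights = []
    for toks in tokenized:
        tf = {}
        for t in toks:
            tf[t] = tf.get(t, 0) + 1
        weights.append({t: c * math.log(n / df[t]) for t, c in tf.items()})
    return weights

docs = ["The café serves coffee.", "Coffee and tea are served."]
w = tf_idf(docs)
```

Note that a term occurring in every document (here, "coffee") receives weight zero under this IDF variant; smoothed IDF formulations avoid this if it is undesirable.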