How does Roistr work?
Roistr uses some advanced techniques from the field of natural language processing to group similar items together.
We use methods such as latent semantic analysis, latent Dirichlet allocation, WordNet and others that we can't talk about here.
Most of these methods need background information (somewhat like a map of semantics) to use as a guide. Among other things, Roistr uses a modified copy of Wikipedia, which is a very large knowledge base by most standards. The copy is modified because the plain one produced poorer results.
When Roistr receives a document, it converts it into a vector: an n-dimensional representation of the document. These vectors are analogous to the semantic meaning of a document and can be manipulated and compared for similarity or dissimilarity. The degree of similarity (proximity) between two documents indicates how close they are in meaning.
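To make the idea of comparing document vectors concrete, here is a minimal sketch in Python. It is not Roistr's method: it assumes the simplest possible vector (raw word counts) and uses cosine similarity as the proximity measure. The function names `vectorize` and `cosine_similarity` and the sample documents are made up for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn a document into a sparse term-count vector (word -> count).

    Assumption: plain word counts stand in for the richer semantic
    vectors described above.
    """
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse vectors; 0.0 if either is empty."""
    dot = sum(count * b[term] for term, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical sample documents.
docs = {
    "cv": "Python developer with experience in natural language processing",
    "job": "We are hiring a developer with natural language processing experience",
    "recipe": "Whisk the eggs and fold in the flour and sugar",
}
vectors = {name: vectorize(text) for name, text in docs.items()}

sim_cv_job = cosine_similarity(vectors["cv"], vectors["job"])
sim_cv_recipe = cosine_similarity(vectors["cv"], vectors["recipe"])
print(f"cv vs job:    {sim_cv_job:.2f}")
print(f"cv vs recipe: {sim_cv_recipe:.2f}")
```

The CV scores closer to the job description than to the recipe because they share more words. Word counts only capture shared vocabulary, though. Methods such as latent semantic analysis instead project documents into a space of topics learned from a background corpus, so that two documents can be close even when they use different words for the same idea.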
Can it be used for anything else?
It can, and it has been applied to tasks such as matching CVs with job descriptions, automatically grouping documents found on an enterprise server, and matching product descriptions with people's social media output.
The crucial thing is that it works by trying to understand the meaning of a document and uses this to group documents with similar meanings. The beauty of it is that it produces results similar to those produced by humans.
Can it be adapted for other languages?
Yes, it can. Contact us for more information and a no-obligation demonstration.