An ideal machine learning implementation would also need context: the original post itself, the parent comment(s), other comments in the thread, and so on.
It can be more difficult than one might think. For example, now that we are talking about spam, the word "Viagra" shouldn't get my comment blocked, even though my parent post doesn't mention the word and nobody else in the thread has mentioned it.
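To make the idea of "context" a bit more concrete, here is a rough sketch of how a filter might turn a comment plus its thread into features for a classifier. Everything in it is an assumption for illustration: the `SUSPICIOUS_TERMS` watch list, the `context_features` function, and the three feature names are all made up, and a real system would learn its signals from labeled data rather than a hand-written list.

```python
import re

# Hypothetical watch list; a real system would learn such signals from data.
SUSPICIOUS_TERMS = {"viagra"}


def tokens(text):
    """Return the set of lowercase word tokens in a piece of text."""
    return set(re.findall(r"[a-z0-9']+", text.lower()))


def context_features(comment, post, parent="", siblings=()):
    """Build features for one comment from the comment and its thread context."""
    comment_tokens = tokens(comment)
    flagged = comment_tokens & SUSPICIOUS_TERMS

    # Everything said around the comment: the post, the parent, other comments.
    thread_tokens = tokens(post) | tokens(parent)
    for sibling in siblings:
        thread_tokens |= tokens(sibling)

    return {
        # How many watch-list terms the comment itself uses.
        "suspicious_terms": len(flagged),
        # How many of those terms the surrounding thread also uses.
        "suspicious_in_context": len(flagged & thread_tokens),
        # Share of the comment's vocabulary that also appears in the thread,
        # a crude proxy for whether the comment is on topic.
        "thread_overlap": len(comment_tokens & thread_tokens)
        / max(len(comment_tokens), 1),
    }


on_topic = context_features(
    "Spam filters should not block a comment for saying Viagra",
    post="How do you filter spam comments?",
)
off_topic = context_features("Buy viagra now", post="My holiday photos")
```

Note that `suspicious_in_context` is 0 for both comments, since nobody else in either thread used the word. That is exactly the hard case described above: a literal "did anyone else say it" check is not enough, and the only thing separating the two here is the much weaker `thread_overlap` signal, which a classifier would have to weigh along with many others.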