MALLET is a Java-based package for statistical natural language processing, document classification, clustering, topic modeling, information extraction, and other machine learning applications to text.
There is a graphical user interface for MALLET that makes topic modeling easy to run without the command line. It lets you set basic options: the location of input and output files, the number of topics, the number of words shown per topic, the number of iterations, and a stopword list. You can also browse the results by topic (with the option to see which documents are associated with a given topic) or by document (with the option to see the proportion of each topic in the document). This is a user-friendly addition to the powerful MALLET topic modeling engine.
by Elijah Meeks, 18 February 2011
by Cameron Blevins, 1 April 2010
by Elijah Meeks, 19 February 2011
by Shawn Graham, 30 August 2011
This package allows users to train topic models in MALLET and load results directly into R.
Bamboo DiRT is a registry of digital research tools for scholarly use.