Textflo - Analyse Book Text Example


This page gives an example of using the application to analyse a list of words created from the full text of a book. It shows how the book contents can be filtered and/or formatted to make them suitable for analysis. The book, as a text file, loaded in under 1 minute and was analysed in around 5 minutes. Note that word stemming parses each word and tries to combine words that share the same stem but have different suffixes, etc.



It is possible to load reasonably large documents into the application. One feature counts the most frequent words or word sequences. In this example, the text content of a book has been loaded and analysed in this way. The result could, for example, be useful for generating the book's index of terms. In the first figure, the main text content of the book has been loaded into the application. The DBS manual grid can then be used to remove certain sections, such as the contents or the references.
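As a rough illustration of what counting word sequences involves, here is a minimal Python sketch. It is not the application's own code: the function name count_sequences, the tokenising rule and the sample sentence are all assumptions made for this example.

```python
import re
from collections import Counter

def count_sequences(text, n=2):
    """Count every run of n consecutive words in text, ignoring case.

    Hypothetical helper for illustration; not part of the application.
    """
    # Assumed tokenising rule: a word is a run of letters or apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    # Slide a window of n words across the list and join each window into one key.
    grams = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return Counter(grams)

sample = "The web service calls another web service."
print(count_sequences(sample, n=2).most_common(1))
```

With n=1 the same function counts single words. For a whole book, the text would be read from the file first and passed in as one string.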





The text can then be analysed, with additional filtering if required. In this example, the linear counts algorithm is used, with the most popular words shown in the figure below. Common words from a default provided list (and, the, a, etc.) are removed first, and words are also stemmed (service, services, etc.). You could then use this as a guide for the terms to put into your book index, although the counts can also be deceptive: "artificial intelligence" and "autonomous" do not appear here, for example.
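The filtering steps described above (remove common words, stem, then count) can be sketched in Python as follows. This is a plain frequency count under stated assumptions, not the application's linear counts algorithm: STOP_WORDS is a made-up short list rather than the default provided list, and naive_stem is a crude plural-stripping stand-in for a real stemmer.

```python
import re
from collections import Counter

# Hypothetical stop-word list; the application's default list is not reproduced here.
STOP_WORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}

def naive_stem(word):
    """Strip a trailing plural 's' from longer words. A crude stand-in for a real stemmer."""
    if len(word) > 4 and word.endswith("s") and not word.endswith("ss"):
        return word[:-1]
    return word

def top_terms(text, limit=10):
    """Return the most frequent stemmed words, skipping stop words."""
    words = re.findall(r"[a-z]+", text.lower())
    # Drop common words first, then merge simple plural forms.
    kept = [naive_stem(w) for w in words if w not in STOP_WORDS]
    return Counter(kept).most_common(limit)

sample = "The service and the services of a web service"
print(top_terms(sample, limit=2))
```

Because only single words are counted here, a multi-word term is split into its parts and never appears as a candidate on its own, which is one way a list like this can mislead when choosing index terms.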