This page documents an old project from my early years of schooling. I am posting it on the blog as part of an ongoing effort to modernize the site.

ProjectAnansi was a multi-project system for ingesting, organizing, visualizing, and recommending eBooks. It was a personal project, built to solve the problem of an unorganized eBook collection.


This system is a collection of programs that assist the user in organizing, automatically cataloging, and recommending eBooks.

  • Languages: C#/Mono [ARTMAP implementation], Java [interface and data processor]
  • Other software and frameworks: MySQL, JUNG graph visualizer, jconfig4net, PDFBox, CVS, SVN
  • Creation Date: ~2007
  • Platform Target: Any platform that can run the Mono and Java runtimes on a desktop environment.
  • Methodologies: Inverted indexes, ARTMAP, MVC, design patterns, knowledge management, knowledge bases, natural language processing, machine learning, search, hierarchical data, clustering, recommendation systems
  • Purpose: Personal Project
  • Alternative Software: As far as I know, this is the only software project of its kind. A few other programs keep a collection of eBooks and store a small amount of metadata per book. However, this is the only software that attempts to organize the books.

ProjectAnansi is a system of subprojects whose goal is to catalog, organize, and recommend related texts [PDFs and plain text]. It has three core subprojects, each named after a mythological god of knowledge or literature.

ProjectThoth is responsible for data management: it takes in a directory of books and writes all of their words and related information to a specified database. The biggest hurdle with this project is volume. Each book contains thousands of words, and a directory tends to contain the equivalent of nearly a thousand books. That is a lot of information to load into a database, down to the granular level of single words and 2-grams (pairs of adjacent words).
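As a rough illustration of the single-word and 2-gram extraction described above, here is a minimal Java sketch. It is not the original ProjectThoth code: the class name `NGramCounter`, the method names, and the tokenization rule (lowercase, split on anything that is not a letter or digit) are all assumptions made for this example, and it keeps the counts in memory instead of writing them to MySQL.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper, not part of the original project.
class NGramCounter {

    /** Lowercases the text and splits it on anything that is not a letter or digit. */
    public static List<String> tokenize(String text) {
        List<String> tokens = new ArrayList<>();
        for (String piece : text.toLowerCase(Locale.ROOT).split("[^\\p{L}\\p{Nd}]+")) {
            if (!piece.isEmpty()) { // split() can yield an empty leading piece
                tokens.add(piece);
            }
        }
        return tokens;
    }

    /** Counts every run of n consecutive tokens, joined by a single space. */
    public static Map<String, Integer> countNGrams(List<String> tokens, int n) {
        Map<String, Integer> counts = new LinkedHashMap<>();
        for (int i = 0; i + n <= tokens.size(); i++) {
            String gram = String.join(" ", tokens.subList(i, i + n));
            counts.merge(gram, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        List<String> tokens = tokenize("The spider spins the web; the web holds.");
        System.out.println(countNGrams(tokens, 1)); // single-word counts
        System.out.println(countNGrams(tokens, 2)); // 2-gram counts
    }
}
```

In a setup like the one described, each entry of these maps would presumably become a database row keyed by book and term, which is where the volume problem comes from: the number of distinct 2-grams grows much faster than the number of distinct words.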

ProjectSaraswati is responsible for taking the formatted data and extracting meaning from it. This project turns the data into information by adding context. ProjectSaraswati is the only application that uses C#/Mono. It performs its task by using the words as features and running ARTMAP on them to produce clusters of related content.
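To give a feel for how word features can be grouped into clusters, here is a small Java sketch of Fuzzy ART, the unsupervised building block that ARTMAP is composed of. It is a simplified stand-in, not the original C# implementation and not a full ARTMAP (which couples two ART modules through a map field). The class name `FuzzyArt`, the parameter values, and the three-word feature vectors are assumptions for illustration; features are assumed to be word weights already scaled to the range [0, 1].

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified Fuzzy ART clusterer; not the project's ARTMAP code.
class FuzzyArt {
    private final double vigilance;    // rho: how closely an input must match a cluster
    private final double choice;       // alpha: small positive tie-breaking constant
    private final double learningRate; // beta: 1.0 means "fast learning"
    private final List<double[]> weights = new ArrayList<>();

    FuzzyArt(double vigilance, double choice, double learningRate) {
        this.vigilance = vigilance;
        this.choice = choice;
        this.learningRate = learningRate;
    }

    public int categoryCount() {
        return weights.size();
    }

    /** Complement coding: [a, 1 - a], which keeps the input norm constant. */
    static double[] complementCode(double[] a) {
        double[] coded = new double[a.length * 2];
        for (int i = 0; i < a.length; i++) {
            coded[i] = a[i];
            coded[i + a.length] = 1.0 - a[i];
        }
        return coded;
    }

    private static double norm(double[] v) {
        double sum = 0.0;
        for (double x : v) sum += x;
        return sum;
    }

    /** Norm of the element-wise minimum (the fuzzy AND) of two vectors. */
    private static double overlap(double[] a, double[] b) {
        double sum = 0.0;
        for (int i = 0; i < a.length; i++) sum += Math.min(a[i], b[i]);
        return sum;
    }

    /** Presents one feature vector and returns the index of its cluster. */
    public int learn(double[] features) {
        double[] input = complementCode(features);
        double inputNorm = norm(input);
        boolean[] rejected = new boolean[weights.size()];
        while (true) {
            // Pick the not-yet-rejected category with the highest choice value.
            int best = -1;
            double bestScore = -1.0;
            for (int j = 0; j < weights.size(); j++) {
                if (rejected[j]) continue;
                double[] w = weights.get(j);
                double score = overlap(input, w) / (choice + norm(w));
                if (score > bestScore) {
                    best = j;
                    bestScore = score;
                }
            }
            if (best < 0) { // nothing passed the vigilance test: start a new cluster
                weights.add(input.clone());
                return weights.size() - 1;
            }
            double[] w = weights.get(best);
            if (overlap(input, w) / inputNorm >= vigilance) {
                for (int i = 0; i < w.length; i++) {
                    w[i] = learningRate * Math.min(input[i], w[i])
                            + (1.0 - learningRate) * w[i];
                }
                return best;
            }
            rejected[best] = true; // reset and try the next-best category
        }
    }

    public static void main(String[] args) {
        FuzzyArt art = new FuzzyArt(0.7, 0.001, 1.0);
        // Made-up weights for a three-word vocabulary, one vector per book.
        double[][] books = {
            {1.0, 0.9, 0.0}, {0.9, 1.0, 0.1}, {0.0, 0.1, 1.0}, {0.1, 0.0, 0.9}
        };
        for (double[] book : books) {
            System.out.println("cluster " + art.learn(book));
        }
    }
}
```

With these made-up vectors, the first two books land in one cluster and the last two in another. Raising the vigilance value makes the clusters tighter and more numerous, which is the main knob for deciding how closely related two books must be before they are grouped together.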

ProjectAnansi is responsible for interfacing with the user, and it is the only application that the user interacts with directly. The application displays the graph of books and summaries of the clusters. It also lets the user open the desired eBook in its associated application and later collect thoughts on it with the notes feature.

The three projects above make up the core of the system. A final project, ProjectImhotep, is meant to provide an installation application that does all the setup. This would allow a non-technical user to install the application and its dependencies (MySQL, etc.) and start using it.