For personal/private [non-work related] projects, I tend to shy away from creating/working on open source projects. Typically open source projects tend to resemble the same work I do during the day, working and dealing with others. That’s no fun when you just want to build something you need. However, I am trying something new. I’m releasing one of the personal projects into the world of open source. It is the component that is used on this website for making recommendations for projects. [Example: see the bottom of the Financial Strategy Simulator page] A project page that contains more technical information about this project can be found under /a/Projects.

To find this source pull it [using git] from: https://github.com/monksy/PageRecommender

What is this project about? This project is designed to analyze Apache request logs, and attempt to piece together sessions, and then to create an Amazon/Newegg-like statistical recommendation. The desired output is XML representing a parent/child relationship of a page and the next connecting page.  The output comes from standard out. The component is designed to be used as a quiet utility.

What this project isn’t: This project isn’t a completely generic solution that’ll fit your site. It is designed within the context of my current website, and the format of the standard Apache log files. Want to have it look for pages that don’t fall under the /p/{Name} syntax? Change up the Apache log file format? Well this project won’t work for you without modification. Also, there is no such warranty provided by this code. It’s open source, it’s free as in speech but not as in beer.

What could be improved: I realize that this project isn’t perfect. I could have designed it to be slightly easier to read. It could be documented much, much better. However, this is an internal project. Want to improve it? Github should allow you to make such changes. [I’m rather new to Github, so don’t hold me to that statement] The XStream dependency could be removed. But for right now it works.

What do I recommend?

Grab the source! Add more tests, and send me input on how you think it could be improved.