Piwik: Breaking Away from Google Analytics

Many of us in the library community who have a responsibility to assess the usage of our libraries’ websites have become very familiar with the popular Google Analytics. Google Analytics is free and robust, but the data it collects belongs to Google and is housed on U.S. servers, where it may be subject to U.S. legislation.

While many may see this as inconsequential (hey, Canada.ca uses Google Analytics, why can’t we?), those of us in the library community who wish to uphold our profession’s longstanding tradition of protecting user privacy may wish to seek alternatives. The ALA LITA Patron Privacy Technologies Interest Group is one example of the library community’s increasing awareness of, and concern about, patron privacy.

One viable alternative to Google Analytics is Piwik. Piwik is an open-source web analytics platform that allows for 100% data ownership and privacy protection.

Visitors, Actions, Referrers and Goals

So how does it stack up to GA? At first glance, Piwik’s Dashboard looks very similar to that of Google Analytics, with reports for Visitors, Time on Pages, Software and Devices. Rather than emphasizing bounce rates (which are rarely useful in the library context anyway), Piwik uses an Engagement metric, which includes Visits per Duration and Visits per Number of Pages.

The Actions section includes the popular Page Visits, Entry Pages, Exit Pages, Page Titles and Site Search reports. It also highlights Downloads and Outlinks. In a library context, this is a very valuable feature that Google Analytics does not provide. (Within our library we’ve purchased Crazy Egg to monitor clicks to external websites.)

But is it as robust? You bet it is, although Google Analytics allows for easier cross-tabulation (segmentation and filters, in Google Analytics language) without any programming skills. Cross-tabulation is possible in Piwik, but viewing the data this way requires more advanced skills, since it relies on Piwik’s Custom Variables.

One of the reasons for Google Analytics’ success has been its ease of installation. Because the data is stored on Google’s servers, installation is as easy as copying and pasting a small piece of JavaScript code into your web pages.

Piwik, on the other hand, requires a bit more skill to install, but not much. Piwik has modeled its installation on the famous 5-minute WordPress installation, so if you’ve ever installed WordPress on a server, you’ll have no trouble installing Piwik.

Like Google Analytics, Piwik allows for tracking multiple domains, including domains hosted on separate servers, and for unlimited users and accounts.

[Figure: Piwik Dashboard]

Data Ownership

Because you own your data, Piwik gives you full control of data analysis. While Google Analytics provides an easy GUI to create filters and segment the data, Piwik allows you to manipulate the data without any limitations. In order to set up Custom Variables you’ll need some JavaScript knowledge, or access to someone who has it. But once these are set up you can easily check the reports in your dashboard.
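
To give a sense of the JavaScript involved, here is a minimal sketch in TypeScript. Piwik’s JavaScript tracker reads commands from a global array called `_paq`, and `setCustomVariable` takes a slot number, a name, a value and a scope ("visit" or "page"). The helper function `customVariableCommands` and the "Location" / "OnCampus" values are hypothetical, chosen only for illustration. Check the commands against the tracking code your own Piwik installation generates.

```typescript
// A command for Piwik's tracker: a method name followed by its arguments.
type PaqCommand = [string, ...(string | number)[]];

type Scope = "visit" | "page";

// Hypothetical helper (not part of Piwik): builds the commands that attach
// one custom variable and then record the page view. The custom variable
// must be queued before trackPageView so it is sent with that request.
function customVariableCommands(
  slot: number,
  label: string,
  value: string,
  scope: Scope,
): PaqCommand[] {
  if (!Number.isInteger(slot) || slot < 1) {
    throw new RangeError("slot must be a positive integer");
  }
  return [
    ["setCustomVariable", slot, label, value, scope],
    ["trackPageView"],
  ];
}

// In a real page, _paq is a global array created before piwik.js loads.
// Here it is a local stand-in so the sketch runs on its own.
const _paq: PaqCommand[] = [];
_paq.push(...customVariableCommands(1, "Location", "OnCampus", "visit"));
```

A library might use a variable like this to compare on-campus and off-campus visits in the Custom Variables report, without storing anything that identifies an individual patron.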

Data Privacy

Piwik goes to great lengths to protect visitor privacy. After installing the software, administrators can increase privacy by changing settings such as:

  • Automatically anonymizing visitors’ IPs
  • Inserting an analytics opt-out feature on your site using an iframe
  • Automatically deleting old visitor logs after a set period of time
  • Automatically deleting archived data (see more on auto-archiving below) after a set period of time
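
To illustrate the idea behind the first setting, the TypeScript sketch below zeroes the trailing bytes of an IPv4 address so that the stored value no longer points to a single machine. It is an illustration of the concept only, not Piwik’s own code. The function name `anonymizeIpv4` is hypothetical, and the sketch assumes that anonymizing means masking a chosen number of bytes from the end of the address.

```typescript
// Hypothetical helper: replace the last `maskedOctets` bytes of a dotted
// IPv4 address with zeros. Not Piwik's implementation, just the concept.
function anonymizeIpv4(ip: string, maskedOctets: number): string {
  const octets = ip.split(".");
  const valid =
    octets.length === 4 &&
    octets.every((o) => /^\d{1,3}$/.test(o) && Number(o) <= 255);
  if (!valid) {
    throw new Error(`not an IPv4 address: ${ip}`);
  }
  if (!Number.isInteger(maskedOctets) || maskedOctets < 0 || maskedOctets > 4) {
    throw new RangeError("maskedOctets must be an integer from 0 to 4");
  }
  // Keep the leading bytes, zero everything from position 4 - maskedOctets on.
  return octets
    .map((o, i) => (i >= 4 - maskedOctets ? "0" : o))
    .join(".");
}

// Masking two bytes keeps only the network portion of the address.
console.log(anonymizeIpv4("192.0.2.146", 2)); // prints "192.0.0.0"
```

The more bytes you mask, the stronger the privacy protection, but the less precise any location-based reporting becomes.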

Auto-archiving

Administrators with command-line skills can auto-archive their data with cron jobs on Linux/Unix, with Windows Task Scheduler, or through cPanel.
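
As a rough guide to what the cron job looks like, the TypeScript sketch below assembles a system crontab entry that runs Piwik’s `core:archive` console command once an hour. The helper `archiveCrontabLine` is hypothetical, and every path, user name and URL shown is a placeholder for your own server’s values. The command form assumes a Piwik 2.x or later release that ships the `console` script, so confirm it against the documentation for your version.

```typescript
// Placeholder settings for one server. None of these values come from Piwik.
interface ArchiveCronOptions {
  minute: number;    // minute of each hour at which to run (0-59)
  user: string;      // system user that runs the job
  phpPath: string;   // path to the PHP command-line binary
  piwikPath: string; // directory where Piwik is installed
  piwikUrl: string;  // public URL of the Piwik installation
  logPath: string;   // file that receives the command's output
}

// Hypothetical helper: builds one line in system crontab format, which has
// five time fields, then the user, then the command.
function archiveCrontabLine(o: ArchiveCronOptions): string {
  if (!Number.isInteger(o.minute) || o.minute < 0 || o.minute > 59) {
    throw new RangeError("minute must be an integer from 0 to 59");
  }
  return [
    `${o.minute} * * * *`,
    o.user,
    o.phpPath,
    `${o.piwikPath}/console`,
    "core:archive",
    `--url=${o.piwikUrl}`,
    ">",
    o.logPath,
  ].join(" ");
}

console.log(
  archiveCrontabLine({
    minute: 5,
    user: "www-data",
    phpPath: "/usr/bin/php",
    piwikPath: "/path/to/piwik",
    piwikUrl: "http://example.org/piwik/",
    logPath: "/home/example/piwik-archive.log",
  }),
);
```

Running the job hourly is only one reasonable choice. A busy site may prefer a different schedule.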

Suitability for Libraries

A common criticism of Google Analytics in libraries is that it is designed for e-commerce websites. Bounce Rates and Conversion Goals are just two examples of Google Analytics features that are not particularly useful for measuring the success of library web pages and content.

All in all, Piwik is an excellent alternative to Google Analytics for libraries, particularly in Canada, where the concern is not only visitor privacy but also the implications of data being stored on U.S. servers. For those who have the means to install software on their web servers, Piwik is highly recommended. For library purposes, Piwik should meet the needs of website assessment while also addressing privacy and data storage concerns.

Susanna Galbraith is the Virtual Services Librarian at McMaster University’s Health Sciences Library. She received her MLIS from the University of Western Ontario and prior to that a diploma in Web Design and Programming, and a BA in Anthropology from Concordia University. Her professional interests include web usability, user experience and marketing for libraries. You can find Susanna at @su_anna.