Open source mail archiving software compared
Locked Away
By law, enterprises are required to retain email for a certain period of time. The archiving solutions discussed here, Piler, Benno MailArchiv, and MailArchiva, promise both legally compliant storage and added benefits for corporations.
Many countries provide a comprehensive system of legal regulations for governing the delivery and storage of email. For instance, some mail might contain commercial correspondence or tax-relevant information that forms an official legal record and must stay around for a predetermined time for auditing purposes.
The auditability requirement often dictates that the archived email be immutable and thus protected against access by staff – even the omnipotent administrator. The laws governing email pose several technical problems: Many messages are stashed away in local mailboxes of mail clients where they are stored in an unstructured way so that pertinent messages cannot easily be distinguished from other mail. Targeted searching thus involves a large amount of manual work. The requirements for digital auditing, and possibly also internal compliance policies, are thus not met. Additionally, the daily flood of spam that swamps mailboxes makes it very difficult to sort the digital chaff from genuine content and hugely bloats the data volumes.
More Than Compliance
State-of-the-art archiving solutions address these tasks and problems with a varying degree of internal overhead. They take care of automated, permanent email storage, make the content easy and quick to find through centralized search solutions, and ensure auditable storage of the data. Ideally, these solutions will integrate seamlessly with the enterprise network, collaborate nicely with all popular mail servers, offer web-based access, provide granular authorization, and be able to store any kind of data transparently on any popular kind of storage medium, or even on specialized archiving systems. Ideally, too, the operator will have a choice of infrastructure between on-premises or cloud storage.
A positive side effect of such a solution is the reduced storage space requirement, because mail can be compressed and deduplicated. Additionally, it will be beneficial for business continuity, because mail remains in the archive even if mail servers fail or mail data is lost. Full-text searching against mail content and in mail attachments also makes it possible to find mail content years after archiving a message. See the "Legal Framework Example" box for more information.
Powerful Open Source Applications
In this article, I examine three standalone mail archiving products from the open source camp: Piler, Benno, and MailArchiva; Table 1 compares the features of these archiving tools.
Table 1: Overview of the Test Candidates

| Features | Benno MailArchiv | MailArchiva | Piler |
|---|---|---|---|
| Test version | 2.1.0 | 4 | 1.1.0 |
| Variant | Community Edition, Commercial Version | Open Source Edition, Enterprise Edition | Open Source Edition |
| SaaS model | Via partners, Hosting Edition | Cloud Edition | No, but multitenant capable |
| Operating systems | Debian, Ubuntu, SLES, RHEL, UCS | Windows, Linux, Solaris, BSD, OS X | Linux, Solaris |
| License | GPL¹ | GPLv2 | GPL |
| Mail server | Postfix, Exim, Sendmail, Qmail | Yes | Yes, all SMTP |
| Microsoft Exchange | 2003/2007/2010 | 5.5/2000/2003/2007/2010/2013 | 2003/2007/2010/2013 |
| Google Apps | No | Yes² | Yes |
| Others | Zarafa, Open-Xchange | Lotus Notes, Kerio, CommuniGate Pro, Scalix | Lotus Notes, Zimbra, Office 365 |
| Archiving | | | |
| Mail standards | POP3, IMAP, SMTP, Maildir, Milter | POP3, IMAP, SMTP, Maildir, Milter | POP3, IMAP, SMTP, Maildir, Milter |
| Archiving rules | No | Yes | Yes |
| Retention rules | No | Yes² | Yes |
| Encryption | No | AES-256 | Blowfish |
| Demonstrable immutability | Checksums and log | Signature,² log signature,² and log | Signature and log |
| Compression | Yes, bzip | Yes, zip | Yes, Zlib |
| Import | POP3, IMAP, Maildir | Maildir, PST, EML, MSG, Exchange, Google, Office 365 | EML, Mailbox, PST |
| Export | EML | EML, PDF² | EML |
| Clustering search | No | Yes² | No |
| Multitenancy | Hosting Edition | Yes² | No |
| Deduplication | Yes, email and attachments | Yes, email and attachments² | Yes, email and attachments |
| CLI | Yes | Yes² | Yes |
| Client/Search | | | |
| Web client | Yes, Ajax | Yes, Ajax | Yes, responsive |
| Full-text search | Yes | Yes | Yes |
| Multilingual search | Yes | Yes | Yes |
| Forwarding | Yes | Yes | Yes |
| Search in attachments | Word, PPT, Excel, PDF, RTF, OpenOffice, zip, gzip, bzip2, tar, cpio, ar, JPEG metadata, Flash, mp3 | Word, PPT, Excel, PDF, RTF, ZIP, tar, gz, OpenOffice | Word, PPT, Excel, PDF, RTF, ZIP, OpenOffice |
| Permissions | Yes | Yes² | Yes |
| Auditing | Yes | Yes | Yes |
| Integration/Adaptation | | | |
| Authentication Web GUI | LDAP, MS AD, Univention Corporate Server (UCS), Novell eDirectory | LDAP, MS AD, NTLM, Google, iMail | LDAP, MS AD, Google, NTLM |
| Storage | Filesystem | Filesystem | Filesystem |
| Localization | German | German, English, Portuguese, Czech, Chinese, Greek, French, Dutch, Russian, Japanese, Korean, Thai | German, English, French, Spanish, Hungarian, Portuguese, Russian |
| APIs | REST, XML, Web service API with JSON support | Web services | No |
| Antivirus scanner | No | Yes: ClamAV | Yes: ClamAV |
| Backup | No | Yes¹ | Yes |
| Themes/skins | No | Yes¹ | Yes |
| Price | | | |
| Licenses | EUR80 per year incl. five mailboxes (Small Business Edition); EUR12.50 per mailbox a year for 20 mailboxes (Standard Edition) | Free up to 20 mailboxes; EUR23 per mailbox one-off, at least 25 mailboxes must be licensed. | Free |
| Support | Software maintenance in first year, free, can be purchased separately for additional years | 20 percent of license costs per year | Not available |

¹ Community Edition only. ² Commercial versions only.
All of these products have a fundamentally similar approach: Email is either actively transmitted to the archive (using SMTP) or passively polled by the archive system, that is, retrieved from the mail or groupware server – typically using POP3 or IMAP calls to a journaling mailbox. As you can see in Figure 1, the messages are permanently stored either on the archive filesystem or in the database, including attachments.
All systems can be managed using a web client and support audits. The tools come with rights management features, including optional directory integration. The systems listed here all claim to be audit-proof and legally compliant. Of course, any technical solution must be accompanied by appropriate organizational policies to guarantee it as a compliant and comprehensive solution. Also, different jurisdictions have different rules for mail archiving. You should familiarize yourself with laws for your own country: Don't depend on the software to know your legal requirements.
Piler
Piler [1] is completely open source software from Hungary; its feature scope has grown immensely in the past two years so that it can now be regarded as a complete solution for mail archiving.
Email can be received from SMTP servers from a variety of vendors, including Microsoft Exchange, and imported from different formats. The data is encrypted using the Blowfish algorithm and stored on the filesystem as compressed files. The matching metadata is stored in a MySQL database. Deduplication is applied to both messages and attachments. Searching is handled by the Sphinx search engine.
The software takes legal requirements into consideration for the most part: Auditing options are in place, as is logging throughout. When saved, email is digitally signed to be able to check for manipulation or demonstrate a lack of it. Piler can connect to a large number of mail servers, including Lotus Notes, Zimbra, Google Apps, and Office 365. Authentication can be handled by LDAP or Active Directory or controlled by an IMAP server.
Archiving Rules
Administrators can handle the configuration in a web GUI, which is optimized for mobile devices. Additionally, CLI commands are available for automation purposes. By default, Piler transfers all mail from the data stream to the archive; administrators can define flexible rules based on regular expressions to filter messages with specific features and prevent them from ending up in the archive. Retention rules let the operator define how long messages are kept in the archive before they are automatically deleted.
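To illustrate the idea behind a regex-based archiving rule, the following shell sketch checks whether the header block of a message matches an extended regular expression. It does not use Piler's rule engine or rule syntax; the function name matches_exclusion_rule is hypothetical, and the sketch assumes messages with LF line endings.

```shell
#!/bin/sh
# Hypothetical sketch of a regex-based exclusion rule; not Piler's rule engine.
# Assumes the message file uses LF line endings.

# matches_exclusion_rule MESSAGE_FILE REGEX
# Returns 0 if any header line of the message matches REGEX
# (case-insensitive, POSIX extended syntax), 1 otherwise.
matches_exclusion_rule() {
    msg=$1
    pattern=$2
    # The header block ends at the first empty line; sed stops reading there,
    # so lines in the message body are never tested.
    sed '/^$/q' "$msg" | grep -Eiq -- "$pattern"
}
```

A call such as matches_exclusion_rule mail.eml '^subject:.*(newsletter|out of office)' would then tell a wrapper script whether to keep the message out of the archive.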
Access to Email
Authorizations for searching and accessing mail via the web GUI can be assigned at the user and group level (Figure 2). Regular users only get to see their own messages; if desired, the archive can be integrated directly with Outlook.
Auditors have access to all mail. Using a separate window, an auditor can access messages in a targeted way and retrieve a log for each email message, providing information on any operations performed on the message in question (e.g., whether the message has been accessed, searched for, or downloaded) (Figure 3). The search is fast and, in addition to an advanced search feature based on a form, offers the option of using search expressions with a highly detailed syntax:
size:>.2M, subject: viagra OR cialis, \
body: order < now, from: my@email.address
This is an example of a complex Piler search that filters out Viagra spam with a message size of more than 200KB and other features.
Installation
If you do not want to download and use the prebuilt VMware appliance [2], you will need a little patience installing Piler because the program, which is written in C, does not provide any installation packages. On a Linux or Solaris host, you first need to set up the required basic packages: OpenSSL, MySQL 5.1+, Sphinx Search 2.1+, PHP 5.3.x+, a web server with URL rewriting support (Apache, Lighttpd, Nginx), TRE Regex Library, Libzip, and Iconv. Then, you can download the source code [3] and build it as follows:
tar zxvf piler-x.y.z.tar.gz
cd piler-x.y.z
./configure --localstatedir=/var --with-database=mysql \
  --enable-starttls --enable-tcpwrappers
make
su -c 'make install'
After doing so, set up a user named piler and run the postinstallation routine by typing make postinstall; among other things, this will create the databases, generate cron jobs, and create a web directory. Finally, start the Piler daemon and the Sphinx indexer. Initial login via the web GUI uses the admin@local account and the pilerrocks password.
Once you get there, you can start setting up Piler for production. This involves creating users and groups, defining the desired archiving rules, and configuring the required SMTP server so that it passes the incoming mail data stream to Piler. If you have a Postfix mail server, you can do this with the following entry in main.cf:
always_bcc = archive@piler.my.domain
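Instead of editing main.cf by hand, the parameter can also be set with Postfix's own postconf tool. The sketch below wraps the two commands in helper functions; the function names are hypothetical, and the commands are assumed to run as root on the Postfix host. Note that always_bcc sends a copy of every message Postfix receives to the given address.

```shell
#!/bin/sh
# Sketch: manage always_bcc through postconf instead of editing main.cf by hand.
# The function names are hypothetical; run as root on the Postfix host.

# set_archive_bcc ADDRESS
# Writes the parameter to main.cf and reloads Postfix.
set_archive_bcc() {
    addr=$1
    # postconf -e rewrites (or adds) the parameter in main.cf in place.
    postconf -e "always_bcc = $addr" || return 1
    # Reload so the running daemons pick up the new value.
    postfix reload
}

# show_archive_bcc
# Prints the value Postfix currently uses (empty if unset).
show_archive_bcc() {
    postconf -h always_bcc
}
```

Calling set_archive_bcc archive@piler.my.domain followed by show_archive_bcc gives a quick check that the setting is active before you rely on the archive.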
Conclusions
Piler leaves users with an impression that the programmers have done their homework; the project is well maintained and comes with competitive documentation (Table 2). The integration options support operation in many environments. If the convoluted setup does not bother you, and you do not need better support, Piler will give you a comprehensive and lean system that meets most of the central requirements for mail archiving.
Table 2: Piler

| Manufacturer | Piler (http://www.mailpiler.org) |
|---|---|
| Price | Free |
| Technical data | |
| Verdict (max. 10 points) | |
| Installation overhead | 2 |
| Feature scope | 8 |
| User-friendliness | 7 |
| Integration options | 7 |
| Documentation | 5 |
| Overall rating | 5.8 |
Before using Piler in heavy load situations with many users, it makes sense to perform a proof of concept, including appropriate load and stability tests.
Benno MailArchiv
Following the general trend, Benno is available as the free Community Edition Open Benno MailArchiv [4] and as the commercially licensed Benno MailArchiv [5]. The two versions are fortunately compatible in terms of data. Benno was created in Germany and is primarily intended for a German audience, but it provides an optional English user interface. The community edition does not advertise official support for legally compliant archiving, although the commercial version does support the GDPdU German guidelines. Vendor support, software maintenance, and accompanying services are only available for the commercial variant.
Complete Package for Archiving
The vendor, LWsystems, presents Benno as a complete package that promotes open standards. Popular standards such as SMTP, POP3, and IMAP are supported for collecting email, so integration with any well-known mail server is possible. Existing or legacy mail collections can be imported directly using the Maildir format, for example. Benno organizes mail data in containers directly on the filesystem. The stored files are safeguarded by checksums and interlinked. Administrators can specify the breakdown of archive containers by year, domain, or other criteria. Data encryption is not intended; however, the vendor points to the option of using an encrypted filesystem.
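The general idea of safeguarding stored files with interlinked checksums can be sketched as a hash chain: Each log entry hashes the message together with the previous entry's digest, so changing any earlier file invalidates every later entry. The sketch below is not Benno's on-disk format; the chain.log file name, the seed value, and the function names are assumptions, and it relies on sha256sum from GNU coreutils.

```shell
#!/bin/sh
# Hypothetical sketch of interlinked checksums (a hash chain).
# Not Benno's storage format; file names and the seed are assumptions.

# chain_append ARCHIVE_DIR MESSAGE_FILE
# Appends "<chained digest>  <file name>" to ARCHIVE_DIR/chain.log.
# MESSAGE_FILE is expected to live inside ARCHIVE_DIR.
chain_append() {
    dir=$1
    msg=$2
    log="$dir/chain.log"
    # Use the digest of the previous entry, or a fixed seed for the first one.
    if [ -s "$log" ]; then
        prev=$(tail -n 1 "$log" | cut -d' ' -f1)
    else
        prev=0000
    fi
    # Hash the previous digest together with the message content.
    digest=$( { printf '%s\n' "$prev"; cat "$msg"; } | sha256sum | cut -d' ' -f1)
    printf '%s  %s\n' "$digest" "$(basename "$msg")" >> "$log"
}

# chain_verify ARCHIVE_DIR
# Recomputes the chain; returns non-zero at the first mismatch.
chain_verify() {
    dir=$1
    prev=0000
    while read -r digest name; do
        expected=$( { printf '%s\n' "$prev"; cat "$dir/$name"; } | sha256sum | cut -d' ' -f1)
        [ "$expected" = "$digest" ] || return 1
        prev=$digest
    done < "$dir/chain.log"
    return 0
}
```

An auditor could run chain_verify against a copy of the archive directory; a non-zero exit status indicates that a file or a log entry no longer matches the recorded chain.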
Core Feature
Searching is the application's core feature. Benno creates a full-text index of messages and attachments. The program relies on Lucene for indexing and searching and on Tika for extracting text, so that attachments in various formats – from PDF, MS Office, OpenOffice/LibreOffice to ZIP archives – can be searched quickly. Plugins also provide an option for provisioning content. Benno shows users and auditors a neat web interface that supports convenient email searches, including the option to download or forward (Figure 4).
User authorization management for the web GUI relies on a local (integrated) database or, alternatively, on Microsoft Active Directory via a matching connector or LDAP server.
Installation from Distribution Packages
Installing Benno is very easy, given that the system does not pose any major technical requirements: A Java 6 runtime (JDK 6) must be in place for the archiving back end. To run the Benno MailArchiv front ends, you need PHP5, the Smarty template engine, and an Apache2 web server.
You can download the required packages from the Benno repository. For Ubuntu, Debian, or UCS, just add the package source and GPG key in your package manager and run the following commands:
apt-get update
apt-get install benno-lib benno-core \
  benno-archive benno-rest-lib benno-rest
apt-get install apache2 php5 php-pear \
  php-db smarty
apt-get install benno-web
Before launching the Benno services, you first need to add a shared secret to the /etc/benno/benno.xml and /etc/benno-web/benno.conf files. This is used to safeguard server communication between Benno Core and the REST API.
Then, copy the license file to /etc/benno/benno.lic and restart the Benno REST service by typing /etc/init.d/benno-rest restart. If you want to set up the free open source edition, you need to create an empty benno.lic file and set the USERPERMISSONS = DISABLED parameter in the /etc/benno-web/benno.conf file. Access to the web interface is via the URL http://bennoserver/benno using the admin/secret account/password combination. You can then create users with the benno-useradmin command-line tool, if you are not using a centralized directory service.
Conclusions
Benno is a complete system based on open software in combination with optional vendor support (Table 3). Additionally, a hosting edition is available that targets service providers and offers a flexible billing model. Smart administrators will not want to do without the management GUI. The lack of encryption makes it more difficult to comply with requirements, however.
Table 3: Benno MailArchiv

| Manufacturer | LWsystems GmbH & Co. KG (http://www.benno-mailarchiv.de) |
|---|---|
| Price | Free as Open Benno; also available as a Small Business Edition (SBE) with up to 20 mailboxes, as a Standard Edition (SE) for 20 mailboxes, and as a Hosting Edition (HE). SBE with five mailboxes for EUR80 per year. SE for a price of EUR12.50 per mailbox and year. Volume discounts are available. |
| Technical data | http://www.benno-mailarchiv.de/produkt/uebersicht/standard_edition.html |
| Evaluation (max. 10 points) | |
| Installation overhead | 7 |
| Feature scope | 7 |
| User friendliness | 5 |
| Integration options | 8 |
| Documentation | 6 |
| Overall score | 6.6 |
To make up for this, Benno offers interfaces for provisioning, user management, and web services that give administrators the ability to integrate the application seamlessly. The community edition may fit the bill for many uses; however, it does not meet all the legal requirements.
MailArchiva Enterprise Edition v4
MailArchiva [6] is a comprehensive archiving system specially designed for larger environments with many mailboxes (Figure 5); it advertises good scaling capability, in particular for the fully supported commercial version, but a feature-stripped Open Source Edition [7] is also available.
One of the special features of MailArchiva is its comprehensive support for MS Exchange; like many other advanced features, this is only available in the Enterprise Edition. MailArchiva natively supports all Exchange versions and multiple Exchange Stores. Outlook users can access the archive directly using plugins from within the mail client.
MailArchiva also offers comprehensive support for many of the popular mail server flavors, such as Postfix, Sendmail, Qmail, iMail, Lotus Notes, AXIGen, CommuniGate Pro, Neon Insight, Zimbra, and Google Apps.
Clear-Cut Architecture
The archiving program, which runs on Windows, Linux, Solaris, BSD, and OS X, stores the messages, including all headers, in zipped archive files directly on the filesystem and thus does without a database. The files are encrypted using Triple-DES. To save hard disk capacity, attachments that occur in multiple mail messages are saved only once.
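Saving an attachment only once is commonly done by addressing stored objects by a digest of their content. The following shell sketch shows that general technique; it is not MailArchiva's storage format, the store_once function and the flat store directory are hypothetical, and sha256sum from GNU coreutils is assumed.

```shell
#!/bin/sh
# Hypothetical sketch of content-addressed deduplication;
# not MailArchiva's storage format.

# store_once STORE_DIR FILE
# Copies FILE into STORE_DIR under its SHA-256 digest unless an identical
# object is already present, then prints the path of the stored object.
store_once() {
    store=$1
    file=$2
    digest=$(sha256sum < "$file" | cut -d' ' -f1)
    object="$store/$digest"
    # Identical content yields the same digest, so it is stored only once.
    [ -e "$object" ] || cp -- "$file" "$object"
    printf '%s\n' "$object"
}
```

Each archived message would then reference the printed object path instead of carrying its own copy of the attachment.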
Archived data is organized into logical volumes, which can be segmented and stored on separate storage systems if desired. User authentication can be handled by OpenLDAP, Active Directory, or Google Apps for role-based access control.
Painless Installation
Installation has been neatly solved. In just three steps, administrators can set up a working system by unpacking the tarball after downloading, typing ./install to launch the setup routine, confirming the license, and answering the prompt for the Max Heap Size (256MB is fine for test operations). The installation routine automatically launches the main process. You can stop this later, or restart it using /etc/init.d/mailarchiva. After this, you can log in to the web console using the URL http://<Servername>:8090. For your initial login as the administrator, use the admin account and – contrary to what the documentation says – the automatically set admin password.
After the initial login, change the administrator's master password below the Login menu item; otherwise, the system will reject any administrative changes. In our lab, this only worked after I created an empty server.conf file in the /usr/local/mailarchiva/server/webapps/ROOT/WEB-INF/conf path. Next, create an encryption password in the Storage groups menu item; this is used for encrypting all archives with the Blowfish algorithm.
Fast Search
The web GUI offers much convenience. You can configure all the critical system parameters, define rules for archiving and retention, manage certificates, and start the integrated backup routine. The feature scope also includes integrated monitoring, which returns data in JMX format. The web GUI comes with an advanced search routine that can be distributed across multiple servers for performance benefits. In addition to the option of defining and storing your own queries, there are comprehensive export options for archived email, including the ability to generate reports in PDF format. As a special feature, you can integrate the search with Outlook and keep the familiar Outlook look and feel in doing so.
On Premises or in the Cloud
MailArchiva also offers an ISP Edition, which enables hosting for service providers and is designed to be multitenant capable. This edition also contains features for automated billing and underlines its claim of being a complete solution.
Conclusions
MailArchiva leaves nothing to be desired. It provides excellent feature scope in combination with support for very large environments and good scalability (Table 4). Additionally, small businesses will really appreciate the free edition.
Table 4: MailArchiva

| Manufacturer | MailArchiva (http://www.mailarchiva.de) |
|---|---|
| Price | MailArchiva can be tested for 45 days free of charge. Enterprises with 20 mailboxes or fewer can use the software free of charge. Licensing is on a per mailbox basis; at least 25 mailboxes must be ordered at a price of EUR559. Extension licenses can be purchased in steps of 10. For new purchases, a maintenance charge of 20 percent per annum is added for the first year. |
| Technical data | https://www.mailarchiva.com/enterprisefeatures?page=enterprise |
| Evaluation (max. 10 points) | |
| Installation overhead | 9 |
| Feature scope | 9 |
| User friendliness | 8 |
| Integration options | 8 |
| Documentation | 8 |
| Overall score | 8.4 |
The tests reveal that a solution is available for any budget and set of requirements. The feature scope of all the candidates presented here is huge, and the overhead for implementation and operation is typically manageable.