This project is no longer active.

Development of an antivirus scanner

Based on the open source (Java) virus scanner from the OpenAntiVirus project. They have a signature file with common virus patterns, free for everybody to use.

Because their current scanner is written in Java, it is slow to start up and therefore runs as a daemon to which you can send files via a socket.

It was our intention to write a fast C version of this scanner. After some discussion with Erik we decided to go with C++ because of the much saner string handling in that language.

OpenAntiVirus works by loading the patterns into memory in a tree-like structure (a trie) and scanning a file using that tree. This lets the scanner check all patterns in a single pass over the file, so the scan time is governed by the size of the file rather than by the number of patterns, which is very fast.
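The trie idea can be sketched in a few lines of C++. This is a minimal illustration, not the code of our scanner or of OpenAntiVirus: the names TrieNode, PatternTrie, add_pattern and scan are made up for this example, and the patterns are plain byte strings rather than entries from the real signature file. The sketch restarts the walk at every offset, so its cost grows with the file size times the longest pattern; a production scanner would typically add failure links (Aho-Corasick) to avoid the restarts.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <memory>
#include <string>
#include <vector>

// One node of the pattern trie; children are keyed by the next byte.
struct TrieNode {
    std::map<unsigned char, std::unique_ptr<TrieNode>> children;
    std::string virus_name;  // non-empty only where a pattern ends
};

// Hypothetical class for illustration only.
class PatternTrie {
public:
    // Insert one byte pattern and the name to report when it matches.
    void add_pattern(const std::string& pattern, const std::string& name) {
        assert(!pattern.empty() && !name.empty());
        TrieNode* node = &root_;
        for (unsigned char byte : pattern) {
            std::unique_ptr<TrieNode>& child = node->children[byte];
            if (!child) child = std::make_unique<TrieNode>();
            node = child.get();
        }
        node->virus_name = name;
    }

    // Return the name of every pattern occurrence found in data.
    std::vector<std::string> scan(const std::string& data) const {
        std::vector<std::string> found;
        for (std::size_t start = 0; start < data.size(); ++start) {
            const TrieNode* node = &root_;
            // Walk down the trie for as long as the bytes keep matching.
            for (std::size_t i = start; i < data.size(); ++i) {
                auto it = node->children.find(static_cast<unsigned char>(data[i]));
                if (it == node->children.end()) break;
                node = it->second.get();
                if (!node->virus_name.empty()) found.push_back(node->virus_name);
            }
        }
        return found;
    }

private:
    TrieNode root_;
};
```

All patterns share one structure, so adding more patterns makes the trie larger but does not add another pass over the file.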

Our first implementation was able to scan files at a rate of 50MB/s and found the EICAR test pattern. It could not unpack the different formats out there (i.e. the EICAR pattern was put in a .com file, which was gzip-ed, tarred, zoo-ed, rar-red, etc.).

A second goal was to create a library that other programs could link against in order to use the scanner. Before this could be implemented, however, we found another open source scanner, Clam antivirus, which is also based on the OpenAntiVirus signature file. This scanner is now (August 2002) at version 0.22 and has all the features we planned for our scanner. We have therefore decided not to develop our scanner any further. We are still pondering whether we should help with the development of Clam antivirus.

Interesting side projects

How can we provide a general layer between programs and files, so that all files are scanned before they are used?

This idea can be extended to the network. If, for instance, you want to scan a .zip file you are downloading, you cannot start scanning right away, because a .zip file needs some data that sits in the last part of the file. The only way to scan such a file is to download it completely, scan it, and then return it to the browser.
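To make the .zip problem concrete: the archive's index (the central directory) is located through an end-of-central-directory record, which starts with the bytes 'P' 'K' 0x05 0x06 and sits at the very end of the file. The sketch below shows how a scanner could test whether the data received so far already contains that record. The function name find_end_of_central_directory is hypothetical, and the sketch only looks for the signature; it does not validate the record or parse the archive.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Return the offset of the ZIP end-of-central-directory record in data,
// or -1 if no such record is present (for example, a partial download).
long long find_end_of_central_directory(const std::vector<std::uint8_t>& data) {
    const std::size_t kMinRecordSize = 22;  // fixed-size part of the record
    if (data.size() < kMinRecordSize) return -1;

    // Search backwards, because the record may be followed by an
    // archive comment of variable length.
    for (std::size_t pos = data.size() - kMinRecordSize + 1; pos-- > 0;) {
        if (data[pos] == 0x50 && data[pos + 1] == 0x4B &&
            data[pos + 2] == 0x05 && data[pos + 3] == 0x06) {
            return static_cast<long long>(pos);
        }
    }
    return -1;
}
```

While the download is still in progress this function returns -1, which is why the whole file has to be held back until the last bytes have arrived.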

Related to the questions above: how can the scanner communicate back to the program, in a general way, that a virus has been found? What should the user see if the .zip file contained a virus? And what should happen to the file?


C++ source code of the scanner. The code is distributed under the GPL license.

Wed Sep 25 2013

© Stichting NLnet Labs

Science Park 400, 1098 XH Amsterdam, The Netherlands, subsidised by NLnet and SIDN.