Antivirus and database systems (PostgreSQL in focus)
Many users are required to run virus scanners on all of their IT systems due to vendor requirements or working instructions. This general requirement can lead to problems for IT systems that are neither desktops nor simple file servers.
Usual interpretations of baseline security requirements demand virus scanning to be enabled with antivirus software available on all systems and signatures up-to-date. The compliance with those requirements can be controlled via monitoring.
However, problems will turn up if on-access scanners are active and interfere with the file system. The PostgreSQL developers even strongly discourage the usage of antivirus software!
What are (on-access) virus scanners and what are they needed for?
Most end-user software does not clearly differentiate between code and data. As a regular text file only contains data and no program code there is no necessity for parts of it to ever be interpreted or run as code. Unfortunately this separation is not practised widely, so many files like Office documents can contain both data and code. This results in an attack vector where attackers can hide code in data which is then executed unnoticed by the user.
Since the underlying problem is complex to solve and no positive tendency is apparent, virus scanners offer a relief of symptoms in many areas. For example, it is common practice and useful to validate and check incoming external data and untrustworthy user input. The input is compared with a blacklist of known attack patterns and an appropriate action is taken in the event of a match.
Even if new, tailor-made or unknown attacks cannot be detected by such systems, the cost-benefit calculation can make it worth the effort.
Working with the File System
If an application (A) wants to read or write data, the interaction with the storage medium (B) takes place via system calls. The application transfers the data to be written and receives a confirmation whether the data was written successfully or, in the event of an error, a corresponding error code.
How real-time scanners work
A real-time scanner or on-access scanner intervenes in this process and positions itself as a man-in-the-middle between application (A) and the hardware (B). The Linux kernel has been offering the fanotify API for some time now. However, different vendors often use their own, non-standardized procedures to intervene in the data stream. There are various possible intervention points - e.g. the applications themselves or the I/O libraries used - but interventions in the file system are also used.
The interception of I/O accesses makes them much more complex. Depending on the point of intervention and the scan engine in use, memory accesses become orders of magnitude slower. On office desktop systems or stateless application servers, there are often no noticeable effects due to low I/O and only few writes. On write-heavy systems, however, these effects can be severe. If many small chunks are to be written and not only the average data throughput is involved, but also the rapid processing of individual writes with low latencies, large losses are to be expected.
Database systems are particularly affected due to their special usage patterns and complex service functionality. In addition, CPU time is also required during the actual scanning process. Depending on the antivirus solution used, bottlenecks can also occur here.
Data security problems
If only the kernel APIs are used to intercept I/O requests, the theoretical error probability already increases as more code is executed and operations become more complex. In practice, this complicates debugging and troubleshooting, but it is unlikely to result in a significant loss of data security, assuming that the API is used correctly. However, fanotify is currently not the standard in most antivirus solutions.
By implementing their own I/O hooks a great potential for errors is introduced. For example, when determining the causes of a corrupt database system, we could see that the antivirus solution sometimes did not forward I/O errors from hardware and drivers.
- The database sends a write request to the file system (write 8k-block to disk)
write()is intercepted by the antivirus software
- The virus scanner checks the data (no virus found)
write()is passed on to the file system
- The file system transfers the data to hardware via the device driver
- A hardware problem occurs, the data cannot be written!
- The device driver notifies the file system about the error
- The file system notifies the virus scanner about the error
- The virus scanner fumbles the error and notifies a successful write to the database
- The Database notifies the client of the successful transaction
This way a hidden data loss was created that was only discovered once the users of the system reported errors in the data. Due to its non standard-conforming behaviour the virus scanner has sabotaged one of the core functionalities of the database system.
Contrary to other products, PostgreSQL separates between data and code. Entries in tables, even if in binary form, cannot be executed by design.
Due to this strict separation it is not necessary for PostgreSQL to check its own data for viruses. However, it might be useful for application software to verify its part of the data. If e.g. emails or files possibly containing code (like office documents) are stored in the database, it can be valuable to check these columns for known malicious software.
In most practical setups, a generic real-time scanner is used, which checks not only the required parts of the date, but uses on-access methods. This leads to all of the problems mentioned in the first part.
Malware found, what now?
To make matters worse, it is very difficult to process malware finds with external software. What should happen, for example, if a trojan is found?
For example on a desktop system, the affected file is usually to be quarantined or deleted. However, such behavior is not compatible with running a database server. Since the files are the physical representation of the database, this mechanism causes data corruption, which may, for example, manifest itself as a silent data loss or violation of conditions of consistency such as the failure to recognise duplicates.
In practice, it usually makes sense to only generate alarms which are then processed by an administrator. But what is the appropriate procedure if the virus scanner reported an infection of the file
/var/lib/postgresql/11/main/base/13090/1255 last night?
- Where is the infected code in the database?
- Where did it come from?
- Is the message even correct and not a false positive?
Are there already satisfying solutions?
The described status quo obviously does not meet all requirements. Is there a better solution for databases (PostgreSQL)?
Currently the most common method is to check in the application or external systems. Several vendors also offer network virus scanners that monitor and scan network traffic. But here, too, there are many of the conceptual weaknesses already mentioned, e.g. how a finding is processed. Intervention in the data stream can have massive side effects.
An approach to the problem
It would be desirable to handle the scanning for and treatment of malicious code directly in the database system. This should preferably be possible via SQL so that developers or DBAs can integrate the functionality into regular application operations. This would allow to precisely define which inputs are to be checked. Positive hits can then be reported via usual SQL error codes and would no longer endanger the service.
One possibility would be a table with a text column in which user entries are stored. Then one could check before
UPDATE whether known malicious code is present and decline the change via a
In order to show the feasibility of such an integrated antivirus solution, the author has published the pg_snakeoil PostgreSQL extension. It exposes the ClamAV functions to SQL and allows for malware scanning at full ACID-compliance.
Questions and Help
Do you have any questions or need help? Feel free to write to firstname.lastname@example.org.
This blog post originally appeared on the german credativ blog.