Understanding Apache httpd piped logs feature and its impact on performance #
Apache httpd has long been able to write log files through a pipe to another process, via the CustomLog directive of the mod_log_config module.
While this feature can be convenient, it introduces subtle risks that can impact web performance.
A slightly unusual stack #
Let’s imagine an ELK logging stack, just a bit out of the ordinary: due to the number of log sources and the unpredictability of log spikes, a “buffer” layer, leveraging RabbitMQ, is deployed in front of Elasticsearch. This is nothing unheard of and can be a life saver, provided it is handled correctly.
Now let’s stretch it a bit more and imagine that Logstash plays a dual role: it acts as the data aggregation and processing layer, moving logs from the buffer layer (RabbitMQ) to the indexing and storage layer (Elasticsearch), but it also acts as the data collection agent, running on every system and shipping logs to the buffer layer.
```mermaid
flowchart LR
  A["Logstash"] --> B["RabbitMQ"]
  B --> C["Logstash"]
  C --> D["Elasticsearch"]
  E["Kibana"] --> D
```
So, Logstash is both the log processor/filter and the producer. Why, you ask? Because why not? Logstash is perfectly fine at ingesting data: it is solid, it has a persistent queuing system, and it allows transforming events at ingestion time.
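To make the dual role concrete, the two halves could be wired together with the rabbitmq output and input plugins, roughly as in the sketch below (hostnames, exchange and queue names are placeholders, not the real setup):

```
# Shipper side (runs on every system): publish collected events to the buffer layer
output {
  rabbitmq {
    host => "rabbitmq.example.internal"   # placeholder hostname
    exchange => "logs"
    exchange_type => "direct"
    key => "httpd"
  }
}

# Indexer side (central): consume from the buffer and index into the storage layer
input {
  rabbitmq {
    host => "rabbitmq.example.internal"
    queue => "logs-httpd"                 # placeholder queue name
  }
}
output {
  elasticsearch {
    hosts => ["https://elasticsearch.example.internal:9200"]
  }
}
```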
An unusual httpd logging configuration #
Ok so you now have a working logging system, let’s start ingesting some logs!
But wait, the SOC team mandates that logs be collected in a way that prevents any form of tampering. That means you cannot write logs to a file and then ingest them, as something could modify the content before it is collected.
As you may know, Apache httpd writes its logs to files. That’s a bummer. However, piped logging could rescue us! This feature comes in very handy, especially because it adds flexibility to httpd without having to integrate external systems into the web server code.
It enables httpd to write access and error logs through a pipe to another process: the piped process is spawned by the parent httpd process at startup, and each child process writes log entries to that pipe, whose other end is connected to the piped process’s standard input.
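As a reference, the classic piped-log example from the Apache documentation rotates the access log through rotatelogs (the binary path and the rotation interval depend on the installation):

```
# Rotate the access log every 24 hours via a piped logger process
CustomLog "|/usr/local/apache/bin/rotatelogs /var/log/access_log 86400" common
```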
However, we cannot pipe to Logstash directly; we instead need some sort of channel that Logstash can listen on… Something like a socket… So why not use a udp socket?!
We can pipe to netcat and have it write the logs to a udp socket, which is handled via the Logstash udp input plugin.
```mermaid
flowchart LR
  A["httpd"] -->|pipe| B["nc"]
  B -->|udp| C["Logstash"]
  C --> D["RabbitMQ"]
  D --> E["Logstash"]
  E --> F["Elasticsearch"]
  G["Kibana"] --> F
```
The Logstash input config will look something like this:
```
input {
  udp {
    host => "127.0.0.1"
    port => xxxx
    codec => plain
  }
}
```
The httpd config will instead look like this:
```
CustomLog "|/bin/nc -u 127.0.0.1 xxxx" combined
```
The trouble begins #
One night, while on-call, you are woken by the sadly well-known pager ring tone: multiple endpoints are experiencing high response times.
You start analysing metrics and logs, but the sleep deprivation does not help at all: something is wrong with the httpd MPM workers, they are heavy on the CPU and have maxed out the MaxRequestWorkers count.
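For context, on an mpm_event setup the worker ceiling is defined by a block like the following; the numbers here are purely illustrative, not those of the incident:

```
<IfModule mpm_event_module>
    StartServers             3
    MinSpareThreads         75
    MaxSpareThreads        250
    ThreadsPerChild         25
    MaxRequestWorkers      400   # hard cap on concurrently serving workers
    MaxConnectionsPerChild   0
</IfModule>
```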
You decide to go all in and follow the old sysadmin mantra, “Have you tried turning it off and on again?”: you restart the httpd instances, the load average starts decreasing and response times are healthy again.
A mere couple of minutes of quiet, and then the realization: the mantra is fallacious and reveals itself for what it really is, a band-aid masking the underlying problem.
Suddenly, another alarm: this time it’s the RabbitMQ cluster that is struggling. Too many messages are coming in, queues are filling up and RabbitMQ flow control is stepping in.
The issue seems completely unrelated to the httpd hiccup, yet as soon as you identify the culprit of the high log rate and the queues start to empty, the web servers’ response times start to recover, this time for good.
You get back to bed, puzzled.
A post-mortem analysis #
What happened? What went wrong?
Well, apparently some RabbitMQ connections were publishing messages too quickly for the queues to keep up, so flow control applied its back pressure mechanism on them to avoid runaway memory growth. Not only connections but some queues too were put in the flow state, meaning the speed of the publishing connections was reduced so the queues could keep up. These queues were exactly the ones used for httpd logging.
Why this happened deserves a separate write-up, but it was indeed the smoking gun that triggered everything that followed.
As RabbitMQ shat itself and the publishing rate was forcibly reduced, the Logstash Memory Queues (MQ) filled up.
The upper bound on an MQ’s size is directly proportional to the product of the pipeline.workers and pipeline.batch.size Logstash parameters, which in a decently tuned Logstash installation are strictly related to CPU and memory availability.
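In logstash.yml terms, the relevant knobs look like this (values are illustrative; by default Logstash sets pipeline.workers to the number of CPU cores and pipeline.batch.size to 125):

```yaml
# logstash.yml -- illustrative values
pipeline.workers: 4
pipeline.batch.size: 125
# in-flight events (and thus the memory queue bound) ~= workers * batch size
```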
When a Memory Queue or even a Persistent Queue fills up, Logstash applies back pressure on the inputs to stall data flowing into the already full queue.
This mechanism helps Logstash control the rate of data flow at the input stage without overwhelming the output. It would really be helpful if the queue were filling up because too many logs are being produced; in our case, however, it is the output itself that is causing the queue to fill up.
As the back pressure is applied on the input, the behavior depends entirely on the chosen plugin: the udp input plugin, for instance, stops reading datagrams from the udp socket altogether. But hey, it’s udp in the end!
So, as Logstash stops reading datagrams, the udp socket’s receive buffer starts filling up towards its SO_RCVBUF limit. Once the buffer is full, incoming datagrams are dropped.
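On Linux, this kind of drop shows up in the kernel’s udp counters; a few lines of Python are enough to read them (a diagnostic aid, not part of the original setup):

```python
# Print Linux udp error counters (field names may vary slightly across kernels)
with open("/proc/net/snmp") as f:
    header, values = [line.split() for line in f if line.startswith("Udp:")]
stats = dict(zip(header[1:], values[1:]))
print("InErrors:", stats.get("InErrors"), "RcvbufErrors:", stats.get("RcvbufErrors"))
```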
In a normal scenario, reaching the receive buffer maximum size (rmem_max) on the receiver host could not cause any harm to the sender, udp being a connectionless protocol by design. On the contrary, filling the send buffer up to its maximum size (SO_SNDBUF, capped by wmem_max) can cause send() system calls to block if the udp socket is in blocking mode.
Obviously, as luck is clearly not on our side, nc creates a standard BSD socket, which is blocking by default.
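Both facts, the default blocking mode and the kernel-assigned buffer sizes, are easy to confirm with a few lines of Python (a quick local check, nothing more):

```python
import socket

s = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
print("blocking by default:", s.getblocking())                        # True
print("SO_SNDBUF bytes:", s.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF))
print("SO_RCVBUF bytes:", s.getsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF))
s.close()
```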
Now things get tricky: in this particular situation the sender and the receiver are the same host, so the Linux kernel zero-copy loopback stack enters the scene! That stack is Linux’s internal optimization for sending data to itself: when the kernel sees that both ends of a socket live in the same process or host, it short-circuits the data path, bypassing the NIC, the driver stack, checksum offload and IP routing. Data is enqueued in one socket and instantly dequeued from the other, resulting in near-RAM speed.
This behavior, however, has a subtle drawback: datagrams are delivered straight to the receive queue, and if the receive queue is full the udp socket behaves just as it would with a full send queue: it blocks send() syscalls.
Blocking these syscalls means that the nc process, used by httpd to pipe logs, blocks too.
httpd mpm workers behave synchronously: they read the http request, process it, send the response and then write the log entry by calling write() on the logging pipe. Only after completing these steps is the worker returned to the idle pool or killed. Every worker shares the same write end of the pipe, which is created by the parent httpd process at server startup; the piped process is restarted by httpd if it crashes.
So the httpd worker is writing to the write end of a buffered pipe whose read end is attached to a process (nc) that has stopped reading: the pipe’s buffer fills up, and once it is full httpd’s calls to write() block as well.
As the workers behave synchronously, they remain blocked in the logging phase, draining the mpm worker pool and eventually causing httpd to stop replying to requests entirely.
```mermaid
flowchart TD
  A["httpd worker"] -->|"write()"| B["Pipe (~64 KB)"]
  B --> C["nc process"]
  C -->|"send()"| D["udp socket (on loopback)"]
  D -->|"rcvbuf full"| C
  C -->|"stops reading stdin"| B
  B -->|"pipe full"| A
```
Full chain of causality #
- RabbitMQ queues fill up
- Back pressure is applied, reducing the publishing rate
- Logstash cannot send logs: memory queues fill up
- Input plugin is slowed down: the udp receive queue fills up
- nc blocks on send(), as the udp socket is in blocking mode and uses the zero-copy loopback stack
- nc stops reading from stdin: the pipe’s buffer fills up
- httpd mpm workers block on write(), waiting for room in the pipe’s buffer
- mpm worker pool is drained: httpd stops serving requests
Key insights #
The one reported above is a case of back pressure propagating through a pipeline, and although the causes are quite particular, it exposes a critical flaw in the inner workings of the Apache httpd piped logs feature.
httpd logs synchronously in the worker thread #
- Every request-handling worker must write its log line before finishing the request.
- Logging is not asynchronous, not delegated, and not buffered elsewhere.
- The worker calls the write() system call directly.
Consequence: if the log write blocks, the request’s worker blocks.
httpd piped loggers use a single shared pipe #
- A single piped logger process is started at httpd startup.
- A single pipe is created.
- All workers share the same write-end fd.
Consequence: this sets up a single chokepoint for all logging.
Pipes in Linux block writers when full #
- Linux pipes have limited capacity (~64 KB by default).
- If the piped process does not read, the pipe’s buffer becomes full (see the sketch below).
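A small Python sketch illustrates the capacity limit; the write end is made non-blocking only so we can count bytes instead of hanging, which is exactly what a blocking writer like an httpd worker would do at that point:

```python
import fcntl
import os

r, w = os.pipe()
# Make the write end non-blocking so the demo measures capacity instead of hanging.
flags = fcntl.fcntl(w, fcntl.F_GETFL)
fcntl.fcntl(w, fcntl.F_SETFL, flags | os.O_NONBLOCK)

written = 0
try:
    while True:
        written += os.write(w, b"x" * 4096)
except BlockingIOError:
    # The pipe is full: a blocking writer would stop right here.
    pass
print(f"pipe accepted {written} bytes before filling up")  # typically 65536 on Linux
```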
nc uses blocking udp send calls #
- Performs blocking send() syscalls.
- Does not drop datagrams when the receiver is slow.
Consequence: nc itself can block even before the pipe is full.
nc stops reading from its stdin when blocked #
- When blocked on any syscall, it stops reading anything from stdin.
Loopback udp can block when the receive queue is full #
Contrary to the myth “udp never blocks”, it can block when:
- socket is in blocking mode;
- traffic is on loopback;
- receiver’s socket receive queue is full.
Conclusion #
Whoever designed this logging stack probably picked udp because it is believed to avoid blocking behavior and to behave simply enough to be predictable and cause no harm to any other component of the stack.
This is largely true, and as a matter of fact udp itself was not the true culprit of the issue; that can instead be found in the httpd piped logs feature: the pipe, not udp, caused httpd to stop serving requests.
Piped logging is convenient, but it can be dangerous under load because mpm workers depend on the logger’s ability to drain its pipe. Any slowdown cascades directly into request handling.
For high-reliability systems, logs should be buffered, queued and handled asynchronously, never synchronously in the worker path.
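One possible mitigation, sketched below purely as an illustration (the relay script, its destination and the drop-on-pressure policy are assumptions of mine, not the fix adopted here), is to put a small non-blocking relay between httpd and the network: it reads log lines from stdin and forwards them over a non-blocking udp socket, shedding lines when the kernel cannot accept more instead of stalling the worker pool.

```python
#!/usr/bin/env python3
"""Hypothetical piped-log relay: httpd pipes log lines to this script,
which forwards them over udp without ever blocking the writer."""
import socket
import sys

DEST = ("127.0.0.1", 5140)  # placeholder address of the Logstash udp input

sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
sock.setblocking(False)  # never block, even if socket buffers fill up

dropped = 0
for line in sys.stdin.buffer:
    try:
        sock.sendto(line, DEST)
    except (BlockingIOError, OSError):
        dropped += 1  # shed load rather than propagate back pressure to httpd
```

httpd would then point its CustomLog pipe at such a relay instead of nc, trading occasional log loss for request-serving stability.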
Now you can really get back to sleep!