Logstash vs Fluentd – Which One Is Better?

When it comes to collecting and shipping logs to the Elastic Stack, we usually hear about ELK: Elasticsearch, Logstash and Kibana. ELK has almost become a synonym for the Elastic Stack. But in technology there is never a single perfect solution.

Fluentd is gaining popularity for logging in microservice environments (Docker/Kubernetes). One reason is that Fluentd, built by Treasure Data, is part of the CNCF, so it integrates well with other CNCF-hosted projects such as Kubernetes, Prometheus and OpenTracing.

Recently, I got a chance to evaluate these two log collectors for a project. It let me dive deep into the pros and cons of each and decide which is better for which use case. The purpose of this comparison is not to choose a winner but to find the one more suitable for your use case.

Logstash Vs Fluentd – Basic Comparison

Endorsing Company
Logstash: Part of the popular ELK stack.
Fluentd: Built by Treasure Data and part of the CNCF.
Comment: Fluentd also has excellent support for Elastic. For CNCF-hosted projects (e.g. Kubernetes, OpenTracing or Prometheus), Fluentd could be a better choice.

Code Language
Logstash: JRuby, so it requires a Java runtime on the host machine.
Fluentd: CRuby, so no Java runtime is required.
Comment: Fluentd has the advantage here, as it needs no Java runtime.

Event Routing
Logstash: Events are routed based on if-else conditions. E.g.
    output { if [loglevel] == "ERROR" and [deployment] == "production" { … } }
Fluentd: Events are routed based on tags. E.g.
    <match logtype.error> @type … </match>
Comment: Fluentd has the better routing approach, as it is easier to tag events than to write if-else conditions for each event type. If there are many different types of logs, Logstash filters can become difficult to manage.

Plugins
Logstash: About 200 plugins.
Fluentd: About 500 plugins.
Comment: Logstash keeps all plugins in the official git repo, while Fluentd still does not have a centralized repo for plugins.

Performance
Logstash: Consumes more memory.
Fluentd: Consumes less memory.
Comment: Fluentd has a slightly better reputation for performance. Both also have lightweight log shippers (such as Filebeat and Fluent Bit), so both can be very efficient at the collection side.

Transport
Logstash: Older releases are limited to an in-memory queue that holds 20 events (fixed size) and rely on an external queue like Redis or Kafka for persistence across restarts. Newer releases also offer a disk-backed persistent queue.
Fluentd: Has a highly configurable buffering system that can be in-memory or on-disk, with a large number of tuning parameters.
Comment: As discussed in a talk at OpenStack Summit 2015, both perform well in most use cases and consistently process 10,000+ events per second.

Enterprise Support
Logstash: Yes.
Fluentd: Yes.
Comment: Logstash has better enterprise support, as it is part of the official Elastic Stack.

Platform Support
Logstash: Both Linux and Windows.
Fluentd: Both Linux and Windows.
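
To make the tag-based routing idea from the comparison more concrete, here is a minimal Python sketch of how a tag can be matched against patterns in order, with the first match winning. It is only an illustration, not Fluentd's implementation: it models just the `*` (exactly one tag part) and `**` (zero or more tag parts) wildcards, and the `ROUTES` table and destination names are hypothetical.

```python
from typing import List, Tuple


def tag_matches(pattern: str, tag: str) -> bool:
    """Return True if a dot-separated tag matches a Fluentd-style pattern.

    Only two wildcards are modelled in this sketch:
      *   matches exactly one tag part
      **  matches zero or more tag parts
    """
    def match(p: List[str], t: List[str]) -> bool:
        if not p:
            return not t
        if p[0] == "**":
            # Let the double wildcard consume 0, 1, 2, ... tag parts.
            return any(match(p[1:], t[i:]) for i in range(len(t) + 1))
        if not t:
            return False
        if p[0] == "*" or p[0] == t[0]:
            return match(p[1:], t[1:])
        return False

    return match(pattern.split("."), tag.split("."))


def route(tag: str, routes: List[Tuple[str, str]]) -> str:
    """Return the destination of the first pattern that matches the tag."""
    for pattern, destination in routes:
        if tag_matches(pattern, tag):
            return destination
    return "unmatched"


# Hypothetical routing table: one entry per log type, checked top to bottom.
ROUTES = [
    ("logtype.error", "alerts"),
    ("logtype.*", "elasticsearch"),
    ("**", "archive"),
]

print(route("logtype.error", ROUTES))
print(route("logtype.info", ROUTES))
print(route("other.app.debug", ROUTES))
```

Adding a new log type here means adding one row to the table, whereas the if-else style needs another condition branch for every new combination of fields.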

Let's look at both from a use-case perspective.

Use Cases

Log Collection

Fluentd – Docker has a built-in logging driver for Fluentd. This means no additional agent is required in the container to push logs to Fluentd. Logs are shipped directly from STDOUT to the Fluentd service, and no additional log file or persistent storage is required.

Reference Link – https://docs.docker.com/config/containers/logging/fluentd/

Example – Add the section below to the service in the Docker Compose file:

logging:
  driver: "fluentd"
  options:
    fluentd-address: <fluentd IP>:<fluentd service port>
    tag: testservice.logs

Logstash – The application logs from STDOUT are captured by Docker and written to a file. The logs then have to be read from that file by a shipper such as Filebeat and sent to Logstash.

Log Parsing

Fluentd has standard built-in parsers such as json, regexp, csv, syslog, apache and nginx, as well as third-party parsers like grok to parse the logs.

Reference – https://docs.docker.com/config/containers/logging/fluentd/

Logstash has a larger number of plugins for filtering and parsing, such as aggregate and geoip, in addition to the standard formats.

Reference – https://www.elastic.co/guide/en/logstash/current/filter-plugins.html
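
To show what the parsing step in either tool boils down to, here is a small Python sketch that turns raw log lines into structured records, once for JSON lines and once with a regular expression using named groups. It is an illustration only: the `ACCESS_LINE` pattern is a simplified assumption for an access-log line and is not the expression used by any Fluentd or Logstash parser.

```python
import json
import re
from typing import Optional

# Illustrative pattern for a simplified access-log line (an assumption,
# not the exact expression shipped with Fluentd or Logstash).
ACCESS_LINE = re.compile(
    r'(?P<remote>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) \S+" (?P<code>\d{3}) (?P<size>\d+)'
)


def parse_json(line: str) -> Optional[dict]:
    """Parse a JSON log line; return None if it is not a JSON object."""
    try:
        record = json.loads(line)
    except json.JSONDecodeError:
        return None
    return record if isinstance(record, dict) else None


def parse_access(line: str) -> Optional[dict]:
    """Parse an access-log line via named groups; return None on no match."""
    m = ACCESS_LINE.match(line)
    if m is None:
        return None
    record = m.groupdict()
    # Convert numeric fields so they can be aggregated downstream.
    record["code"] = int(record["code"])
    record["size"] = int(record["size"])
    return record


sample = '127.0.0.1 - - [10/Oct/2000:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326'
print(parse_json('{"level": "ERROR", "msg": "boom"}'))
print(parse_access(sample))
```

Lines that do not match return `None`, which mirrors the decision both tools force you to make: what to do with events that fail to parse.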

Metric Data Collection

Fluentd doesn’t have an out-of-the-box capability to collect system/container metrics. It can, however, scrape metrics from a Prometheus exporter.

Logstash is typically paired with Metricbeat, which collects system/container metrics out of the box and forwards them to Logstash. Additionally, Logstash can scrape metrics from a Prometheus exporter.

Scraping

Fluentd has the http_pull plugin, which can pull data from HTTP endpoints such as metrics and health checks.

Logstash has the http_poller input plugin (supported by Elastic), which can pull data from HTTP endpoints.
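
As a rough picture of what such a pull-style input does, the Python sketch below polls an HTTP endpoint at a fixed interval and wraps each response in a tagged event. It is not the code of either plugin: the `poll` function, its parameters and the event fields are hypothetical names chosen for illustration, and the fetcher is injectable so the loop can be tried without a network.

```python
import time
import urllib.request
from typing import Callable, Iterator


def fetch(url: str, timeout: float = 5.0) -> str:
    """Fetch the response body of an HTTP endpoint as text."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def poll(url: str, tag: str, interval: float, count: int,
         fetcher: Callable[[str], str] = fetch) -> Iterator[dict]:
    """Yield one tagged event per poll, sleeping `interval` seconds between polls."""
    for i in range(count):
        try:
            body = fetcher(url)
            event = {"tag": tag, "url": url, "ok": True, "body": body}
        except OSError as exc:
            # urllib.error.URLError and socket timeouts are OSError subclasses.
            event = {"tag": tag, "url": url, "ok": False, "error": str(exc)}
        yield event
        if i < count - 1:
            time.sleep(interval)


# Usage with a stand-in fetcher, so no real endpoint is contacted.
def fake_fetcher(url: str) -> str:
    return '{"status": "up"}'


for event in poll("http://service.example/health", "health.check", 0, 2, fake_fetcher):
    print(event)
```

A failed request still produces an event, so an unreachable endpoint shows up in the log stream instead of silently disappearing.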

So which one to use?

Looking at the use cases above, it should be clear that Fluentd and Logstash each suit certain requirements. The best part is that both can coexist in the same environment, each handling the use cases it fits best.

For monolithic applications on traditional VMs, Logstash looks like the clear choice, as it supports multiple agents for collecting logs, metrics, health data and so on.

For microservices hosted on Docker/Kubernetes, Fluentd looks like a great choice given the built-in logging driver and seamless integration. It supports all commonly used parsers such as json, nginx and grok.

ELK-EFK Hybrid Architecture

In a hybrid environment, both can coexist and support their respective use cases. This way you get the best out of both!

Hope this clears up some confusion and helps you make a better decision!

We have a post to help you get started with EFK:

https://www.techmanyu.com/microservices-logging-using-efk/

Do share your thoughts and experience with us in the comments!

Comments


  1. Bryan Zochski

    Thanks for such detailed comparison. But I seriously doubt about the community support for Fluentd as compared to Logstash. Logstash has many more plugins and community support. Your views on that?
    Any link to help do a poc on EFK?