Splunk dabbles in edgy hardware, lowers data ingestion

Splunk has released a major update to its core data-crunching platform, emphasizing reductions in the quantity of data ingested and therefore the cost of operations. It also addresses a few security flaws that may not be fixable in earlier editions. The release is called Splunk 9.0. As explained to The Register by Splunk …

  1. Nate Amsden

    maybe a new feature...

    But I have been filtering and dropping data before it gets into Splunk for many years by sending it to the nullQueue via regular expressions in transforms.conf, a very well documented ability of Splunk.
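
    For readers who have not used it: the documented mechanism is a transforms.conf stanza with REGEX set to the pattern, DEST_KEY = queue and FORMAT = nullQueue, referenced from props.conf through a TRANSFORMS-<class> setting. The Python below is only an offline sketch of the same drop decision, handy for trying a pattern against sample lines. The pattern, the /healthcheck path and the sample events are hypothetical, and Python's re module is not identical to the PCRE engine Splunk uses, so treat a passing check as a first pass rather than proof.

```python
import re

# Hypothetical drop rule. In Splunk this would be one transforms.conf stanza
# (REGEX = <pattern>, DEST_KEY = queue, FORMAT = nullQueue).
DROP_PATTERNS = [
    # Successful load balancer health checks only; failures are kept.
    re.compile(r'"GET /healthcheck HTTP/1\.[01]" 200 '),
]

def filter_events(events):
    """Split events into (kept, dropped); dropped ones would never be indexed."""
    kept, dropped = [], []
    for event in events:
        if any(p.search(event) for p in DROP_PATTERNS):
            dropped.append(event)
        else:
            kept.append(event)
    return kept, dropped

# Made-up access log lines, for illustration only.
SAMPLE_EVENTS = [
    '10.0.0.5 - - [01/Jun/2018:10:00:00 +0000] "GET /healthcheck HTTP/1.1" 200 2',
    '10.0.0.9 - - [01/Jun/2018:10:00:01 +0000] "GET /cart HTTP/1.1" 200 5120',
    '10.0.0.5 - - [01/Jun/2018:10:00:02 +0000] "GET /healthcheck HTTP/1.1" 503 0',
]

if __name__ == "__main__":
    kept, dropped = filter_events(SAMPLE_EVENTS)
    print(f"kept {len(kept)}, dropped {len(dropped)}")
```

    Anchoring the pattern on the 200 status is deliberate: a failing health check is exactly the event you still want indexed.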

    Not only did it reduce the license costs, it also dramatically cut down on the sheer amount of crap going into the indexes, which added noise and made it harder to find things. Per my notes from 2018, simply removing the HTTP logs for successful load balancer health checks saved nearly 7 million events per day. Overall at that time I removed roughly 22 million events per day from our small indexers, which were then licensed for 100GB/day. Included in that were 1.5 million useless Windows event logs (these were more painful to write expressions for; one of the expressions is almost 1,500 bytes). We had only a handful of Windows systems, so it's absurd that they generated so many events! 95%+ Linux shop.

    The developers for our main app stack also liked to log raw SQL to the log files, which got picked up by Splunk. I killed that right away, of course (with the same method), when they introduced that feature. I also documented in the config the exact log entries that matched the expressions, to make future maintenance easier.
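
    One way to keep that kind of documentation honest is to store the sample entries next to each pattern and check them automatically. This is a minimal sketch under stated assumptions: the rule name, the pattern, the log format and the sample lines are all hypothetical, and Python's re module differs in places from the PCRE engine Splunk uses.

```python
import re

# Each hypothetical rule carries its pattern, the exact log entries it is
# meant to drop, and entries that must survive.
RULES = {
    "drop_raw_sql": {
        "pattern": r"^\S+ \S+ DEBUG .*\bSQL: ",
        "samples": [
            "2018-06-01 10:00:00 DEBUG OrderDao SQL: SELECT * FROM orders WHERE id = 42",
        ],
        "must_keep": [
            "2018-06-01 10:00:01 ERROR OrderDao SQL error: connection refused",
        ],
    },
}

def check_rules(rules):
    """Return a list of problems; an empty list means every sample behaves."""
    problems = []
    for name, rule in rules.items():
        regex = re.compile(rule["pattern"])
        for line in rule["samples"]:
            if not regex.search(line):
                problems.append(f"{name}: should drop but does not match: {line}")
        for line in rule["must_keep"]:
            if regex.search(line):
                problems.append(f"{name}: would wrongly drop: {line}")
    return problems

if __name__ == "__main__":
    for problem in check_rules(RULES):
        print(problem)
```

    Running a check like this before touching the live config catches a pattern that has drifted from the log format, or one that has grown greedy enough to eat real errors.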

    Don't get me wrong, it wasn't a fast process: it took many hours of work and time spent with regex101.com to work out the right expressions. It would be nice (maybe fixed now, not holding my breath) to be able to change the Splunk config files and not have to restart Splunk (instead perhaps just tell it to reload the configuration).

    VMware ESXi syslogs are the worst though. I have 59 regexes for those, which match 200+ different kinds of ESXi events; at least with ESXi 6, the amount of noise in the log files is probably in excess of 90%. vCenter has a few useful events, though I'd guesstimate the noise ratio there at 60-75% at least.

    I had been using nullQueue for the past decade but really ramped up usage in 2017/2018.

  2. Numen

    Offload?

    Too bad they can't offload the filtering/puck into a SmartNIC, maybe on entry to the Splunk server. Their "puck" project might be ideal for this. I'd certainly be glad to help with that!

  3. Numen

    Sounds like an offload would help

    It certainly looks like this filtering would fit into a SmartNIC very nicely. Their "puck" project sounds like a good fit for this. Or add filtering in a SmartNIC on entry to the Splunk server. I'd be glad to help with that!
