Wrapping data considerations
A shorter blog for today, as we wrap up the data considerations we began addressing in Part 3 of this series. We saw that there are no massive savings in data collection and storage when using the same tool for security, ITOM and APM use cases, because the data is not the same across these use cases. Today, more on that topic.
6. Different severity levels for the same logs from the same data source
In the syslog universe, another important parameter is used to request that data sources generate more or fewer logs, with more or less information in each log. This parameter is well known to data engineers and is called the “severity level”, also known as the verbosity level.
The higher the verbosity level, the more logs a data source will generate, and the more fields each log will contain. The 8 levels of severity, as standardized in the syslog protocol (RFC 5424), are:
- 0 Emergency: system is unusable
- 1 Alert: action must be taken immediately
- 2 Critical: critical conditions
- 3 Error: error conditions
- 4 Warning: warning conditions
- 5 Notice: normal but significant condition
- 6 Informational: informational messages
- 7 Debug: debug-level messages
“Debug mode” is extremely chatty and puts a burden on systems and networks, as it generates a very rich, information-heavy log every time anything happens. That means a lot of logs, each with many fields. It is very valuable when debugging a performance problem, but overkill for most security use cases.
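To make the trade-off concrete, here is a minimal sketch of severity-based filtering, assuming the standard syslog numbering (0 = Emergency through 7 = Debug); the threshold values and event list are illustrative, not taken from any particular tool:

```python
# Standard syslog severity levels (RFC 5424). Lower numbers are MORE severe.
SEVERITIES = {
    0: "Emergency", 1: "Alert", 2: "Critical", 3: "Error",
    4: "Warning", 5: "Notice", 6: "Informational", 7: "Debug",
}

def keep_event(event_severity: int, threshold: int) -> bool:
    """Keep an event only if it is at least as important as the threshold.

    Because lower numbers are more severe, a threshold of 6 (Informational)
    keeps levels 0-6 and drops Debug (7).
    """
    return event_severity <= threshold

# A security pipeline might stop at Informational, while an APM pipeline
# chasing a latency problem might need Debug-level telemetry as well.
security_threshold = 6   # Informational
apm_threshold = 7        # Debug

events = [3, 6, 7]  # Error, Informational, Debug
kept_for_security = [s for s in events if keep_event(s, security_threshold)]
kept_for_apm = [s for s in events if keep_event(s, apm_threshold)]
```

The point of the sketch: whichever single threshold a shared tool picks, one use case either drowns in Debug noise or loses the telemetry it needs.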
The sweet spot for security use cases is often “Informational”, but this level might not be verbose enough to get all the details and telemetry required to optimize the performance of a particular application and to deliver on APM use cases.
There is no ideal severity level that yields the right volume of logs with the right fields when trying to accommodate security, ITOM and APM use cases at once. In fact, this can be described in the now familiar Venn diagram below:
Once again, using the same tool for security, ITOM and APM does not yield major economies on collection and storage costs versus having dedicated tools. However, as we’ll see in the next blog, the added parsing complexity will drive up compute costs.
7. Different context enrichment sources
Raw data used by ITOM, APM/observability, and security tools often requires context enrichment from various sources. Consider the disparity in insights provided by these two descriptions of the same event generated by an endpoint agent:
1. Error code <123> from IP address 192.168.30.30
2. An authentication failure occurred (indicated by error code <123>) at the workstation with IP address 192.168.30.30. This workstation operates on Windows 11 version 22H2, is owned by John, and has a MAC address of d0:89:0d:8b:f6:22.
To transition from the first, less informative representation to the second, more detailed, context-rich one, raw event logs need to be context-enriched using various internal and external sources, via lookup and/or correlation. In this case, these sources would be:
- Error code table from the endpoint agent
- An asset inventory or CMDB, mapping the IP address to the OS version and MAC address
- An ownership or identity record linking the workstation to its owner, John
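The enrichment described above is essentially a set of joins against internal lookup tables. Here is a hypothetical sketch: the table contents mirror the example event, but the field names and table structures are illustrative, not from any specific product:

```python
# Illustrative lookup tables. In practice these would come from the endpoint
# agent's documentation and from an asset inventory / CMDB.
ERROR_CODES = {123: "authentication failure"}
ASSET_INVENTORY = {
    "192.168.30.30": {
        "os": "Windows 11 22H2",
        "owner": "John",
        "mac": "d0:89:0d:8b:f6:22",
    },
}

def enrich(raw_event: dict) -> dict:
    """Join a raw event against the lookup tables to add context."""
    enriched = dict(raw_event)
    enriched["description"] = ERROR_CODES.get(raw_event["error_code"], "unknown")
    enriched.update(ASSET_INVENTORY.get(raw_event["ip"], {}))
    return enriched

# The sparse raw event from description 1, enriched into description 2.
event = enrich({"error_code": 123, "ip": "192.168.30.30"})
```

Each distinct use case brings its own set of such tables, which is exactly why enrichment pipelines for security, ITOM and APM diverge.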
These sources may not be the ones that ITOM or APM use cases require. APM use cases, for example, may require visibility into the end-to-end operational model of the application being managed, along with latency telemetry from each device in the path.
The need for context enrichment is directly linked to the insights we aim to extract, and these insights are derived from the required use cases and outcomes. Different outcomes and use cases will demand unique context enrichment sources. User identity and role, for example, are critical to security use cases but less so to APM. If too many disjointed enrichment sources are used to serve unrelated use cases, the operational cost of the tool can rise, often offsetting any potential savings associated with a joint tool. It is therefore important to carefully consider the cost-benefit balance when deciding on toolsets.
The next blog will address analytics methods, yet another reason why organizations should focus their security tools on security use cases.