Chapter 3: Collecting and Processing Forensic Data

This chapter discusses the actions involved in gathering and
standardizing forensic data for cyberthreat management purposes.

Data Generation:

1. Understand the need to collect data from a wide variety of
sources and the role each data source plays
2. Get a glimpse of the behind-the-scenes processes that
transform raw data into standardized data and metadata ready
for analysis
3. Learn important considerations for long-term data archiving

Enterprise security controls:

Do your homework before reconfiguring security event logging
capabilities on enterprise security controls, endpoint software,
or other technologies. A single change could greatly increase
logging, overwhelming local logs and even the centralized log
management infrastructure.

Examples of enterprise security controls include:

1. Vulnerability remediation, such as vulnerability
management and patch management software
2. Attack detection, including antivirus software and
intrusion prevention systems
3. Network technologies, such as firewalls, virtual
private networking, and remote access solutions
4. Identity and access management technologies
5. Data loss prevention (DLP) and other exfiltration
detection solutions

Endpoint software

Endpoint software is also important for providing context for
security events. Endpoint software often has the greatest visibility into the endpoint’s configuration and use, so endpoint
logs may provide context for better understanding the significance of a particular event.

Network flow data

The analysis of network flow data is an important part of
threat management because it can indicate both policy
violations and significant deviations from typical patterns.
This traffic may come from compromised systems, major data
exfiltration attempts, and other serious problems that might
otherwise go undetected because of traffic encryption or other
ways of avoiding content analysis.
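
One simple way to picture "significant deviations from typical patterns" is a z-score test on per-host traffic volume. The sketch below is only an illustration under that assumption; the function name `is_anomalous`, the threshold of 3.0, and the sample byte counts are hypothetical and do not come from the guide.

```python
from statistics import mean, pstdev

def is_anomalous(history, current, threshold=3.0):
    """Return True if `current` is far from the host's typical volume.

    `history` is a list of past byte counts for one host (hypothetical data).
    """
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        # No variation in the baseline: any change at all is a deviation.
        return current != mu
    return abs(current - mu) / sigma > threshold

# Hypothetical hourly outbound byte counts for one workstation.
baseline = [100, 110, 90, 105, 95]
normal_hour = is_anomalous(baseline, 104)
suspicious_hour = is_anomalous(baseline, 5000)
```

Note that this check works on traffic volume alone, so it still applies when the payload is encrypted and content analysis is not possible.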

Asset data

A final important source of data for threat management is
information on the organization’s IT assets. For example,
knowing the role of each system (server, client, network
infrastructure, etc.) may help in prioritizing response efforts.
Other asset data that may be useful for each system includes
the installed software, the primary user or administrator, and
the relative importance of the system.

Data Transfer

The data being generated about ongoing security events needs
to be transferred to a central location for threat management
purposes. This isn’t a simple matter of replicating all log data.
Issues include keeping bandwidth usage at reasonable levels
while transferring all necessary information, and ensuring the
confidentiality, integrity, and availability of the log data while
it’s being transferred.

Techniques for keeping bandwidth usage reasonable include:

1. Event aggregation, which involves replacing a
number of related log entries with a single new entry
2. Reduction, which is intentional and automatic
dropping of unnecessary events or event fields
3. Compression, which is storage of the entries from
a log in an alternate format that uses less space
without any loss of information
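
The three techniques above can be sketched in a few lines of Python. This is a minimal illustration, not how any particular SIEM implements them: the entry fields (`src`, `event`, `debug`) and the choice of gzip over JSON are assumptions made for the example.

```python
import gzip
import json
from collections import Counter

# Hypothetical, already-parsed log entries; field names are illustrative.
entries = [
    {"src": "10.0.0.5", "event": "login_failed", "debug": "x1"},
    {"src": "10.0.0.5", "event": "login_failed", "debug": "x2"},
    {"src": "10.0.0.5", "event": "login_failed", "debug": "x3"},
    {"src": "10.0.0.9", "event": "heartbeat", "debug": "x4"},
]

def reduce_entries(entries, drop_events=("heartbeat",), drop_fields=("debug",)):
    """Reduction: drop unnecessary events and unnecessary fields."""
    return [
        {k: v for k, v in e.items() if k not in drop_fields}
        for e in entries
        if e["event"] not in drop_events
    ]

def aggregate(entries):
    """Aggregation: replace related entries with one entry plus a count."""
    counts = Counter((e["src"], e["event"]) for e in entries)
    return [{"src": s, "event": ev, "count": n} for (s, ev), n in counts.items()]

def compress(entries):
    """Compression: lossless gzip of the JSON-serialized entries."""
    return gzip.compress(json.dumps(entries).encode("utf-8"))

reduced = reduce_entries(entries)
aggregated = aggregate(reduced)
payload = compress(aggregated)

# Decompression recovers exactly what was compressed.
restored = json.loads(gzip.decompress(payload).decode("utf-8"))
```

Reduction and aggregation discard detail on purpose, so they should be applied only to data the analysis does not need; compression is the only one of the three that is fully reversible.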


The transfer must also protect three properties of the log data:

1. Confidentiality. Log data often contains highly
sensitive information that must be inaccessible to
any unauthorized personnel monitoring network traffic.
2. Integrity. Attackers would love to modify log data
to conceal their nefarious activities.
3. Availability. Log data must not be lost; for
example, if a network interruption occurs, log data
transfers must resume shortly after the interruption
ends without losing any of the data awaiting transfer.
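
As a rough sketch of the integrity and availability points, the example below tags each batch with an HMAC so tampering can be detected, and keeps batches queued until a send succeeds. The class `TransferQueue`, the `send` callback, and the shared key are hypothetical names introduced for illustration. Confidentiality is not shown; in practice it would typically come from an encrypted channel such as TLS.

```python
import hashlib
import hmac
from collections import deque

# Assumption: a secret key shared by sender and collector, provisioned out of band.
SHARED_KEY = b"example-key-change-me"

def sign(batch: bytes, key: bytes = SHARED_KEY) -> str:
    """Integrity: compute an HMAC-SHA256 tag for a batch of log data."""
    return hmac.new(key, batch, hashlib.sha256).hexdigest()

def verify(batch: bytes, tag: str, key: bytes = SHARED_KEY) -> bool:
    """Check a received batch against its tag in constant time."""
    return hmac.compare_digest(sign(batch, key), tag)

class TransferQueue:
    """Availability: hold batches until the collector accepts them."""

    def __init__(self):
        self.pending = deque()

    def enqueue(self, batch: bytes) -> None:
        self.pending.append(batch)

    def flush(self, send) -> int:
        """Send pending batches in order; stop at the first failure.

        `send(batch, tag)` is a hypothetical callback returning True on success.
        Unsent batches stay queued for the next flush.
        """
        sent = 0
        while self.pending:
            batch = self.pending[0]
            if not send(batch, sign(batch)):
                break
            self.pending.popleft()
            sent += 1
        return sent
```

A caller would invoke `flush` periodically; during a network interruption it returns 0 and nothing is lost, and once the link is back the queued batches go out in their original order.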

Most security compliance initiatives require organizations to
perform extensive security auditing and to report on the results
of these audits. SIEMs typically provide built-in support for
conducting audits and generating audit reports for all major
security compliance initiatives. This can be a big time saver
for organizations that have compliance needs.

Data Normalization

Data normalization is a complex process that converts log
data from its original format to a descriptive standardized
format to facilitate its use with search and machine analytics.
Normalization takes away the overhead and errors involved
in attempting to interpret thousands of log formats.

Extraction of key data fields

Logs come in many formats, from comma-delimited text files
to proprietary binary formats. The first step in normalization
is to parse each log, identify the significance of each data field,
and extract the values from the data fields that the SIEM needs.
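
To make the parsing step concrete, here is a small sketch that extracts fields from two text formats, a comma-delimited line and a `key=value` line, into the same dictionary shape. The function names, field names, and sample lines are assumptions for illustration; real SIEM parsers cover far more formats, including binary ones.

```python
import csv
import io
import re

def parse_csv_line(line, field_names):
    """Parse one comma-delimited log line using a known column order."""
    row = next(csv.reader(io.StringIO(line)))
    return dict(zip(field_names, row))

# Matches key=value pairs where the value is either quoted or a single token.
KV_PATTERN = re.compile(r'(\w+)=("[^"]*"|\S+)')

def parse_kv_line(line):
    """Parse one key=value log line, removing quotes around values."""
    return {key: value.strip('"') for key, value in KV_PATTERN.findall(line)}

# Hypothetical sample lines from two different sources.
csv_event = parse_csv_line(
    "2024-01-05 10:00:00,10.0.0.5,login_failed", ["time", "src", "event"]
)
kv_event = parse_kv_line('src=10.0.0.5 event=login_failed msg="bad password"')
```

Both results expose `src` and `event` under the same keys, which is what lets later stages treat the two sources uniformly.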

Standardization of values

After the SIEM extracts values from logs, these values need to
be standardized so they’re represented consistently. Suppose
that one log format stores source IP addresses as text fields
and another stores them as hexadecimal values; the SIEM must
convert both to a single representation before the values can
be compared. Many other representations are also possible.
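
The IP address case can be sketched with Python's standard `ipaddress` module. The function name `normalize_ipv4` is hypothetical, and the sketch assumes hexadecimal values carry a `0x` prefix; a bare hex string would need a per-source rule, because it cannot be told apart from other encodings by inspection alone.

```python
import ipaddress

def normalize_ipv4(value: str) -> str:
    """Return the dotted-decimal form of an IPv4 address.

    Accepts dotted-decimal text or a 0x-prefixed hexadecimal value
    (the prefix is an assumption of this sketch).
    """
    text = value.strip()
    if text.lower().startswith("0x"):
        return str(ipaddress.IPv4Address(int(text, 16)))
    return str(ipaddress.IPv4Address(text))

from_text = normalize_ipv4(" 10.0.0.5 ")
from_hex = normalize_ipv4("0x0A000005")
```

Both calls yield the same string, so entries from the two sources can be matched on source address. Invalid input raises `ValueError`, which is a subclass that `ipaddress` uses for malformed addresses.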

Timestamp standardization

Timestamps are a special case of value standardization. The
SIEM converts them to a single standard representation, just
as it does with other values extracted from data fields, but
the SIEM also usually performs additional normalization
on timestamps to ensure they’re accurate and consistent.
Examples are ensuring that all timestamps represent the time
of day using the same time zone, and correcting for any known
inaccuracies in a source’s clock. Timestamps must reflect the
actual time when each event occurred so they can be put in
the proper sequence and analytics can be performed on that
sequence.
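
Both adjustments, a common time zone and a correction for a known clock error, can be sketched with the standard `datetime` module. The function name and its parameters are hypothetical: `utc_offset_hours` is the source's offset from UTC, and `clock_skew_seconds` is how far the source's clock is known to run ahead.

```python
from datetime import datetime, timedelta, timezone

def normalize_timestamp(raw, fmt, utc_offset_hours=0.0, clock_skew_seconds=0.0):
    """Convert a source-local timestamp string to an ISO 8601 UTC string."""
    local = datetime.strptime(raw, fmt)
    # Attach the source's time zone to the naive parsed value.
    source_tz = timezone(timedelta(hours=utc_offset_hours))
    aware = local.replace(tzinfo=source_tz)
    # Subtract the amount by which the source clock runs ahead.
    corrected = aware - timedelta(seconds=clock_skew_seconds)
    return corrected.astimezone(timezone.utc).isoformat()

# A source in UTC-5 whose clock is known to be 30 seconds fast.
normalized = normalize_timestamp(
    "2024-01-05 10:00:30",
    "%Y-%m-%d %H:%M:%S",
    utc_offset_hours=-5,
    clock_skew_seconds=30,
)
# normalized == "2024-01-05T15:00:00+00:00"
```

A fixed offset ignores daylight saving time; a source that observes it would need a named zone (for example via the standard `zoneinfo` module) instead.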

Event classification

Perhaps the most important aspect of data normalization is
event classification. A SIEM can greatly enrich the value of log
data by determining what type of event is represented by each
log entry or group of entries. This allows the SIEM to better
understand the significance of each event and what impact an
event or sequence of events may have.
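
A very simple form of event classification is an ordered list of pattern rules, where the first match assigns the event type. The rule set, labels, and sample messages below are invented for illustration; commercial SIEMs rely on much larger, vendor-maintained taxonomies.

```python
import re

# Hypothetical rules: (event type, pattern). The first matching rule wins.
RULES = [
    ("authentication_failure",
     re.compile(r"failed (password|login)|authentication failure", re.I)),
    ("authentication_success",
     re.compile(r"accepted password|login succeeded", re.I)),
    ("malware_detected",
     re.compile(r"virus|malware|trojan", re.I)),
]

def classify(message: str) -> str:
    """Return the event type for a log message, or 'unclassified'."""
    for label, pattern in RULES:
        if pattern.search(message):
            return label
    return "unclassified"

labels = [
    classify("Failed password for root from 10.0.0.5"),
    classify("Accepted password for alice from 10.0.0.7"),
    classify("Disk usage at 91 percent"),
]
```

Once entries carry a type, later analysis can reason about sequences, such as many `authentication_failure` events followed by an `authentication_success` from the same source.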

Data Archiving

Data archiving is the process of moving data from the SIEM’s
primary centralized storage to secondary storage, such as a
storage area network (SAN). Unlike data generation, transfer,
and normalization, which all usually occur before security
analytics, data archiving happens after security analytics have
been applied.

Collected from Definitive Guide™ to Security Intelligence and Analytics

Shajib Mahmud
