Discover millions of ebooks, audiobooks, and so much more with a free trial

Only $11.99/month after trial. Cancel anytime.

Logging and Log Management: The Authoritative Guide to Understanding the Concepts Surrounding Logging and Log Management
Logging and Log Management: The Authoritative Guide to Understanding the Concepts Surrounding Logging and Log Management
Logging and Log Management: The Authoritative Guide to Understanding the Concepts Surrounding Logging and Log Management
Ebook831 pages9 hours

Logging and Log Management: The Authoritative Guide to Understanding the Concepts Surrounding Logging and Log Management

Rating: 4 out of 5 stars

4/5

()

Read preview

About this ebook

Logging and Log Management: The Authoritative Guide to Understanding the Concepts Surrounding Logging and Log Management introduces information technology professionals to the basic concepts of logging and log management. It provides tools and techniques to analyze log data and detect malicious activity. The book consists of 22 chapters that cover the basics of log data; log data sources; log storage technologies; a case study on how syslog-ng is deployed in a real environment for log collection; covert logging; planning and preparing for the analysis log data; simple analysis techniques; and tools and techniques for reviewing logs for potential problems. The book also discusses statistical analysis; log data mining; visualizing log data; logging laws and logging mistakes; open source and commercial toolsets for log data collection and analysis; log management procedures; and attacks against logging systems. In addition, the book addresses logging for programmers; logging and compliance with regulations and policies; planning for log analysis system deployment; cloud logging; and the future of log standards, logging, and log analysis. This book was written for anyone interested in learning more about logging and log management. These include systems administrators, junior security engineers, application developers, and managers.
  • Comprehensive coverage of log management including analysis, visualization, reporting and more
  • Includes information on different uses for logs -- from system operations to regulatory compliance
  • Features case Studies on syslog-ng and actual real-world situations where logs came in handy in incident response
  • Provides practical guidance in the areas of report, log analysis system selection, planning a log analysis system and log data normalization and correlation
LanguageEnglish
PublisherSyngress
Release dateDec 31, 2012
ISBN9781597496360
Logging and Log Management: The Authoritative Guide to Understanding the Concepts Surrounding Logging and Log Management
Author

Kevin Schmidt

Kevin J. Schmidt is a senior manager at Dell SecureWorks, Inc., an industry leading MSSP, which is part of Dell. He is responsible for the design and development of a major part of the company’s SIEM platform. This includes data acquisition, correlation and analysis of log data. Prior to SecureWorks, Kevin worked for Reflex Security where he worked on an IPS engine and anti-virus software. And prior to this he was a lead developer and architect at GuardedNet, Inc.,which built one of the industry’s first SIEM platforms. Kevin is also a commissioned officer in the United States Navy Reserve (USNR). Kevin has over 19 years of experience in software development and design, 11 of which have been in the network security space. He holds a B.Sc. in computer science.

Read more from Kevin Schmidt

Related to Logging and Log Management

Related ebooks

Information Technology For You

View More

Related articles

Reviews for Logging and Log Management

Rating: 4 out of 5 stars
4/5

2 ratings0 reviews

What did you think?

Tap to rate

Review must be at least 10 words

    Book preview

    Logging and Log Management - Kevin Schmidt

    Inc.

    Preface

    Welcome to Logging and Log Management: The Authoritative Guide to Understanding the Concepts Surrounding Logging and Log Management. The goal of this book is to provide you, the Information Technology (IT) Professional, with an introduction to understand and dealing with log data. Log data comes in many forms and is generated by many types of systems. A long-running problem is what one should do with all this log data and how to analyze it. This book presents techniques and tools which can help you analyze your log data and find malicious activity.

    It used to be that system administrators perused log files to look for disk errors or kernel panics. Today system administrators often time do double duty as system administrators and security administrators. The need to better understand what to do with security log data has never been more important. Security analysts are among the group of IT professionals who must also keep up with log analysis techniques. Many seasoned veterans have learned under trial by fire mode. This books aims to distill down what many people have taken years to learn by presenting material in a manner which will allow you to understand the concepts quickly.

    Let’s talk about an issue that has recently come to the forefront: regulatory compliance. With the corporate oversight debacle that was Enron and others, regulatory compliance is now a central theme for many corporate entities. The focus is now on policy and procedure. Can you, as an IT engineer, show that Bob was unable to access his corporate email account after he was let go? These are the sorts of things that companies are being asked to prove. The system and network logging landscape is changing in these and many other ways.

    Intended Audience

    The audience for this book is anyone who is interested in learning more about logging and log management. Here are some profiles of individuals who should read this book.

    System Administrator: You may be a system administrator who just inherited the task of monitoring log data for your enterprise.

    Junior Security Engineer: Maybe you’re a newcomer to network security and want to learn about log analysis techniques.

    Application Developer: Maybe you are interested in building a log analysis system from the ground up. This book provides example code for doing just that. The entire book, however, provides excellent background for why log analysis is important. These areas should not be skipped.

    Manger: Managers can gain great insights into topics like log data collection, storage, analysis and regulatory compliance. As previously mentioned, these issues are evermore present in the corporate landscape and will continue to be where IT folks focus much of their time.

    Prerequisites

    It is assumed that you have a basic understanding of concepts like networking, operating systems, and network security. However, you don’t need to be a computer scientist or networking guru to understand the material in this book. Topics which require background information are presented with necessary detail. The Perl and Java programming languages are used to present most code examples. You don’t have to be a Java guru to understand or follow the code samples, so I encourage everyone to at least look over the examples when they come up.

    Organization of the Book

    The format of this book is one that builds upon each previous chapter. But having said this, many of the chapters can be read as one-offs. There are 22 chapters in this book.

    Chapter 1: Logs, Trees, Forest: The Big Picture

    Chapter 1 provides background information on logging systems. If you are familiar with concepts like Syslog, SNMP, secure logging, log data collection, storage, etc., and such then you can safely skip this chapter.

    Chapter 2: What is a Log?

    Chapter 2 Takes time to describe what a log message is. Discussions include why logs are important.

    Chapter 3: Log Data Sources

    Chapter 3 describes the Syslog protocol, SNMP, and the Windows Event Log. Additionally classes of log data sources are presented. This chapter will get you comfortable with

    Chapter 4: Log Storage Technologies

    This is a great chapter if you want to learn more about log retention, storage formats, and storing logs in a relational database management system (RDBM). We even present examples of how Hadoop can be used for this endeavor.

    Chapter 5: Case Study: syslog-ng

    This chapter provides insight into how syslog-ng is deployed in a real environment for log collection. We also discuss some of the more advanced features of syslog-ng.

    Chapter 6: Covert logging

    If you have the need to collection logs in a covert manner, this chapters provides lots of detail on how to accomplish this task.

    Chapter 7: Analysis Goals, Planning and Preparation: What Are We Looking for?

    Before you begin analyzing log data, you first need to set goals, plan and prepare for the task at hand. Topics covered include looking for past bad things, future bad things, and never before seen things.

    Chapter 8: Simple Analysis Techniques

    Before discussing advanced analysis techniques, the basics need to be covered. This includes manual log analysis and the tools which enable this. In addition to this, we discuss an advanced tool which can make reading Windows Event Logs much easier. And of course we discuss the process of responding to the results of log analysis.

    Chapter 9: Filtering, Matching and Correlation

    This is an action packed chapter. Chapter 9 presents techniques and tools which can help you perform correlation in order to help find issues which simple manual log analysis may overlook. Topics covered include filtering, normalization, taxonomy, correlation, and some common patters for which to look. Two valuable sections in this chapter are for developers who are interested in building their own correlation engine. Jess and Esper are covered to show to build a rules-based and stream-based engine.

    Chapter 10: Statistical Analysis

    Chapter 10 discusses how statistics can be used to perform analysis. Frequency counting, baselines, thresholds, anomaly detection are covered. We even present how machine learning can be used for analysis.

    Chapter 11: Log Data Mining

    This chapter is devoted to log mining or log knowledge discovery – a different type of log analysis, which does not rely on knowing what to look for. This takes the high art of log analysis to the next level by breaking the dependence on the lists of strings or patterns to look for in the logs.

    Chapter 12: Reporting and Summarization

    Chapter 12 looks at reporting as a way of log analysis. We specifically focus on trying to define what the best reports are for log data.

    Chapter 13: Visualizing Log Data

    It is often useful to visualize log data. By visualize we don’t mean viewing alerts, emails, or whatever your particular log analysis system may emit. What we are more interested in discussing is viewing log data in the context of directed graphs and such.

    Chapter 14: Logging Laws and Logging Mistakes

    This chapter is about logging laws and log mistakes. Specifically, it covers common mistakes organizations have made (and, in fact, are making) with logs. It also covers some of the general rules and dependencies – perhaps too ambitiously labeled laws – that govern how organizations deal with logs.

    Chapter 15: Tools for Log Analysis and Collection

    This chapter provides a review of open source and commercial toolsets available for the analysis and collection of log data. The review will provide the reader many options to choose from when picking a toolset to manage log data on a daily basis. Examples of using the tools for log analysis are interspersed within the contents of the chapter with real-world examples of using the tools to review common logging tasks and scenarios. The chapter will help the reader review the set of tools available today and hopefully find the right tool for the job to help the reader get started analyzing logs in their organization today.

    Chapter 16: Log Management Procedures: Escalation, Response

    This chapter provides and introduction to log review, response and escalation for log management. Examples using Payment Card Industry (PCI) Data Security Standard (DSS) will be a running theme throughout this chapter. The idea is to illustrate how to apply the concepts in the real-world. This means examples are geared toward PCI standards, but they can easily be adapted and extended to fit any environment. In essence, this chapter develops a set of steps and procedures which you can begin using today. An added side benefit of this chapter is insight into interpreting and applying standards to log management.

    Chapter 17: Attacks Against Logging Systems

    This chapter covers attacks against logging, log analysis systems and even log analyst that can disrupt the use of logs for security, operation and compliance.

    Chapter 18: Logging for Programmers

    This chapter will be useful for programmers of all kinds. This includes system administrators, Perl programmers, C/C++ programmers, Java programmers, and others. Basically, anyone who writes scripts, programs, or software systems, will benefit from the content in this chapter. It has often been said that bad log messages are the result of bad programmers. While this is not entirely true, this chapter aims to change this notion by providing concepts and guidance to programmers and others on how better log messages can be produced. More ultimately, this will help with debugging, information gathering, parse-ability and increase overall usefulness of the log messages their software generates.

    Chapter 19: Logs and Compliance

    This chapter is about logging and compliance with regulations and policies. Chapter 19 will be a value to anyone who has to contend with regulatory compliance will want to read this chapter.

    Chapter 20: Planning Your Own Log Analysis System

    This chapter will provide practical guidance on how to plan for the deployment of a log analysis system. The chapter is not meant to provide a detailed blueprint on how to install any particular log analysis system. Instead material is presented so that you can apply the concepts to any log analysis deployment situation in which you find yourself. It will arm you with questions to ask and items to consider during any such undertaking.

    Chapter 21: Cloud Logging

    Cloud computing is a hot topic right now. And it’s only getting hotter. As we see traditional shrink-wrapped software migrate from company-owned data centers to the cloud (it’s already happening), the opportunity for IT managers to spend less capital expenditure (CAPEX) on hardware, switches, racks, software, etc., to cover things like log data collection, centralization and storage, and even Security Information and Event Management (SIEM) will be greatly reduced. This chapter introduces what cloud computing and logging are. It also touches on regulatory and security issues related to cloud environments, big data in the cloud, SIEM in the cloud, pros and cons, and an inventory of a few cloud logging providers.

    Chapter 22: Log Standard and Future Trends

    This chapter provides and expert opinion on the future of log standards and future developments in logging and log analysis.

    Chapter 1

    Logs, Trees, Forest: The Big Picture

    Information in this chapter:

     Log Data Basics

     A Look at Things to Come

     Logs Are Underrated

     Logs Can Be Useful

     People, Process, Technology

     Security Information and Event Management (SIEM)

     Case Studies

    Introduction

    This book is about how to get a handle on systems logs. More precisely, it is about how to get useful information out of your logs of all kinds. Logs, while often under-appreciated, are a very useful source of information for computer system resource management (printers, disk systems, battery backup systems, operating systems, etc.), user and application management (login and logout, application access, etc.), and security. It should be noted that sometimes the type of information can be categorized into more than one bucket. User login and logout messages are both relevant for both user management and security. A few examples are now presented to show how useful log data can be.

    Various disk storage products will log messages when hardware errors occur. Having access to this information can often times mean small problems are resolved before they become really big nightmares.

    As a second example, let’s briefly consider how user management and security logs can be used together to shed light on a user activity. When a user logs onto a Windows environment, this action is logged in some place as a logon record. We will call this a user management log data. Anytime this user accesses various parts of the network, a firewall is more than likely in use. This firewall also records network access in the form of whether or not it allowed network packets to flow from the source, a user’s workstation, to a particular part of the network. We will call this as security log data. Now, let’s say your company is developing some new product and you want to know who attempts to access your R&D server. Of course, you can use firewall access control lists (ACLs) to control this, but you want to take it a step further. The logon data for a user can be matched up with the firewall record showing that the user attempted to access the server. And if this occurred outside of normal business hours, you might have reason to speak with the employee to better understand their intent. While this example is a little bit out there, it does drive home an important point. If you have access to the right information, you are able to do some sophisticated things.

    But getting that information takes some time and some work. At first glance (and maybe the second one too) it can seem an overwhelming task—the sheer volume of data can alone be daunting. But we think we can help de-whelm you. We’ll present an overall strategy for handling your logs. We’ll show you some different log types and formats. The point of using different log types and formats is twofold. First, it will get you accustomed to looking at log messages and data so you become more familiar with them. But, second it will help you establish a mindset of understanding basic logging formats so you can more easily identify and deal with new or previously unseen log data in your environment. It’s a fact of life that different vendors will implement log messages in different formats, but at the end of the day it’s all about how you deal with and manage log data. The faster you can understand and integrate new log data into your overall logging system, the faster you will begin to gain value from it.

    The remainder of this chapter is geared toward providing a foundation for the concepts that will be presented throughout the rest of this book. The ideas around log data, people, process, and technology will be explored, with some real-world examples sprinkled in to ensure you see the real value in log data.

    Log Data Basics

    So far we have been making reference to logging and log data without providing a real concrete description of what these things are. Let’s define these now in no uncertain terms the basics around logging and log data.

    What Is Log Data?

    At the heart of log data are, simply, log messages, or logs. A log message is what a computer system, device, software, etc. generates in response to some sort of stimuli. What exactly the stimuli are greatly depends on the source of the log message. For example, Unix systems will have user login and logout messages, firewalls will have ACL accept and deny messages, disk storage systems will generate log messages when failures occur or, in some cases, when the system perceives an impending failure.

    Log data is the intrinsic meaning that a log message has. Or put another way, log data is the information pulled out of a log message to tell you why the log message generated. For example, a Web server will often log whenever someone accesses a resource (image, file, etc.) on a Web page. If the user accessing the page had to authenticate herself, the log message would contain the user’s name. This is an example of log data: you can use the username to determine who accessed a resource.

    The term logs is really used to indicate a collection of log messages that will be used collectively to paint a picture of some occurrence.

    Log messages can be classified into the following general categories:

     Informational: Messages of this type are designed to let users and administrators know that something benign has occurred. For example, Cisco IOS will generate messages when the system is rebooted. Care must be taken, however. If a reboot, for example, occurs out of normal maintenance or business hours, you might have reason to be alarmed. Subsequent chapters in this book will provide you with the skills and techniques to be able to detect when something like this occurs.

     Debug: Debug messages are generally generated from software systems in order to aid software developers troubleshoot and identify problems with running application code.

     Warning: Warning messages are concerned with situations where things may be missing or needed for a system, but the absence of which will not impact system operation. For example, if a program isn’t given the proper number of command line arguments, but yet it can run without them, is something the program might log just as a warning to the user or operator.

     Error: Error log messages are used to relay errors that occur at various levels in a computer system. For example, an operating system might generate an error log when it cannot synchronize buffers to disk. Unfortunately, many error messages only give you a starting point as to why they occurred. Further investigation is often required in order to get at the root cause of the error. Chapters 7, 8, 9, 10, 11, 12, 13, 15, and 16 in this book will provide you with ways to deal with this.

     Alert: An alert is meant to indicate that something interesting has happened. Alerts, in general, are the domain of security devices and security-related systems, but this is not a hard and fast rule. An Intrusion Prevention System (IPS) may sit in-line on a computer network, examining all inbound traffic. It will make a determination on whether or not a given network connection is allowed through based on the contents of the packet data. If the IPS encounters a connection that might be malicious it can take any number of pre-configured actions. The determination, along with the action taken, will be logged.

    We will now turn to a brief discussion of how log data is transmitted and collected. Then we will discuss what constitutes a log message.

    How is Log Data Transmitted and Collected?

    Log data transmission and collection is conceptually simple. A computer or device implements a logging subsystem whereby it can generate a message anytime it determines it needs to. The exact way the determination is made depends on the device. For example, you may have the option to configure the device or the device may be hard coded to generate a pre-set list of messages. On the flip side, you have to have a place where the log message is sent and collected. This place is generally referred to as a loghost. A loghost is a computer system, generally a Unix system or Windows server, where log messages are collected in a central location. The advantages to using a central log collector are as follows:

     It’s a centralized place to store log messages from multiple locations.

     It’s a place to store backup copies of your logs.

     It’s a place where analysis can be performed on you log data.

    While this is all well and good, how are log messages transmitted in the first place? The most common way is via the Syslog protocol. The Syslog protocol is a standard for log message interchange. It is commonly found on Unix systems, but it exists for Windows and other non-Unix based platforms. But basically there is a client and server component implemented over the User Datagram Protocol (UDP), although many open source and commercial Syslog implementations also support the Transmission Control Protocol (TCP) for guaranteed delivery. The client portion is the actual device or computer system that generates and sends log messages. The server side would typically be found on a log collection server. Its main job is to take receipt of Syslog-based log messages and store them to local disk storage where they can be analyzed, backed up and stored for long-term use.

    Syslog is not the only mechanism for log data transmission and collection. For example, Microsoft implements their own logging system for Windows. It is called the Windows Event Log. Things like user login and logoffs, application messages, and so on are stored in a proprietary storage format. There are open source and commercial applications that run on top of the Event Log which will convert event log entries to Syslog, where they are forwarded to a Syslog server. We will discuss the Windows Event Log in a little more detail in Chapters 3 and 16.

    The Simple Network Management Protocol (SNMP) is a standards based protocol for managing networked devices. The protocol is based on two concepts: traps and polling. A trap is merely a form of log message that a device or computer system emits whenever something has happened. A trap is sent to a management station, which is analogous to a loghost. A management station is used to manage SNMP-based systems. Polling is where the management station is able use SNMP to query a device for pre-defined variables such as interface statistics, bytes transferred in and out on an interface, etc. A key differentiator between SNMP and Syslog is that SNMP is supposed to be structured with respect to data format. But this not always found in practice. If you would like to learn more about SNMP, see Essential SNMP (Mauro & Schmidt, 2005).

    Databases have become a convenient way for applications to store log messages. Instead of generating a Syslog message, an application can write its log messages to a database schema. Or in some cases, the Syslog server itself can write directly a relational database. This has great advantages, especially around providing a structured way to store, analyze and report on log messages.

    Finally, there are proprietary logging formats. These are third-party devices and applications which implement their own proprietary mechanisms for generating and retrieving log messages. In this realm the vendor either provides you with an Application Programming Interface (API) in the form of C or Java libraries, or you are left to implement the protocol on your own. The Windows Event Log can be seen as a proprietary format, but it is often times viewed as an unofficial logging standard, similar to Syslog, because it is so prevalent.

    Some of the more common protocols we have discussed in this section:

     Syslog: UDP-based client/server protocol. This is the most common and prevalent mechanism for logging.

     SNMP: SNMP was originally created for use in managing networked devices. However, over the years, many non-networked systems have adopted SNMP as a way to emit log message and other status type data.

     Windows Event Log: Microsoft’s proprietary logging format.

     Database: Structured way to store and retrieve log messages.

     Common Proprietary Protocols:

     LEA: The Log Extraction API (LEA) is Checkpoint’s API for gathering logs from its line of firewall and security products.

     SDEE: The Security Device Event Exchange (SDEE) is Cisco’s eXtensible Markup Language (XML)-based protocol for gathering log messages from its line of IPS

    Enjoying the preview?
    Page 1 of 1