4.3 Where Data is Stored

Weight: 3

Description: Where various types of information are stored on a Linux system.

Key Knowledge Areas:

  • Programs and configuration

  • Processes

  • Memory addresses

  • System messaging

  • Logging

The following is a partial list of the used files, terms and utilities:

  • ps, top, free

  • syslog, dmesg

  • /etc/, /var/log/

  • /boot/, /proc/, /dev/, /sys/

How a computer uses memory

This image shows a simple hierarchical view of the main components of computer architecture. At the top are the input devices, such as the mouse, keyboard, scanner, camera, microphone, and video input, whose job is to transfer data and commands from the user to the system. After entering the system, this data is usually sent to lower layers for processing.

In the middle section, storage devices are shown, including read-only memory (ROM), removable storage, hard drives, and network storage. These types of memory provide permanent storage, meaning the data remains even after the system is turned off. Below this part is RAM, which acts as temporary storage and includes both physical RAM and virtual memory. This is where programs keep their data while running, and it offers much faster access compared to permanent storage.

At the bottom are the cache and CPU. The cache is the fastest type of memory and stores the data the CPU needs most often, while the CPU is the central unit that processes and executes instructions. The vertical arrow on the left represents the data access speed hierarchy: the closer we are to the CPU, the faster (but smaller) the memory; the further we move toward permanent storage, the larger but slower it becomes. This structure shows how different components work together to transfer and process data in a computer system.

free

The `free` command gives an overview of system memory usage. It reports the total, used, and free amounts of RAM and swap, including memory used for buffers and cache, which makes it useful for real-time monitoring. Administrators and users rely on it to assess system performance, allocate resources effectively, and spot memory-related issues early.

Syntax:

The basic syntax of the `free` command is `free [options]`.

Basic Usage of 'free' Command

Without any option, `free` shows the used and free space of physical memory and swap in kibibytes (KiB).

When no option is used, `free` produces columnar output in which each column has the following meaning:

  1. total displays the total installed memory (MemTotal and SwapTotal in /proc/meminfo).

  2. used displays the used memory.

  3. free displays the unused memory.

  4. shared displays the memory used mostly by tmpfs (Shmem in /proc/meminfo; displays zero if not available).

  5. buffers displays the memory used by kernel buffers.

  6. cached displays the memory used by the page cache and slabs (Cached and Slab in /proc/meminfo).

  7. buff/cache displays the sum of buffers and cache.
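Since several of these columns are described in terms of /proc/meminfo fields, it can help to look at that file directly. The following is a minimal sketch, not how `free` itself is implemented: the function names `parse_meminfo` and `summarize` are hypothetical, and the sample values are illustrative rather than taken from a real system.

```python
def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of field name -> integer.

    Values are returned as they appear in the file (kB for most fields).
    """
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip anything that is not a "Name: value" line
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def summarize(fields):
    """Pick out the fields that the columns above are described in terms of."""
    return {
        "total": fields.get("MemTotal", 0),
        "free": fields.get("MemFree", 0),
        "shared": fields.get("Shmem", 0),
        "buffers": fields.get("Buffers", 0),
        "swap_total": fields.get("SwapTotal", 0),
    }


# Illustrative sample in the /proc/meminfo format (values are made up).
SAMPLE = """\
MemTotal:       16000000 kB
MemFree:         4000000 kB
Buffers:          500000 kB
Shmem:            100000 kB
SwapTotal:       2000000 kB
"""

print(summarize(parse_meminfo(SAMPLE)))
```

On a Linux system, the same functions can be applied to the real file by passing the text read from /proc/meminfo instead of `SAMPLE`.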

Common Options of 'free' Command

| Option | Description |
| --- | --- |
| `-k, --kilo` | Displays memory usage in kilobytes (default). |
| `-m, --mega` | Displays memory usage in megabytes. |
| `-g, --giga` | Displays memory usage in gigabytes. |
| `--tera` | Displays memory usage in terabytes. |
| `-h, --human` | Automatically scales all output columns to the shortest three-digit unit and displays the units (B, K, M, G, T). |
| `-c, --count` | Displays the output the given number of times; requires the `-s` option. |
| `-l, --lohi` | Shows detailed low and high memory statistics. |
| `-o, --old` | Disables the display of the buffer-adjusted line. |
| `-s, --seconds` | Displays the output continuously, waiting the given number of seconds between updates. Fractional delays are accepted. |
| `-t, --total` | Adds an additional line in the output showing column totals. |
| `--help` | Displays a help message and exits. |
| `-V, --version` | Displays version information and exits. |

Note: option names differ between versions. Recent procps-ng releases name the long forms of `-k`, `-m`, and `-g` as `--kibi`, `--mebi`, and `--gibi`, and no longer provide `-o`; check `free --help` on your system.

List Running Processes in Linux

Several commands list the running processes in Linux, including ps, top, htop, and atop. These commands can also be combined with other tools to filter the list.

ps

The ps command in Linux is used to display information about the currently running processes on the system.

  • ps stands for process status.

  • It shows details like PID, user, CPU, memory usage, and the command that started the process.

  • By default, it displays processes running in the current shell.

  • Use options to view more detailed or system-wide process information.

  • Common formats include standard (ps), user-based (ps -u), and full system (ps -ef or ps aux).

  • Often combined with grep to find specific processes.

  • Useful for monitoring and troubleshooting running applications and services.

syntax:

The ps command provides a snapshot of the current processes on your system. The basic syntax is `ps [options]`.

Without any options, `ps` displays information about the processes associated with the current terminal session. However, to harness the full potential of the `ps` command, various options can be used to customize the output.

The result contains four columns of information:

  • PID - the unique process ID

  • TTY - the terminal the process is attached to

  • TIME - the cumulative CPU time the process has used, shown as hours:minutes:seconds

  • CMD - the name of the command that launched the process
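To make the four-column layout concrete, here is a small sketch that turns default `ps` output into a list of dictionaries keyed by the column names. The function name `parse_ps` is hypothetical, and the sample text is illustrative output with made-up PIDs rather than a capture from a real session.

```python
def parse_ps(output):
    """Turn default `ps` output (PID TTY TIME CMD) into a list of dicts."""
    lines = output.strip().splitlines()
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # CMD may contain spaces, so split at most len(header) - 1 times.
        values = line.strip().split(None, len(header) - 1)
        rows.append(dict(zip(header, values)))
    return rows


# Illustrative sample of what `ps` prints without options (values made up).
SAMPLE = """\
    PID TTY          TIME CMD
   2143 pts/0    00:00:00 bash
   2310 pts/0    00:00:00 ps
"""

for row in parse_ps(SAMPLE):
    print(row["PID"], row["TTY"], row["TIME"], row["CMD"])
```

Real output could be fed to the same function, for example by capturing it with `subprocess.run(["ps"], capture_output=True, text=True).stdout`.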

ps command options

Some commonly used options:

| Option | Description |
| --- | --- |
| `a` | Lists the processes of all users that are attached to a terminal; combined with `x`, lists all processes. |
| `-A, -e` | Lists all processes on the entire system, offering a complete overview of running tasks and programs. |
| `-a` | Lists all processes except session leaders (processes whose process ID equals the session ID) and processes not associated with a terminal. |
| `-d` | Lists all processes except session leaders. |
| `--deselect, -N` | Negates the selection: lists all processes except those that meet the specified conditions. |
| `f` | Displays the process hierarchy as an ASCII art tree, illustrating parent-child relationships. |
| `-j` | Presents the output in the jobs format, with information such as process ID, session ID, and command. |
| `T` | Lists all processes associated with the current terminal. |
| `r` | Lists only running processes, useful for monitoring system performance. |
| `u` | Uses the user-oriented format, which adds information such as CPU and memory usage. |
| `-u` | Takes a username and lists the processes associated with that user. |
| `x` | Includes processes without a TTY, showing background processes not tied to a specific terminal session. |

top

The `top` (table of processes) command is a dynamic and interactive tool that provides real-time information about system processes. It offers a comprehensive view of running processes, system resource utilization, and other critical system metrics. This section explains how to use the top command to monitor and manage processes.

Launching top

To start it, run `top` in a terminal. Press `q` to quit.

How to Interpret top Command Output

The top command output is divided into several sections, each with specific information about system performance and processes. This section provides a breakdown of the output based on the information it shows.

Uptime

When you first open the top command, the initial line, often referred to as the header or summary line, displays information similar to what you see when you use the uptime command. It shows:

  1. System Time: The current time on the system.

  2. Uptime: How long the system has been running since the last boot.

  3. Users: The number of users currently logged into the system.

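  4. Load Average: This is shown as three numbers separated by commas. They represent the average number of processes that were running or waiting to run (on Linux, also those in uninterruptible sleep, typically waiting for I/O) over the last 1, 5, and 15 minutes, respectively. On a single-CPU system, a value of 1.0 means the CPU is fully utilized; on a multi-core system, compare the value with the number of CPUs, and treat values above that number as a sign of potential overloading.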
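Because a load average only becomes meaningful relative to the number of CPUs, a quick way to interpret it is to divide by the CPU count. The sketch below assumes the load figures come from /proc/loadavg, whose first three fields are the 1-, 5-, and 15-minute averages; the function names are hypothetical and the sample line is illustrative.

```python
import os


def parse_loadavg(text):
    """Return the 1-, 5-, and 15-minute load averages from /proc/loadavg text."""
    one, five, fifteen = (float(x) for x in text.split()[:3])
    return one, five, fifteen


def load_per_cpu(load, cpu_count):
    """Normalize a load average by the number of CPUs."""
    return load / cpu_count


# Illustrative sample in the /proc/loadavg format (values are made up).
SAMPLE = "2.00 1.00 0.50 1/300 12345"

one, five, fifteen = parse_loadavg(SAMPLE)
cpus = os.cpu_count() or 1  # fall back to 1 if the count is unknown
print(f"1-minute load per CPU: {load_per_cpu(one, cpus):.2f}")
```

A per-CPU value that stays above 1.0 suggests that more work is queued than the CPUs can handle.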

Tasks

  • total: Indicates the total count of processes currently being tracked by the system.

  • running: Represents the number of processes currently actively using CPU time.

  • sleeping: Refers to processes that are idle and waiting for an event, such as I/O completion or a signal.

  • stopped: Denotes processes that have been manually stopped, typically through a signal.

  • zombie: Indicates processes that have completed execution but still have an entry in the process table.

%Cpu(s):

The %Cpu(s) line in the top command provides information about CPU usage and statistics on a Linux system. It typically includes:

  • us: Percentage of CPU time spent running user processes.

  • sy: Percentage of CPU time spent running kernel (system) processes.

  • ni: Percentage of CPU time spent running processes with a nice value (priority adjusted).

  • id: Percentage of CPU time spent idle (no work being done).

  • wa: Percentage of CPU time spent waiting for I/O operations to complete.

  • hi: Percentage of CPU time spent servicing hardware interrupts.

  • si: Percentage of CPU time spent servicing software interrupts.

  • st: Percentage of CPU time stolen from this virtual machine by the hypervisor (if virtualized).
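These percentages can be derived from the aggregate `cpu` line of /proc/stat, which holds cumulative tick counters in the order user, nice, system, idle, iowait, irq, softirq, steal. The sketch below compares two samples of that line and is only an approximation of what top reports; the function names and sample values are hypothetical.

```python
# Labels in the order the counters appear on the "cpu" line of /proc/stat.
FIELDS = ("us", "ni", "sy", "id", "wa", "hi", "si", "st")


def parse_cpu_line(line):
    """Parse the aggregate "cpu" line of /proc/stat into a dict of tick counts."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line of /proc/stat")
    values = [int(p) for p in parts[1:9]]
    values += [0] * (len(FIELDS) - len(values))  # pad if fields are missing
    return dict(zip(FIELDS, values))


def cpu_percentages(before, after):
    """Percentage of time spent in each state between two samples."""
    deltas = {k: after[k] - before[k] for k in FIELDS}
    total = sum(deltas.values())
    if total == 0:
        return {k: 0.0 for k in FIELDS}
    return {k: 100.0 * d / total for k, d in deltas.items()}


# Two illustrative samples, as if taken a moment apart (values are made up).
BEFORE = "cpu  100 0 50 800 50 0 0 0 0 0"
AFTER = "cpu  200 0 100 1600 100 0 0 0 0 0"

usage = cpu_percentages(parse_cpu_line(BEFORE), parse_cpu_line(AFTER))
print(", ".join(f"{value:.1f} {name}" for name, value in usage.items()))
```

The counters are cumulative since boot, which is why two samples are needed: only the difference between them describes current activity.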

MiB Mem:

The "MiB Mem" line in the top command provides information about physical memory usage on a Linux system. It typically includes:

  • total: Total amount of physical memory (RAM) available in MiB.

  • used: Amount of RAM currently in use by processes and the kernel.

  • free: Amount of RAM not being used at all.

  • buff/cache: Amount of memory used for buffering data and caching filesystems.

MiB Swap:

The "MiB Swap" line in the top command provides information about swap usage and statistics on a Linux system. It typically includes:

  • total: Total amount of swap space available in MiB.

  • used: Amount of swap space currently in use.

  • free: Amount of swap space that is not being used.

  • avail Mem: Estimate of how much memory is available for starting new applications without swapping. top prints it on the swap line, but it refers to physical memory, not swap.


Linux Directory Structure

In Linux, everything is treated as a file, whether it is a regular file, a directory, or a device such as a printer or keyboard. All directories and files are stored under one root directory, which is represented by a forward slash (/).

  • The Linux directory layout follows the Filesystem Hierarchy Standard (FHS).

  • This standard defines how directories are organized and what types of files should be stored in each.

  • Since Linux is based on UNIX, it inherits many of UNIX’s filesystem conventions.

  • Similar directory structures are also found in other UNIX-like operating systems such as BSD and macOS.

In Windows, files are stored in folders on separate drives such as C:, D:, and E:. In Linux/Unix, by contrast, files are stored in a single tree-like structure: the file system hierarchy begins at the root directory, and every file and directory sits somewhere below it.

Top-level directories associated with the root directory

These top-level directories under the root (/) form the foundation of the Linux file system, each serving a specific role in organizing system files, user data, and configurations.

Linux File System Structure

The architecture of a file system comprises the three layers described below.

1. Logical File System:

The Logical File System acts as the interface between the user applications and the file system itself. It facilitates essential operations such as opening, reading, and closing files. Essentially, it serves as the user-friendly front-end, ensuring that applications can interact with the file system in a way that aligns with user expectations.

2. Virtual File System:

The Virtual File System (VFS) is a crucial layer that enables the concurrent operation of multiple instances of physical file systems. It provides a standardized interface, allowing different file systems to coexist and operate simultaneously. This layer abstracts the underlying complexities, ensuring compatibility and cohesion between various file system implementations.

3. Physical File System:

The Physical File System is responsible for managing and storing the physical blocks on the disk. It handles the low-level details of storing and retrieving data, interacting directly with the hardware components. This layer ensures the efficient allocation and utilization of physical storage resources, contributing to the overall performance and reliability of the file system.

Together, these layers form a cohesive architecture, orchestrating the organized and efficient handling of data in the Linux operating system.

Virtual File Systems

/dev

The /dev/ directory consists of files that represent devices attached to the local system. However, these are not regular files; they are called device files or special files.

Device files are abstractions of standard devices that applications interact with via I/O system calls. The device files that correspond to hardware devices fall into two main categories: character special files and block special files.

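The Difference Between Character Special Files and Block Special Files

Character special files transfer data as a stream of bytes, one character at a time, without buffering; terminals and /dev/null are examples. Block special files transfer data in fixed-size blocks and support buffered, random access; disks and partitions are examples.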
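One way to tell the two kinds of device file apart programmatically is to inspect the file type bits of a file's mode. The sketch below uses Python's standard `stat` module; the function name `device_kind` is hypothetical, and the example builds mode values by hand so that it does not depend on any particular device being present.

```python
import stat


def device_kind(mode):
    """Classify a file by the type bits of its st_mode value."""
    if stat.S_ISCHR(mode):
        return "character"
    if stat.S_ISBLK(mode):
        return "block"
    return "other"


# Hand-built mode values: file type bits combined with permission bits.
print(device_kind(stat.S_IFCHR | 0o666))  # a character device such as /dev/null
print(device_kind(stat.S_IFBLK | 0o660))  # a block device such as a disk
print(device_kind(stat.S_IFREG | 0o644))  # a regular file
```

For a real path, the mode can be obtained with `os.stat(path).st_mode`. The same distinction is visible in `ls -l /dev`, where the first character of each line is `c` for character devices and `b` for block devices.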

/proc

The proc file system (procfs) is a virtual file system that is created on the fly when the system boots and dissolved at shutdown. It contains useful information about the processes that are currently running and is regarded as a control and information center for the kernel. It also provides a communication medium between kernel space and user space.

To list all the files and directories under the `/proc` directory, run `ls -l /proc`.

This command lists all the files and directories under /proc with detailed information such as permissions, ownership, size, and modification time. This information is useful for understanding the current state of the system and for diagnosing problems related to running processes.
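In such a listing, each running process appears as a directory whose name is its process ID, alongside files like cpuinfo and meminfo. The sketch below separates the two kinds of entry; the function name `pids_from_entries` is hypothetical and the sample entry names are illustrative.

```python
def pids_from_entries(names):
    """Return the sorted process IDs among /proc directory entry names."""
    return sorted(int(name) for name in names if name.isdigit())


# Illustrative sample of entry names found under /proc (PIDs are made up).
SAMPLE = ["1", "cpuinfo", "meminfo", "2143", "self", "42"]

print(pids_from_entries(SAMPLE))
```

On a Linux system, passing `os.listdir("/proc")` instead of `SAMPLE` yields the IDs of the processes running at that moment.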

/sys

/sys is another virtual filesystem, like /proc. It is backed by sysfs and exposes information about the devices connected to your computer, along with their drivers and the related kernel subsystems.

Key Differences Between /proc and /sys

| Feature | /proc (procfs) | /sys (sysfs) |
| --- | --- | --- |
| Purpose | Process and system runtime information | Kernel and hardware interaction |
| Type | Virtual filesystem (procfs) | Virtual filesystem (sysfs) |
| Content | Process details, kernel parameters, system stats | Hardware devices, kernel subsystems, driver configurations |
| Read/Write | Mostly read-only (except /proc/sys/) | Allows modifying hardware and kernel settings |
| Example File | /proc/cpuinfo (CPU details) | /sys/class/net/eth0/address (MAC address of eth0) |
| Main Use | Monitoring and debugging system state | Configuring kernel and hardware parameters |

Physical File Systems

Top-level directories

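| Directory | Description |
| --- | --- |
| /etc | System configuration files. |
| /home | Users' home directories. A user's home directory is the default working directory after login. |
| /opt | Optional or third-party software. |
| /tmp | Temporary space, typically cleared on reboot. |
| /usr | User-related programs, libraries, and shared data. |
| /var | Variable data such as log files (/var/log). |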

Other directories in the Linux system:

| Directory | Description |
| --- | --- |
| /bin | Essential binary (executable) programs needed by the system. |
| /usr/bin | Most user programs. |
| /sbin, /usr/sbin | System administration and configuration tools. |
| /usr/share/bin | Programs for other apps, like Nginx, Squid, ... |


Linux Logging Basics

Operating system logs provide a wealth of diagnostic information about your computers, and Linux is no exception. Everything from kernel events to user actions is logged by Linux, allowing you to see almost any action performed on your servers. This section explains what Linux logs are, where they are located, and how to interpret them.

/var/log

Linux has a special directory for storing logs called /var/log. This directory contains logs from the OS itself, services, and various applications running on the system.

Some of the most important Linux system logs include:

  • /var/log/syslog and /var/log/messages store all global system activity data, including startup messages. Debian-based systems like Ubuntu store this in /var/log/syslog, while Red Hat-based systems like RHEL or CentOS use /var/log/messages.

  • /var/log/auth.log and /var/log/secure store all security-related events such as logins, root user actions, and output from pluggable authentication modules (PAM). Ubuntu and Debian use /var/log/auth.log, while Red Hat and CentOS use /var/log/secure.

  • /var/log/kern.log stores kernel events, errors, and warning logs, which are particularly helpful for troubleshooting custom kernels.

  • /var/log/cron stores information about scheduled tasks (cron jobs). Use this data to verify your cron jobs are running successfully.
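Because the file names differ between distribution families, a script that reads these logs has to check which path is present. The following is a minimal sketch under that assumption; the `CANDIDATES` table only covers the paths named in the list above, and the names `CANDIDATES` and `first_existing` are hypothetical.

```python
import os

# Candidate paths per log type: Debian-style name first, Red Hat-style second.
CANDIDATES = {
    "system": ["/var/log/syslog", "/var/log/messages"],
    "auth": ["/var/log/auth.log", "/var/log/secure"],
}


def first_existing(paths, exists=os.path.exists):
    """Return the first path for which `exists` is true, or None."""
    for path in paths:
        if exists(path):
            return path
    return None


for kind, paths in CANDIDATES.items():
    print(kind, "->", first_existing(paths))
```

On systems that log only to the systemd journal, neither file may exist, in which case the function returns `None`.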

Some applications also write log files to this directory. For example, the Apache web server writes logs to the /var/log/apache2 directory (on Debian), while MySQL writes logs to the /var/log/mysql directory. Some applications also log via Syslog, which we’ll explain in the next section.

syslog

Syslog is a standard for creating and transmitting logs. The word “syslog” can refer to any of the following:

  1. The syslog service receives and processes syslog messages and listens for events by creating a socket located at /dev/log, which applications can write to. It can write messages to a local file or forward messages to a remote server. There are different syslog implementations, including rsyslogd and syslog-ng.

  2. The Syslog protocol (RFC 5424) is a transport protocol specifying how to transmit logs over a network. It’s also a data format defining how messages are structured. By default, it uses port 514 for plaintext messages and port 6514 for encrypted messages.

  3. A syslog message is any log formatted in the syslog message format and consists of a standardized header and message containing the log’s contents.

Since Syslog can forward messages to remote servers, it’s often used to forward system logs to log management solutions.
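As an illustration of the standardized header, every syslog message begins with a priority value (PRI) written in angle brackets. According to RFC 5424, it is calculated as the facility code multiplied by 8 plus the severity code. That detail comes from the RFC rather than from the text above, and the function names in this sketch are hypothetical.

```python
# Severity names for codes 0 (most severe) to 7 (least severe).
SEVERITIES = ("emerg", "alert", "crit", "err", "warning", "notice", "info", "debug")


def encode_pri(facility, severity):
    """Combine a facility code (0-23) and a severity code (0-7) into a PRI value."""
    if not 0 <= facility <= 23:
        raise ValueError("facility must be between 0 and 23")
    if not 0 <= severity <= 7:
        raise ValueError("severity must be between 0 and 7")
    return facility * 8 + severity


def decode_pri(pri):
    """Split a PRI value back into (facility, severity)."""
    return divmod(pri, 8)


# Facility 4 (security/authorization) with severity 2 (critical).
print(f"<{encode_pri(4, 2)}>")

facility, severity = decode_pri(165)
print(facility, SEVERITIES[severity])
```

A message that starts with `<34>` therefore carries facility 4 and severity 2, which a syslog daemon can use to decide which file or remote server receives it.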

Differences Between Syslog Daemons

Depending on the distribution or system you use, you may get a different open source syslog daemon by default: some ship the standard syslogd, others rsyslog, and others syslog-ng. All of them do what you expect: they receive syslog messages and manage logging on the server.

| Name | Description | Config file |
| --- | --- | --- |
| syslogd | The first one, originally created in the 80's to handle the syslog protocol. It is still the default on OpenBSD. | /etc/syslog.conf |
| syslog-ng | Created in the late 90's as a robust replacement for syslogd. Added support for TCP, encryption, and many other features. Syslog-ng was the standard and included on Suse, Debian, and Fedora for many years. | /etc/syslog-ng/syslog-ng.conf |
| rsyslog | Created in 2004 as a competitor to syslog-ng, it became the default syslog daemon on Ubuntu, RHEL, and many other distributions. If you have a common and updated Linux distribution, you are likely using rsyslog. | /etc/rsyslog.conf |

Logging with systemd

Many Linux distributions ship with systemd, a process and service manager. Systemd implements its own logging service called journald, which can replace or complement Syslog. Journald logs in a significantly different manner than Syslog, which is why it has its own section in LPIC courses.

dmesg

The dmesg command, also called "driver message" or "display message", is used to examine the kernel ring buffer and print the kernel's message buffer. Its output contains the messages produced by the kernel and its device drivers.

When the computer boots, many log messages are generated during system start-up. You can read all of these messages with the dmesg command. On some distributions, the boot-time contents of the kernel ring buffer are also stored in the '/var/log/dmesg' file.

The dmesg command is useful when the system encounters a problem during start-up: because the boot sequence has many steps, reading the dmesg output helps you find out where the problem occurred.

Syntax: `dmesg [options]`

Common Options for the dmesg Command

| Option | Description |
| --- | --- |
| `-C, --clear` | Clears the kernel ring buffer. |
| `-c, --read-clear` | Prints the contents of the buffer and then clears it. |
| `-D, --console-off` | Disables printing of kernel messages to the console. |
| `-E, --console-on` | Enables printing of kernel messages to the console. |
| `-F, --file <file>` | Reads kernel messages from the specified file. |
| `-h, --help` | Displays help text for dmesg and its options. |
| `-k, --kernel` | Prints only kernel messages. |
| `-t, --notime` | Suppresses timestamps in the output. |
| `-u, --userspace` | Prints userspace messages. |
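In its default format, each dmesg line starts with a timestamp in square brackets giving the seconds since boot, followed by the message; the `-t` option removes that prefix. The sketch below splits such a line into its two parts. The function name `parse_dmesg_line` is hypothetical, and the sample lines are illustrative rather than real kernel output.

```python
import re

# Matches "[    0.000000] message" as printed by dmesg in its default format.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s?(.*)$")


def parse_dmesg_line(line):
    """Split a dmesg line into (seconds_since_boot, message).

    Lines without a timestamp (for example from `dmesg -t`) are returned
    with None in place of the timestamp.
    """
    match = LINE_RE.match(line)
    if match is None:
        return None, line
    return float(match.group(1)), match.group(2)


# Illustrative sample lines (not taken from a real system).
SAMPLE = [
    "[    0.000000] Linux version (illustrative)",
    "[   12.345678] usb 1-1: new high-speed USB device",
]

for line in SAMPLE:
    seconds, message = parse_dmesg_line(line)
    print(seconds, "|", message)
```

Splitting off the timestamp makes it easier to filter messages by time, for instance to look only at what happened in the first seconds of the boot sequence.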



Sources:

  • https://www.geeksforgeeks.org/linux-unix/free-command-linux-examples/

  • https://www.geeksforgeeks.org/linux-unix/ps-command-in-linux-with-examples/

  • https://www.geeksforgeeks.org/linux-unix/top-command-in-linux-with-examples/

  • https://www.geeksforgeeks.org/linux-unix/linux-directory-structure/

  • https://opensource.com/article/19/3/virtual-filesystems-linux

  • https://www.baeldung.com/linux/dev-directory

  • https://www.geeksforgeeks.org/linux-unix/proc-file-system-linux/

  • https://www.geeksforgeeks.org/linux-unix/linux-file-system/

  • https://itprohelper.com/differences-between-proc-and-sys-in-linux/

  • https://www.loggly.com/ultimate-guide/linux-logging-basics/

  • https://www.geeksforgeeks.org/linux-unix/dmesg-command-linux-driver-messages/
