3.2 Searching and Extracting Data from Files

Weight: 3

Description: Search and extract data from files in the home directory.

Key Knowledge Areas:

Command line pipes
I/O redirection
Basic Regular Expressions using ., [ ], *, and ?

The following is a partial list of the used files, terms and utilities:

grep
less
cat, head, tail
sort
cut
wc

cat

The cat command is one of the most frequently used commands on Linux and other Unix-like operating systems. Its name is short for “concatenate”, and it is primarily used to read, display, and concatenate text files.

Primarily used to read and display the contents of files on the terminal.
Can concatenate multiple files and display them as a single continuous output.
Allows users to create new files or append data to existing ones.
Useful for quick file inspection and merging without opening a text editor.

View the Content of a Single File using cat

The most basic use of 'cat' is to display the contents of a file on the terminal. This can be achieved by simply providing the filename as an argument:

Syntax:

cat file_name

Example: If our file_name = output.txt

cat output.txt
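The list above also mentions concatenating files and creating or appending to files. A minimal sketch of both, assuming a throwaway directory and the hypothetical file names part1.txt, part2.txt and combined.txt (the > and >> operators it relies on are described in the I/O Redirection section):

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Create two small files (hypothetical names, for illustration only).
printf 'first line\n' > part1.txt
printf 'second line\n' > part2.txt

# Display both files as one continuous stream.
cat part1.txt part2.txt

# Concatenate them into a new file, then append the first file once more.
cat part1.txt part2.txt > combined.txt
cat part1.txt >> combined.txt

combined=$(cat combined.txt)
printf '%s\n' "$combined"
```

The final output lists "first line", "second line" and "first line" again, because the second cat call appended to combined.txt instead of replacing it.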

streams

A stream is nothing more than a sequence of bytes that is passed from one file, device, or program to another.

In Linux, a stream is a fundamental concept for handling input, output, and communication between processes. At its core, a stream represents a sequence of bytes that can be read from or written to. Streams provide a uniform interface for data transfer and processing across various input/output operations.

Every command starts with three standard streams:

standard input stream (stdin), which provides input to commands.
standard output stream (stdout), which displays output from commands.
standard error stream (stderr), which displays error output from commands.

The streams are also numbered: stdin (0), stdout (1), stderr (2).
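To see that stdout and stderr really are separate channels, the sketch below sends each one to its own file using the stream numbers. The file names exists.txt, missing.txt, out.log and err.log are hypothetical, and the exact wording of the error message depends on the ls implementation.

```shell
workdir=$(mktemp -d)
touch "$workdir/exists.txt"

# ls reports the existing file on stdout (1) and complains about the
# missing one on stderr (2). It exits non-zero because of the missing
# file, so "|| true" keeps a script running under "set -e".
ls "$workdir/exists.txt" "$workdir/missing.txt" \
  1> "$workdir/out.log" 2> "$workdir/err.log" || true

echo "stdout captured:"
cat "$workdir/out.log"
echo "stderr captured:"
cat "$workdir/err.log"
```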

I/O Redirection

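The shell provides redirection operators for each stream. They can be used to write standard output or standard error to a file, or to read standard input from a file. If you write to a file that does not exist, a new file with that name is created before writing.

Operators with a single bracket overwrite the destination’s existing contents. The input operator < only reads from the named file and never changes it.

Overwrite

> - standard output
< - standard input
2> - standard error

Operators with a double bracket append to the destination instead of overwriting it. The exception is <<, which does not append anything: it starts a here-document, which feeds the lines that follow (up to a chosen delimiter word) to the command's standard input.

Append

>> - standard output
<< - standard input (here-document)
2>> - standard error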

Examples:

[payam@earth Working]$ echo "hello"
hello
[payam@earth Working]$ echo "hello" > output.txt
[payam@earth Working]$ cat output.txt 
hello
[payam@earth Working]$ echo "how are you?" > output.txt 
[payam@earth Working]$ cat output.txt 
how are you?
[payam@earth Working]$ echo "I'm fine, thank you" >> output.txt 
[payam@earth Working]$ cat output.txt 
how are you?
I'm fine, thank you
[payam@earth Working]$

[payam@earth Working]$ cat Blahblah.txt
cat: Blahblah.txt: No such file or directory
[payam@earth Working]$ cat Blahblah.txt > result.txt
cat: Blahblah.txt: No such file or directory
[payam@earth Working]$ cat result.txt 
[payam@earth Working]$ cat Blahblah.txt > result.txt 2>error.txt
[payam@earth Working]$ cat error.txt 
cat: Blahblah.txt: No such file or directory
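The sessions above cover >, >> and 2>, but not < or 2>>. A short sketch of those two, assuming the hypothetical file names numbers.txt, missing-a.txt, missing-b.txt and errors.log:

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf 'one\ntwo\nthree\n' > numbers.txt

# "<" feeds the file to wc on standard input, so wc prints no file name.
# tr -d ' ' removes the padding that some wc implementations add.
line_count=$(wc -l < numbers.txt | tr -d ' ')
echo "lines: $line_count"

# "2>>" appends each error message instead of replacing earlier ones.
# cat fails on purpose here; "|| true" ignores its exit status.
cat missing-a.txt 2>> errors.log || true
cat missing-b.txt 2>> errors.log || true

error_lines=$(wc -l < errors.log | tr -d ' ')
echo "logged errors: $error_lines"
```

After both failed cat calls, errors.log holds two lines, one per missing file.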

piping with |

A pipe is a form of redirection (transfer of standard output to some other destination) that is used in Linux and other Unix-like operating systems to send the output of one command, program, or process to another for further processing. Unix/Linux systems allow the stdout of a command to be connected to the stdin of another command by using the pipe character '|' (found above the backslash \ key on most keyboards).

The pipe is used to combine two or more commands: the output of one command acts as input to the next command, whose output may in turn act as input to the one after it, and so on. It can be visualized as a temporary connection between two or more commands, programs, or processes. The command line programs that do the further processing are referred to as filters.

This direct connection allows the commands to operate simultaneously and permits data to be transferred between them continuously rather than having to pass it through temporary text files or through the display screen. Pipes are unidirectional, i.e., data flows from left to right through the pipeline.

command1 | command2

Either command can have options or arguments. We can also use | to redirect the output of the second command in the pipeline to a third command, and so on.

command1 | command2 | command3 | command4 | ...
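As a small illustration of chaining, the sketch below builds two pipelines from made-up input lines. Each stage reads the previous stage's stdout on its own stdin.

```shell
# printf produces four lines, grep keeps those containing "error",
# wc -l counts them, and tr strips any padding around the number.
error_count=$(printf 'ok\nerror: disk\nok\nerror: net\n' | grep "error" | wc -l | tr -d ' ')
echo "error lines: $error_count"

# sort orders the lines and head keeps only the first one.
first=$(printf 'pear\napple\nfig\n' | sort | head -n 1)
echo "first word alphabetically: $first"
```

No temporary files are involved: the data flows directly from one command to the next.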

View Kernel Messages in Linux

The dmesg command, also called “driver message” or “display message”, is used to examine the kernel ring buffer and print the kernel's message buffer. Its output includes the messages produced by the device drivers.

[payam@earth Working]$ dmesg 
[    0.000000] Linux version 5.14.0-611.9.1.el9_7.x86_64 (mockbuild@iad1-prod-build001.bld.equ.rockylinux.org) (gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-11), GNU ld version 2.35.2-67.el9) #1 SMP PREEMPT_DYNAMIC Tue Nov 25 17:53:21 UTC 2025
[    0.000000] The list of certified hardware and cloud instances for Enterprise Linux 9 can be viewed at the Red Hat Ecosystem Catalog, https://catalog.redhat.com.
[    0.000000] Command line: BOOT_IMAGE=(hd0,gpt1)/vmlinuz-5.14.0-611.9.1.el9_7.x86_64 root=/dev/mapper/vg--os-lv--root ro resume=/dev/mapper/vg--os-lv--swap rd.lvm.lv=vg-os/lv-root rd.lvm.lv=vg-os/lv-swap rhgb quiet crashkernel=1G-2G:192M,2G-64G:256M,64G-:512M
[    0.000000] x86/split lock detection: #AC: crashing the kernel on kernel split_locks and warning on user-space split_locks
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009efff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009f000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000039f98fff] usable
[    0.000000] BIOS-e820: [mem 0x0000000039f99000-0x000000003a898fff] reserved
[    0.000000] BIOS-e820: [mem 0x000000003a899000-0x00000000434aefff] usable
[    0.000000] BIOS-e820: [mem 0x00000000434af000-0x00000000452fefff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000452ff000-0x0000000045b2efff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x0000000045b2f000-0x0000000045bfefff] ACPI data
[    0.000000] BIOS-e820: [mem 0x0000000045bff000-0x0000000045bfffff] usable
[    0.000000] BIOS-e820: [mem 0x0000000045c00000-0x0000000049ffffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000004a200000-0x000000004a3fffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000004b000000-0x00000000503fffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fe010000-0x00000000fe010fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed20000-0x00000000fed7ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x00000004afbfffff] usable

Now let's redirect the dmesg output to the input of the less command:

dmesg | less

[payam@earth Working]$ dmesg | less
[    0.000000] Linux version 5.14.0-611.9.1.el9_7.x86_64 (mockbuild@iad1-prod-build001.bld.equ.rockylinux.org) (gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-11), GNU ld version 2.35.2-67.el9) #1 SMP PREEMPT_DYNAMIC Tue Nov 25 17:53:21 UTC 2025
[    0.000000] The list of certified hardware and cloud instances for Enterprise Linux 9 can be viewed at the Red Hat Ecosystem Catalog, https://catalog.redhat.com.
[    0.000000] Command line: BOOT_IMAGE=(hd0,gpt1)/vmlinuz-5.14.0-611.9.1.el9_7.x86_64 root=/dev/mapper/vg--os-lv--root ro resume=/dev/mapper/vg--os-lv--swap rd.lvm.lv=vg-os/lv-root rd.lvm.lv=vg-os/lv-swap rhgb quiet crashkernel=1G-2G:192M,2G-64G:256M,64G-:512M
[    0.000000] x86/split lock detection: #AC: crashing the kernel on kernel split_locks and warning on user-space split_locks
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009efff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009f000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000039f98fff] usable
[    0.000000] BIOS-e820: [mem 0x0000000039f99000-0x000000003a898fff] reserved
[    0.000000] BIOS-e820: [mem 0x000000003a899000-0x00000000434aefff] usable
[    0.000000] BIOS-e820: [mem 0x00000000434af000-0x00000000452fefff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000452ff000-0x0000000045b2efff] ACPI NVS
[    0.000000] BIOS-e820: [mem 0x0000000045b2f000-0x0000000045bfefff] ACPI data
[    0.000000] BIOS-e820: [mem 0x0000000045bff000-0x0000000045bfffff] usable
[    0.000000] BIOS-e820: [mem 0x0000000045c00000-0x0000000049ffffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000004a200000-0x000000004a3fffff] reserved
[    0.000000] BIOS-e820: [mem 0x000000004b000000-0x00000000503fffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fe010000-0x00000000fe010fff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000fed20000-0x00000000fed7ffff] reserved
[    0.000000] BIOS-e820: [mem 0x00000000ff000000-0x00000000ffffffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000100000000-0x00000004afbfffff] usable

This way we have more control over reading logs, using the options of the less command.

Filters

Filters are a class of programs that are commonly used with output piped from another program. Many of them are also useful on their own, but they illustrate piping behavior especially well.

grep - returns the lines that match the pattern passed to grep.
head - displays the first few lines of one or more text files directly in the terminal.
tail - displays the last part of a file, showing recent content such as logs or updates.
sort - sorts a file, arranging the records in a particular order.
wc - counts characters, lines, and words.

grep

Grep, short for “global regular expression print”, is one of the most useful tools in Linux and Unix systems. It is used to search for specific words, phrases, or patterns inside text files, and shows the matching lines on your screen.

The grep command is useful when you need to quickly find certain keywords or phrases in logs or documents. Let’s consider an example:

Search for a word in a file

If you have a file called notes.txt and you want to find all lines containing the word python (grep is case-sensitive by default), you can use:

grep "python" notes.txt

Syntax :

The basic syntax of the `grep` command is as follows:

grep [options] pattern [files]

[options]: These are command-line flags that modify the behavior of grep.
pattern: This is the regular expression you want to search for.
[files]: These are the names of the file(s) you want to search within. You can specify multiple files for simultaneous searching.

Commonly Used `grep` Options

| Option | What It Does | Example Command |
| --- | --- | --- |
| -i | Case-insensitive search | grep -i "unix" myfile.txt |
| -c | Display the count of matching lines | grep -c "unix" myfile.txt |
| -l | Display only the names of the files that contain a match | grep -l "unix" * or grep -l "unix" f1.txt f2.txt f3.txt f4.txt |
| -w | Match whole words only. By default, grep matches the pattern even when it is a substring of a longer word; -w restricts matches to whole words. | grep -w "unix" myfile.txt |
| -o | Display only the matched text. By default, grep displays the entire line that contains the match; -o prints just the matching part. | grep -o "unix" myfile.txt |
| -n | Show line numbers | grep -n "unix" myfile.txt |
| -v | Invert the match: display the lines that do not match the pattern. | grep -v "unix" myfile.txt |
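The options are easier to compare when they all run against the same small file. The sketch below assumes a hypothetical four-line myfile.txt created in a throwaway directory. The contents are invented for this demonstration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical sample content.
printf 'unix is great os.\nUnix was developed in Bell labs.\nlearn operating system.\nuses unixlike tools.\n' > myfile.txt

count=$(grep -c "unix" myfile.txt)       # lines containing "unix" (case-sensitive)
count_i=$(grep -ic "unix" myfile.txt)    # same count, ignoring case
whole=$(grep -w "unix" myfile.txt)       # "unix" as a whole word only
numbered=$(grep -n "Bell" myfile.txt)    # matching line prefixed with its number
inverted=$(grep -vc "unix" myfile.txt)   # count of lines that do NOT match
only=$(grep -o "unix" myfile.txt)        # just the matched text, one per line

echo "count=$count count_i=$count_i inverted=$inverted"
echo "whole word match: $whole"
echo "numbered match:   $numbered"
```

Note that -w skips "unixlike" because the match is not a complete word, while plain grep and -o still find "unix" inside it.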

Regular Expressions

Regexp and regex are both short for regular expression. Regular expressions are special characters or sets of characters that help us search data and match complex patterns. They are most commonly used with Linux commands such as grep, sed, and vi.

The following are some basic regular expressions:

| Symbol | Description |
| --- | --- |
| `.` | The wildcard character: matches any one character other than a newline. |
| `^` | Matches the start of the line. |
| `$` | Matches the end of the line. |
| `*` | Matches zero or more occurrences of the preceding character. |
| `\` | Escapes the following character. |
| `()` | Groups a set of regular expressions (extended regex, see grep -E). |
| `?` | Matches zero or one occurrence of the preceding character (extended regex, see grep -E). |
| `[ ]` | Matches any one of a set of characters. |
| `[ - ]` | Matches any one of a range of characters. |

Globbing and Regex: So Similar, So Different

Beginners sometimes confuse wildcards (globbing) with regular expressions when using grep, but they are not the same. Wildcards are a feature provided by the shell to expand file names, whereas regular expressions are a text filtering mechanism intended for use with utilities like grep, sed and awk.
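The difference shows up clearly with the * character. In the sketch below (file names are hypothetical), the shell expands the glob *.txt before ls runs, while grep receives a quoted regular expression and applies it to lines of text.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch notes.txt todo.txt image.png

# Globbing: the shell replaces *.txt with the matching file names.
globbed=$(ls *.txt)

# Regex: ".*" means "any characters", "\." is a literal dot,
# and "$" anchors the match to the end of the line.
filtered=$(ls | grep '.*\.txt$')

# Using the glob as a regex does not work: at the start of a basic
# regular expression, "*" is taken as a literal asterisk.
wrong=$(ls | grep -c '*.txt' || true)

printf 'glob:\n%s\nregex:\n%s\nglob used as regex matched %s lines\n' \
  "$globbed" "$filtered" "$wrong"
```

Both correct forms list notes.txt and todo.txt, but they get there through different mechanisms.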

grep supports regex for advanced searching:

| Command | Description |
| --- | --- |
| grep "^unix" myfile.txt | Match lines starting with a string |
| grep "os$" myfile.txt | Match lines ending with a string |
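A short sketch that exercises the anchors together with ., [ ] and [ - ]. The sample lines are invented for illustration.

```shell
# Hypothetical sample lines.
sample=$(printf 'unix os\nlinux kernel\nmacos\nbat\nbet\nbit\n')

starts=$(printf '%s\n' "$sample" | grep "^unix")      # begins with "unix"
ends=$(printf '%s\n' "$sample" | grep "os$")          # ends with "os"
any=$(printf '%s\n' "$sample" | grep "^b.t$")         # b, any one character, t
chosen=$(printf '%s\n' "$sample" | grep "^b[ae]t$")   # b, then a or e, then t
ranged=$(printf '%s\n' "$sample" | grep -c "^[a-l]")  # lines starting with a..l

printf 'starts with unix: %s\n' "$starts"
printf 'ends with os:\n%s\n' "$ends"
printf 'b.t:\n%s\n' "$any"
printf 'b[ae]t:\n%s\n' "$chosen"
printf 'lines starting with a-l: %s\n' "$ranged"
```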

double quotes " " : We also need to put our regular expression between double quotes; otherwise it might be interpreted by the shell and give us different results.

To avoid mistakes when using extended regular expressions, use grep with the -E option, which treats the pattern as an extended regular expression (ERE).

Each pattern below is tested with echo "aa ab ba aaa bbb AB BA" | grep -E "PATTERN". grep prints the whole line as soon as any part of it matches, so all three commands print the same line:

| regex | match |
| --- | --- |
| `a*b` | aa ab ba aaa bbb AB BA |
| `a.b` | aa ab ba aaa bbb AB BA |
| `a?b` | aa ab ba aaa bbb AB BA |
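The three patterns above all print the same full line, which hides what each one actually matched. Adding the -o option from the options table makes the matched pieces visible. The second input string, "aaab", is an extra assumption added here to separate * from ?.

```shell
line="aa ab ba aaa bbb AB BA"

# -o prints each matched piece on its own line instead of the whole line.
star=$(echo "$line" | grep -oE "a*b")   # zero or more "a", then "b"
dot=$(echo "$line" | grep -oE "a.b")    # "a", any one character, "b"
opt=$(echo "$line" | grep -oE "a?b")    # an optional "a", then "b"

# On this input "*" and "?" match the same pieces; "aaab" shows the difference.
star_long=$(echo "aaab" | grep -oE "a*b")   # takes all three "a"s
opt_long=$(echo "aaab" | grep -oE "a?b")    # takes at most one "a"

printf 'a*b:\n%s\n' "$star"
printf 'a.b:\n%s\n' "$dot"
printf 'a?b:\n%s\n' "$opt"
printf 'aaab with a*b: %s, with a?b: %s\n' "$star_long" "$opt_long"
```

The only match for a.b is "a b", spanning the last a of "aaa", the space, and the first b of "bbb". The uppercase AB and BA never match because grep is case-sensitive unless -i is given.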

egrep

egrep is a pattern searching command which belongs to the family of grep functions. It works the same way as grep -E does. It treats the pattern as an extended regular expression and prints out the lines that match the pattern. If there are several files with the matching pattern, it also displays the file names for each line.

egrep [ options ] 'PATTERN' files

Options: Most of the options for this command are same as grep.

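So instead of using grep -E in the examples above, we can use egrep. Note that recent GNU grep releases print a warning that egrep is obsolescent and recommend grep -E instead.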

Head and Tail Commands

head

The head command in Linux is used to display the first few lines of one or more text files directly in the terminal.

The head command reads a file and prints the top portion (default is the first 10 lines) to standard output.
It’s commonly used when you want to quickly preview the beginning of a file without opening it in an editor.
It supports options to specify the number of lines or bytes to display.
You can use it with multiple files at once to view the first lines of each.

The basic head command to display the first 10 lines of the sample.txt file:

head sample.txt

example:

[payam@earth Working]$ dmesg | head
[    0.000000] Linux version 5.14.0-611.9.1.el9_7.x86_64 (mockbuild@iad1-prod-build001.bld.equ.rockylinux.org) (gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-11), GNU ld version 2.35.2-67.el9) #1 SMP PREEMPT_DYNAMIC Tue Nov 25 17:53:21 UTC 2025
[    0.000000] The list of certified hardware and cloud instances for Enterprise Linux 9 can be viewed at the Red Hat Ecosystem Catalog, https://catalog.redhat.com.
[    0.000000] Command line: BOOT_IMAGE=(hd0,gpt1)/vmlinuz-5.14.0-611.9.1.el9_7.x86_64 root=/dev/mapper/vg--os-lv--root ro resume=/dev/mapper/vg--os-lv--swap rd.lvm.lv=vg-os/lv-root rd.lvm.lv=vg-os/lv-swap rhgb quiet crashkernel=1G-2G:192M,2G-64G:256M,64G-:512M
[    0.000000] x86/split lock detection: #AC: crashing the kernel on kernel split_locks and warning on user-space split_locks
[    0.000000] BIOS-provided physical RAM map:
[    0.000000] BIOS-e820: [mem 0x0000000000000000-0x000000000009efff] usable
[    0.000000] BIOS-e820: [mem 0x000000000009f000-0x00000000000fffff] reserved
[    0.000000] BIOS-e820: [mem 0x0000000000100000-0x0000000039f98fff] usable
[    0.000000] BIOS-e820: [mem 0x0000000039f99000-0x000000003a898fff] reserved
[    0.000000] BIOS-e820: [mem 0x000000003a899000-0x00000000434aefff] usable

Syntax:

head [options] [file(s)]

If no file name is specified, head reads from standard input (stdin).

Head command common options:

| Option | Long-Form | Description |
| --- | --- | --- |
| -n | --lines | Show the specified number of lines |
| -c | --bytes | Show the specified number of bytes |
| -v | --verbose | Always show the file name tag |
| -q | --quiet | Don't separate the content of multiple files with a file name tag |

example:

[payam@earth Working]$ dmesg | head -n 5
[    0.000000] Linux version 5.14.0-611.9.1.el9_7.x86_64 (mockbuild@iad1-prod-build001.bld.equ.rockylinux.org) (gcc (GCC) 11.5.0 20240719 (Red Hat 11.5.0-11), GNU ld version 2.35.2-67.el9) #1 SMP PREEMPT_DYNAMIC Tue Nov 25 17:53:21 UTC 2025
[    0.000000] The list of certified hardware and cloud instances for Enterprise Linux 9 can be viewed at the Red Hat Ecosystem Catalog, https://catalog.redhat.com.
[    0.000000] Command line: BOOT_IMAGE=(hd0,gpt1)/vmlinuz-5.14.0-611.9.1.el9_7.x86_64 root=/dev/mapper/vg--os-lv--root ro resume=/dev/mapper/vg--os-lv--swap rd.lvm.lv=vg-os/lv-root rd.lvm.lv=vg-os/lv-swap rhgb quiet crashkernel=1G-2G:192M,2G-64G:256M,64G-:512M
[    0.000000] x86/split lock detection: #AC: crashing the kernel on kernel split_locks and warning on user-space split_locks
[    0.000000] BIOS-provided physical RAM map:
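The remaining options can be tried on small generated files. The sketch below assumes the hypothetical files numbers.txt and words.txt, and it uses the short options as listed in the table above (the -q option may be missing in non-GNU versions of head).

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Two hypothetical sample files.
seq 1 20 > numbers.txt
printf 'alpha\nbeta\ngamma\n' > words.txt

first_three=$(head -n 3 numbers.txt)   # first 3 lines
first_bytes=$(head -c 5 words.txt)     # first 5 bytes

# With several files, head prints a "==> name <==" tag before each one.
with_tags=$(head -n 1 numbers.txt words.txt)
# -q suppresses those tags.
quiet=$(head -q -n 1 numbers.txt words.txt)

printf 'first three lines:\n%s\n' "$first_three"
printf 'first five bytes: %s\n' "$first_bytes"
printf 'with tags:\n%s\n' "$with_tags"
printf 'quiet:\n%s\n' "$quiet"
```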

tail

The tail command in Linux is used to display the last part of a file, showing recent content such as logs or updates.

By default, it shows the last 10 lines of a file.
Commonly used for monitoring log files and debugging.
You can customize the number of lines displayed using the -n option.
Useful for viewing the most recent entries without opening the entire file.

Without any option it displays only the last 10 lines of the specified file:

tail myfile.txt

another example:

[payam@earth Working]$ dmesg | tail 
[  106.485984] Bluetooth: RFCOMM socket layer initialized
[  106.485996] Bluetooth: RFCOMM ver 1.11
[  122.145191] rfkill: input handler enabled
[  129.412381] rfkill: input handler disabled
[  130.176701] exFAT-fs (sdb): Volume was not properly unmounted. Some data may be corrupt. Please run fsck.
[ 1370.338863] input: INK'D+ WIRELESS (AVRCP) as /devices/virtual/input/input20
[ 4323.759520] input: INK'D+ WIRELESS (AVRCP) as /devices/virtual/input/input21
[ 6955.285011] input: INK'D+ WIRELESS (AVRCP) as /devices/virtual/input/input22
[ 7067.130053] input: INK'D+ WIRELESS (AVRCP) as /devices/virtual/input/input23
[13692.287103] input: INK'D+ WIRELESS (AVRCP) as /devices/virtual/input/input24

Syntax:

tail [OPTION]... [FILE]...

tail command common options:

| Short Form | Long Form | Description |
| --- | --- | --- |
| -c | --bytes=[+]NUM | Shows the last NUM bytes of a file. Using +NUM starts the output at byte NUM of each file. |
| -f | --follow, --follow=name, --follow=descriptor | Monitors the file for changes and outputs new data as the file grows. When no value is specified after --follow, descriptor is used as the default value. This means that the update mode continues to run even when the file is renamed or moved. Specify the --max-unchanged-stats=N argument to reopen a file that has not changed size after N (default 5) iterations to check if it has been unlinked or renamed. Specify the --pid=PID argument to exit tail after the process with the PID process ID terminates. |
| -F | --follow=name --retry | Instructs tail to keep updating the output even if the original file is removed during log rotation and replaced by a new one with the same name. |
| -n | --lines=[+]NUM | Shows the last NUM lines instead of the default 10. Using -n +NUM causes the output to start with line NUM. |
| -q | --quiet, --silent | Omits the file names from the output, displaying only the contents. |
| -s | --sleep-interval=N | Used in combination with -f. Instructs tail to wait for N seconds (default 1) between iterations. |
| -v | --verbose | Makes tail always print the file name before displaying the contents. |
| -z | --zero-terminated | Uses NUL as the line delimiter instead of the newline character. |
| | --help | Displays the help file. |
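A minimal sketch of the -n and -c forms, using a generated 20-line file (numbers.txt is a hypothetical name). The -f and -F options are left out because they keep running until interrupted.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
seq 1 20 > numbers.txt

last_two=$(tail -n 2 numbers.txt)      # last 2 lines
from_18=$(tail -n +18 numbers.txt)     # everything from line 18 onward
last_bytes=$(tail -c 3 numbers.txt)    # last 3 bytes: "20" plus the newline

# head and tail combined extract a range: lines 10 to 12.
middle=$(head -n 12 numbers.txt | tail -n 3)

printf 'last two:\n%s\n' "$last_two"
printf 'from line 18:\n%s\n' "$from_18"
printf 'lines 10-12:\n%s\n' "$middle"
```

The head | tail combination is a common way to cut a block out of the middle of a file without opening an editor.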

sort

The ‘sort’ command prints the lines of its input files, concatenated, in sorted order. By default it treats blank space as the field separator and uses the entire line as the sort key. Note that sort does not modify the input files: it only prints the sorted result to standard output, so redirect the output (or use the -o option) if you want to keep it.

Syntax

sort [OPTION]... [FILE]...

example:

[payam@earth Working]$ cat 1.txt 
D 1
d 1
c 2
C 2
A 3
B 4
f 14

[payam@earth Working]$ sort 1.txt 
A 3
B 4
c 2
C 2
d 1
D 1
f 14

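If a file has words/lines beginning with both upper case and lower case characters, the order in which sort displays them depends on the locale. In the output above, each lowercase letter comes just before its uppercase form, whereas in the C locale all uppercase letters sort before the lowercase ones. The -f command line option folds lowercase letters to uppercase before comparing: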

[payam@earth Working]$ sort -f 1.txt 
A 3
B 4
C 2
c 2
D 1
d 1
f 14

The -n option sorts the contents numerically. We can also sort a file based on its nth column with the -kn option:

[payam@earth Working]$ sort -n -k2 1.txt 
d 1
D 1
c 2
C 2
A 3
B 4
f 14

Use -r to reverse the result of comparisons.

sort command common options:

| Short option form | Long option form | Description |
| --- | --- | --- |
| -b | --ignore-leading-blanks | Causes sort to ignore leading blanks. |
| -d | --dictionary-order | Causes sort to consider only blanks and alphanumeric characters. |
| -f | --ignore-case | Ignores the default case sorting rule and changes all lowercase letters to uppercase before comparison. |
| -M | --month-sort | Sorts lines according to months (Jan-Dec). |
| -h | --human-numeric-sort | Compares human-readable numbers (e.g., 2K 1G). |
| -n | --numeric-sort | Compares data according to string numerical values. |
| -R | --random-sort | Sorts data by a random hash of keys but groups identical keys together. |
| -r | --reverse | Reverses the comparison results. |
| | --sort=WORD | Sorts data according to the specified WORD: general-numeric -g, human-numeric -h, month -M, numeric -n, random -R, version -V. |
| -c | --check, --check=diagnose-first | Checks if the input is already sorted but doesn't sort it. |
| | --debug | Annotates the part of the line used for sorting. |
| -k | --key=KEYDEF | Sorts data using the specified KEYDEF, which gives the key location and type. |
| -m | --merge | Causes sort to merge already sorted files. |
| -o | --output=FILE | Writes the output to FILE instead of printing it to standard output. |
| -t | --field-separator=SEP | Uses SEP as the field separator instead of the non-blank to blank transition. |
| -z | --zero-terminated | Causes sort to use NUL as the line delimiter instead of the newline character. |
| | --help | Displays the help file with the full options list and exits. |
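Several of these options are usually combined. The sketch below assumes a hypothetical colon-separated file scores.txt with a name and a score per line, and shows -t, -k, -n, -r, -o and -c working together.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Hypothetical data in the form name:score.
printf 'carol:7\nalice:12\nbob:3\n' > scores.txt

by_name=$(sort scores.txt)                       # whole line is the key
by_score=$(sort -t: -k2 -n scores.txt)           # numeric sort on field 2
by_score_desc=$(sort -t: -k2 -n -r scores.txt)   # same, highest first

# -o writes the result to a file instead of standard output.
sort -t: -k2 -n -o sorted.txt scores.txt

# -c only checks: exit status 0 means the input is already sorted.
if sort -c -t: -k2 -n sorted.txt; then sorted_ok=yes; else sorted_ok=no; fi

printf 'by name:\n%s\n' "$by_name"
printf 'by score:\n%s\n' "$by_score"
printf 'by score, descending:\n%s\n' "$by_score_desc"
echo "sorted.txt already sorted: $sorted_ok"
```

Without -n, the scores would be compared as text and 12 would sort before 3.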

cut

The cut command in UNIX is a command line utility for cutting sections from each line of files and writing the result to standard output. It can be used to cut parts of a line by byte position, character and delimiter.

syntax:

cut OPTION... [FILE]...

Note: If FILE is not specified, `cut` reads from standard input (stdin).

cut by byte position:

[payam@earth Working]$ echo "linux" | cut -b 1
l
[payam@earth Working]$ echo "linux" | cut -b 1,5
lx
[payam@earth Working]$ echo "linux" | cut -b 1-4
linu

cut by character:

[payam@earth Working]$  echo '♣foobar' | cut -c 1,7
♣r
[payam@earth Working]$  echo '♣foobar' | cut -c 5-7
bar

cut based on a delimiter:

To cut using a delimiter use the -d option. This is normally used in conjunction with the -f option to specify the field that should be cut. The examples below assume that 1.txt contains the four lines 1:a,w, 2:b,x, 3:c,y and 4:d,z:

[payam@earth Working]$ cut 1.txt -d: -f1
1
2
3
4
[payam@earth Working]$ cut 1.txt -d: -f2
a,w
b,x
c,y
d,z
[payam@earth Working]$ cut 1.txt -d, -f1
1:a
2:b
3:c
4:d
[payam@earth Working]$ cut 1.txt -d, -f2
w
x
y
z

Options Available in cut Command

Here is a list of the most commonly used options with the Linux cut command:

| Option | Description |
| --- | --- |
| -b, --bytes=LIST | Selects only the bytes specified in LIST (e.g., -b 1-3,7). |
| -c, --characters=LIST | Selects only the characters specified in LIST (e.g., -c 1-3,7). |
| -d, --delimiter=DELIM | Uses DELIM as the field delimiter character instead of the tab character. |
| -f, --fields=LIST | Selects only the fields specified in LIST, separated by the delimiter character (default is tab). |
| -n | Do not split multi-byte characters (no effect unless -b or -c is specified). |
| --complement | Inverts the selection: prints the bytes, characters or fields that were not selected. |
| --output-delimiter=STRING | Uses STRING as the delimiter between output fields instead of the input delimiter. |
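A sketch of field lists and the last two options, applied to one invented colon-separated record. The --complement and --output-delimiter options are GNU extensions, so they may be missing in other implementations of cut.

```shell
# Hypothetical record with six colon-separated fields.
record="alice:x:1000:1000:/home/alice:/bin/bash"

name=$(echo "$record" | cut -d: -f1)          # field 1
ids=$(echo "$record" | cut -d: -f3,4)         # fields 3 and 4
trailing=$(echo "$record" | cut -d: -f5-)     # field 5 to the end

# GNU extensions: everything except field 2, and a custom output delimiter.
rest=$(echo "$record" | cut -d: -f2 --complement)
spaced=$(echo "$record" | cut -d: -f1,6 --output-delimiter=' ')

printf '%s\n' "$name" "$ids" "$trailing" "$rest" "$spaced"
```

By default the selected fields are joined with the input delimiter, which is why ids keeps its colon while spaced uses a space.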

wc

wc (short for word count) is a command line tool in Unix/Linux operating systems that prints the number of newlines, words, bytes, and characters in the files given as arguments, plus a total line when more than one file is named.

When you name a file, wc prints the file name after the requested counts. If you do not name a file, it reads standard input and prints only the counts. Example:

[payam@earth Working]$ wc /etc/inittab 
 16  76 490 /etc/inittab

The three numbers shown above are 16 (number of lines), 76 (number of words, delimited by whitespace by default) and 490 (number of bytes) of the file.

Syntax :

wc [OPTION]... [FILE]...

If no file is specified, it will read from standard input, meaning you can type text manually or pipe it from another command.

The following options are provided by the wc command.

wc -l – Prints the number of lines in a file.
wc -w – Prints the number of words in a file.
wc -c – Prints the number of bytes in a file.
wc -m – Prints the number of characters in a file.
wc -L – Prints only the length of the longest line in a file.
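The counts are easy to verify on a two-line input fed through a pipe. The text is invented for this sketch, and tr -d ' ' is only there because some wc implementations pad the number with spaces.

```shell
# Two lines, five words, 24 bytes including both newlines.
lines=$(printf 'one two\nthree four five\n' | wc -l | tr -d ' ')
words=$(printf 'one two\nthree four five\n' | wc -w | tr -d ' ')
bytes=$(printf 'one two\nthree four five\n' | wc -c | tr -d ' ')
longest=$(printf 'one two\nthree four five\n' | wc -L | tr -d ' ')

echo "lines=$lines words=$words bytes=$bytes longest=$longest"
```

For plain ASCII text, -c and -m give the same number; they differ only when the input contains multi-byte characters.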

That's all.

Example fruit file to practice with:

NAME,COLOR,SIZE
orange,orange,medium
grape,green,small
grape,red,small
apple,red,medium
banana,yellow,medium
watermelon,green,large
avocado,green,medium
lemon,yellow,medium
honeydew,green,large
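A few practice pipelines over this data, combining the commands from this section. The file name fruits.csv is an assumption, and the sketch recreates the file in a throwaway directory so that it runs on its own.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Recreate the fruit data shown above (fruits.csv is a hypothetical name).
printf '%s\n' \
  'NAME,COLOR,SIZE' \
  'orange,orange,medium' \
  'grape,green,small' \
  'grape,red,small' \
  'apple,red,medium' \
  'banana,yellow,medium' \
  'watermelon,green,large' \
  'avocado,green,medium' \
  'lemon,yellow,medium' \
  'honeydew,green,large' > fruits.csv

# How many green fruits? Take the COLOR column and count whole-word matches.
green_count=$(cut -d, -f2 fruits.csv | grep -cw "green")

# Names of the medium-sized fruits in alphabetical order.
medium=$(grep ",medium$" fruits.csv | cut -d, -f1 | sort)

# Number of data rows: tail -n +2 skips the header line.
data_rows=$(tail -n +2 fruits.csv | wc -l | tr -d ' ')

echo "green fruits: $green_count"
printf 'medium fruits:\n%s\n' "$medium"
echo "data rows: $data_rows"
```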
