103.2. Process text streams using filters

Weight: 3
Description: Candidates should be able to apply filters to text streams.
Key Knowledge Areas:
Send text files and output streams through text utility filters to modify the output using standard UNIX commands found in the GNU textutils package
Terms and Utilities:
    cat
    cut
    expand
    fmt
    head
    join
    less
    nl
    od
    paste
    pr
    sed
    sort
    split
    tail
    tr
    unexpand
    uniq
    wc
Everything in Linux revolves around streams of data—particularly text streams.

streams

A stream is nothing more than a sequence of bytes that is passed from one file, device, or program to another.
Input and output in the Linux environment is distributed across three streams (which are in fact special files).
These streams are:
    standard input stream (stdin), which provides input to commands.
    standard output stream (stdout), which displays output from commands.
    standard error stream (stderr), which displays error output from commands.
The streams are also numbered: stdin (0), stdout (1), stderr (2).

piping with |

Piping is a mechanism for sending data from one program to another. The operator we use is ( | ) (found above the backslash \ key on most keyboards). What this operator does is feed the output from the program on the left as input to the program on the right.
command1 | command2
Either command can have options or arguments. We can also use | to redirect the output of the second command in the pipeline to a third command, and so on.
command1 | command2 | command3 | command4 | ...
Constructing long pipelines of commands that each have limited capability is a common Linux and UNIX way of accomplishing tasks.
[[email protected] ~]# dmesg | less
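As a small illustration of chaining several limited tools, the sketch below pushes three made-up lines through a three-stage pipeline. The sample letters are hypothetical; sort, head and tr are described later in this section.

```shell
# Hypothetical sample input: three unsorted lines.
# Stage 1: sort orders the lines alphabetically.
# Stage 2: head keeps only the first two of them.
# Stage 3: tr converts what is left to upper case.
result=$(printf '%s\n' b a c | sort | head -n 2 | tr '[:lower:]' '[:upper:]')
printf '%s\n' "$result"
```

Each stage only sees the output of the stage before it, which is why the order of the commands matters.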

Redirection

Linux provides redirection operators for each stream. We can use > to redirect the standard output stream, most often to a file.
[[email protected] temp]# ls -1
test.txt
unexpanded.txt
zip-3.0-11.el7.x86_64.rpm
zip.cpio

[[email protected] temp]# ls -1 > list.txt

[[email protected] temp]# cat list.txt
list.txt
test.txt
unexpanded.txt
zip-3.0-11.el7.x86_64.rpm
zip.cpio
"|" vs ">"
The difference between > (redirection operator) and | (pipeline operator) is that while the > connects a command with a file, the | connects the output of a command with another command.
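Because the streams are numbered, each one can be redirected separately. The following is a minimal sketch, assuming a scratch directory created with mktemp; the file names out.txt and err.txt are made up for the illustration.

```shell
# Scratch directory so the example does not touch real files.
workdir=$(mktemp -d)

# ls writes the listing to stdout (1) and the complaint about the
# missing file to stderr (2). > captures stream 1, 2> captures stream 2.
# "|| true" only keeps the non-zero exit status of ls from stopping a script.
ls "$workdir" "$workdir/missing" > "$workdir/out.txt" 2> "$workdir/err.txt" || true

# The error message ended up in err.txt, not in out.txt.
err_lines=$(wc -l < "$workdir/err.txt" | tr -d ' ')
printf 'lines written to stderr: %s\n' "$err_lines"
```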

Text filtering

Text filtering is the process of taking an input stream of text and performing some conversion on the text before sending it to an output stream.

cat

The cat (short for "concatenate") command is one of the most frequently used commands in Linux/Unix-like operating systems. cat lets us create one or more files, view the contents of a file, concatenate files, and redirect output to the terminal or to files.
cat [OPTION] [FILE]...
Simplest usage of cat is displaying the content of a file:
[[email protected] ~]# cat file1
This is 1st line of file1.

This is 3rd line of file1.
It can show the contents of multiple files:
[[email protected] ~]# cat file1 file2
This is 1st line of file1.

This is 3rd line of file1.
This is 1st line of file2.
This is 2nd line of file2.

This is 4th line of file2.
The cat command is also used to concatenate a number of files together:
[[email protected] ~]# cat file1 file2 > newfile
[[email protected] ~]# cat newfile
This is 1st line of file1.

This is 3rd line of file1.
This is 1st line of file2.
This is 2nd line of file2.

This is 4th line of file2.
Create a new file with cat (type the text, then press Ctrl+d to finish):
[[email protected] ~]# cat > newfile2
This is my second new file with input redirection
Ctrl+d
[[email protected] ~]# cat newfile2
This is my second new file with input redirection
A hyphen (-) used alone in place of a file name generally means that input is taken from stdin at that position instead of from a named file:
[[email protected] ~]# cat file1 - file2
This is 1st line of file1.

This is 3rd line of file1.
THIS IS MY INPUT
Ctrl+d
This is 1st line of file2.
This is 2nd line of file2.

This is 4th line of file2.
List of cat command options:
-A, --show-all           equivalent to -vET
-b, --number-nonblank    number nonempty output lines, overrides -n
-e                       equivalent to -vE
-E, --show-ends          display $ at end of each line
-n, --number             number all output lines
-s, --squeeze-blank      suppress repeated empty output lines
-t                       equivalent to -vT
-T, --show-tabs          display TAB characters as ^I
-u                       (ignored)
-v, --show-nonprinting   use ^ and M- notation, except for LFD and TAB
The opposite of cat is tac: it prints the lines of its input in reverse order, last line first. Try it for yourself.
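A few of these options can be tried on inline text. The sketch below is only an illustration with made-up sample lines; it assumes GNU cat and tac are available.

```shell
# Hypothetical sample text: two words separated by a run of three blank lines.
sample=$(printf 'first\n\n\n\nlast\n')

# -s squeezes the run of blank lines down to a single blank line.
squeezed=$(printf '%s\n' "$sample" | cat -s)

# -n additionally numbers every output line.
numbered=$(printf '%s\n' "$sample" | cat -s -n)

# tac prints the same lines in reverse order.
reversed=$(printf '%s\n' "$sample" | cat -s | tac)

printf '%s\n' "$numbered"
printf '%s\n' "$reversed"
```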

od

od (Octal dump) command in Linux is used to output the contents of a file in different formats with the octal format being the default. This command is especially useful when debugging Linux scripts for unwanted changes or characters.
od [OPTION]... [FILE]...
As an example:
[[email protected] ~]# cat testod.txt
1
2
3
4
5
[[email protected] ~]# od testod.txt
0000000 005061 005062 005063 005064 005065
0000012
With -t option we can select output format and display it. (Traditional format specifications may be intermixed):
-a   same as -t a, select named characters, ignoring high-order bit
-b   same as -t o1, select octal bytes
-c   same as -t c, select printable characters or backslash escapes
-d   same as -t u2, select unsigned decimal 2-byte units
-f   same as -t fF, select floats
-i   same as -t dI, select decimal ints
-l   same as -t dL, select decimal longs
-o   same as -t o2, select octal 2-byte units
-s   same as -t d2, select decimal 2-byte units
-x   same as -t x2, select hexadecimal 2-byte units
Example:
[[email protected] ~]# od -ta testod.txt
0000000 1 nl 2 nl 3 nl 4 nl 5 nl
0000012
[[email protected] ~]# od -tc testod.txt
0000000 1 \n 2 \n 3 \n 4 \n 5 \n
0000012
The -A option selects the radix (number base) used for the file offsets printed in the left-hand column:
    Hexadecimal (-Ax)
    Octal (-Ao, the default)
    Decimal (-Ad)
[[email protected] ~]# od -Ax -c testod.txt
000000 1 \n 2 \n 3 \n 4 \n 5 \n
00000a
[[email protected] ~]# od -Ao -c testod.txt
0000000 1 \n 2 \n 3 \n 4 \n 5 \n
0000012
[[email protected] ~]# od -Ad -c testod.txt
0000000 1 \n 2 \n 3 \n 4 \n 5 \n
0000010
The -An option suppresses the offset column entirely. Combined with -c, only the characters are shown:
[[email protected] ~]# od -An -c testod.txt
1 \n 2 \n 3 \n 4 \n 5 \n
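One way od helps when hunting for unwanted characters is by making invisible ones visible. The sketch below is an illustration on a made-up line ending in a carriage return plus newline, the kind of ending some Windows editors leave behind.

```shell
# Hypothetical input: the text "ok" followed by CR and LF.
# -An drops the offsets, -c prints each byte as a character or escape.
dump=$(printf 'ok\r\n' | od -An -c)

# The carriage return, invisible in a normal listing, shows up as \r.
printf '%s\n' "$dump"
```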

expand and unexpand

The expand command is used to convert tabs in files to spaces.
expand [OPTION]... [FILE]...
Let's try it:
[[email protected] ~]# cat test.txt
this is my test file.
[[email protected] ~]# od -tc -An test.txt
t h i s \t i s \t m y \t t e s t \t
f i l e . \n
[[email protected] ~]# expand test.txt > expanded.txt
[[email protected] ~]# od -tc -An expanded.txt
t h i s i s
m y t e s t
f i l e . \n
By default, expand replaces each tab with enough spaces to reach the next tab stop, and tab stops are set every 8 columns. The -t (--tabs=N) option changes this so that tab stops are N columns apart.
[[email protected] ~]# expand -t1 test.txt > expanded2.txt
[[email protected] ~]# od -tc -An expanded2.txt
t h i s i s m y t e s t
f i l e . \n
expand command options:
-i, --initial       do not convert tabs after non blanks
-t, --tabs=NUMBER   have tabs NUMBER characters apart, not 8
-t, --tabs=LIST     use comma separated list of explicit tab positions
    --help          display this help and exit
    --version       output version information and exit
The unexpand command converts space characters (blanks) into tabs in each file. It needs a run of at least two spaces to produce a tab.
unexpand [OPTION]... [FILE]...
Let's do the reverse:
[[email protected] ~]# unexpand expanded.txt > unexpanded.txt
[[email protected] ~]# od -tc -An unexpanded.txt
t h i s i s
m y t e s t
f i l e . \n
With no options, unexpand converts only the initial (leading) blanks on each line. The -a option converts all blanks instead of just the initial ones.
In both cases unexpand only converts runs of two or more spaces to a tab; it does not convert single spaces.
[[email protected] ~]# unexpand -a expanded.txt > unexpanded2.txt
[[email protected] ~]# od -tc -An unexpanded2.txt
t h i s \t i s \t m y \t t e s t \t
f i l e . \n
The unexpand command options:
-a, --all          convert all blanks, instead of just initial blanks
    --first-only   convert only leading sequences of blanks (overrides -a)
-t, --tabs=N       have tabs N characters apart instead of 8 (enables -a)
-t, --tabs=LIST    use comma separated LIST of tab positions (enables -a)
    --help         display this help and exit
    --version      output version information and exit
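The round trip between tabs and spaces can be checked on a single made-up line. This is a minimal sketch assuming GNU expand and unexpand, with tab stops set 4 columns apart for the illustration.

```shell
# Hypothetical input: "a", one tab, "b".
tabbed=$(printf 'a\tb\n')

# expand: the tab becomes the three spaces needed to reach column 4.
spaced=$(printf '%s\n' "$tabbed" | expand -t 4)

# unexpand -a: the run of three spaces ending at a tab stop becomes a tab again.
restored=$(printf '%s\n' "$spaced" | unexpand -a -t 4)

printf '%s\n' "$spaced" | od -An -c
printf '%s\n' "$restored" | od -An -c
```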

tr command

tr stands for translate. The tr utility copies the standard input to the standard output with substitution or deletion of selected characters. The syntax of tr command is:
tr [option] set1 [set2]
Let's convert lower case to upper case:
[[email protected] ~]# echo "this is for test 123" | tr abcdefghijklmnopqrstuvwxyz ABCDEFGHIJKLMNOPQRSTUVWXYZ
THIS IS FOR TEST 123
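The following command also converts lower case to upper case, this time using character classes. Quote the classes so that the shell does not try to expand the square brackets as a file name pattern:
[[email protected] ~]# echo "this is for test 123" | tr '[:lower:]' '[:upper:]'
THIS IS FOR TEST 123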
Translate white space to tabs:
[[email protected] ~]# echo "this is for test 123" | tr '[:space:]' '\t'
this is for test 123
If two or more spaces occur in a row, the previous command translates each of them to a tab. We can use the -s option to squeeze runs of a repeated character into one:
[[email protected] ~]# echo "this is for test 123" | tr -s '[:space:]' '\t'
this is for test 123
The -d option can be used to delete the specified characters:
[[email protected] ~]# echo "this is for test 123" | tr -d 't'
his is for es 123
[[email protected] ~]# echo "this is for test 123" | tr -d '[:digit:]'
this is for test
We can complement a set using the -c option. For example, to remove all characters except digits, you can use the following:
[[email protected] ~]# echo "this is for test 123" | tr -dc '[:digit:]'
123
tr has many more options and sets; try tr --help for more information.
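The tr operations above can be collected into one short script. The input strings are made up for the illustration, and the sets are quoted so the shell passes them to tr unchanged.

```shell
# Translate: lower case to upper case using character classes.
upper=$(printf 'this is for test 123\n' | tr '[:lower:]' '[:upper:]')

# Squeeze: collapse each run of spaces into a single space.
squeezed=$(printf 'too    many   spaces\n' | tr -s ' ')

# Delete: strip carriage returns from text with CR+LF line endings.
unix_text=$(printf 'line one\r\nline two\r\n' | tr -d '\r')

printf '%s\n' "$upper" "$squeezed" "$unix_text"
```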

pr

The pr command is used to format files for printing. By default each page starts with a five-line header that includes the file's last modification date and time, the file name and a page number, and ends with a five-line trailer of blank lines.
[[email protected] ~]# cat note.txt
hi
this is my note file.
linux is an operating system.
learn linux.

[[email protected] ~]# pr note.txt


2018-12-19 11:25 note.txt Page 1


hi
this is my note file.
linux is an operating system.
learn linux.
Note: When output is created from multiple files or the standard input stream, the current date and time are used instead of the file name and modification date.
We can print files side-by-side in columns and control many aspects of formatting through options.
[[email protected] ~]# rpm -qa | pr --columns=2 -l 15
2018-12-19 11:35 Page 1


NetworkManager-team-1.8.0-9.el7.x86 setserial-2.17-33.el7.x86_64
cdparanoia-libs-10.2-17.el7.x86_64 mozilla-filesystem-1.9-11.el7.x86_6
gtkmm30-3.22.0-1.el7.x86_64 hypervkvpd-0-0.30.20161211git.el7.x
python-configshell-1.1.fb23-3.el7.n less-458-9.el7.x86_64
khmeros-base-fonts-5.0-17.el7.noarc libgsf-1.14.26-7.el7.x86_64
liberation-fonts-common-1.07.2-15.e mariadb-libs-5.5.56-2.el7.x86_64
kexec-tools-2.0.14-17.el7.x86_64 rootfiles-8.1-11.el7.noarch
iptables-1.4.21-18.0.1.el7.centos.x langtable-data-0.0.31-3.el7.noarch
gnome-dictionary-libs-3.20.0-1.el7. libcgroup-tools-0.41-13.el7.x86_64
gd-2.0.35-26.el7.x86_64 perl-Pod-Usage-1.63-3.el7.noarch
--columns defines the number of columns created in the output. -l specifies the page length (the default is 66 lines). As usual, refer to the man page for details.
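The page header can also be customised. The sketch below is an illustration on made-up input; it uses the -h option, which replaces the file name in the header with a string of our choice ('My notes' is hypothetical).

```shell
# Format two sample lines read from stdin, with a custom header text.
page=$(printf 'hi\nlearn linux.\n' | pr -h 'My notes')

# The header line carries the date, our text and the page number.
header=$(printf '%s\n' "$page" | grep 'Page 1')
printf '%s\n' "$header"
```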

nl

nl is a Linux command that numbers the lines of files: it copies its input files to standard output, prepending a line number to each line.
nl [OPTION]... [FILE]...
[[email protected] ~]# nl note.txt
     1  hi
     2  this is my note file.
     3  linux is an operating system.
The -n FORMAT option sets the line numbering format. Recognized formats are:
    ln: Left-justified, leading zeros suppressed
    rn: Right-justified, leading zeros suppressed (default)
    rz: Right-justified, leading zeros kept
[[email protected] ~]# nl -nln note.txt
1       hi
2       this is my note file.
3       linux is an operating system.

[[email protected] ~]# nl -nrn note.txt #default
     1  hi
     2  this is my note file.
     3  linux is an operating system.

[[email protected] ~]# nl -nrz note.txt
000001  hi
000002  this is my note file.
000003  linux is an operating system.
By default nl skips blank lines and does not number them; use the -ba switch to number them as well.
Other nl options:
-b, --body-numbering=STYLE         use STYLE for numbering body lines
-d, --section-delimiter=CC         use CC for separating logical pages
-f, --footer-numbering=STYLE       use STYLE for numbering footer lines
-h, --header-numbering=STYLE       use STYLE for numbering header lines
-i, --line-increment=NUMBER        line number increment at each line
-l, --join-blank-lines=NUMBER      group of NUMBER empty lines counted as one
-n, --number-format=FORMAT         insert line numbers according to FORMAT
-p, --no-renumber                  do not reset line numbers at logical pages
-s, --number-separator=STRING      add STRING after (possible) line number
-v, --starting-line-number=NUMBER  first line number on each logical page
-w, --number-width=NUMBER          use NUMBER columns for line numbers
    --help                         display this help and exit
    --version                      output version information and exit
cat -n filename produces similar output to nl, except that cat -n also numbers blank lines (like nl -ba).
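The treatment of blank lines is easy to see on a three-line sample with an empty line in the middle. The sketch below is an illustration with made-up text; it compares the number given to the last line by nl, nl -b a and cat -n.

```shell
# Hypothetical input: two words separated by one blank line.
text=$(printf 'alpha\n\nbeta\n')

# Default nl does not count the blank line, so "beta" is line 2.
default_last=$(printf '%s\n' "$text" | nl | tail -n 1 | awk '{print $1}')

# With -b a every line is numbered, so "beta" is line 3.
all_last=$(printf '%s\n' "$text" | nl -b a | tail -n 1 | awk '{print $1}')

# cat -n also numbers the blank line.
cat_last=$(printf '%s\n' "$text" | cat -n | tail -n 1 | awk '{print $1}')

printf 'nl: %s  nl -b a: %s  cat -n: %s\n' "$default_last" "$all_last" "$cat_last"
```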

fmt

fmt is a simple text formatter: it reformats the paragraphs in the specified file and prints the result to standard output.
fmt [-WIDTH] [OPTION]... [FILE]...
[[email protected] ~]# cat note.txt
hi
this is my note file.
linux is an operating system.
learn linux.

[[email protected] ~]# fmt note.txt
hi this is my note file. linux is an operating system. learn linux.
By default fmt sets the column width to 75. This can be changed with the -w, --width=WIDTH option.
[[email protected] ~]# fmt -w 12 note.txt
hi this
is my
note file.
linux is an
operating
system.
learn
linux.
fmt command options:
-c, --crown-margin       preserve indentation of first two lines
-p, --prefix=STRING      reformat only lines beginning with STRING,
                         reattaching the prefix to reformatted lines
-s, --split-only         split long lines, but do not refill
-t, --tagged-paragraph   indentation of first line different from second
-u, --uniform-spacing    one space between words, two after sentences
-w, --width=WIDTH        maximum line width (default of 75 columns)
-g, --goal=WIDTH         goal width (default of 93% of width)
    --help               display this help and exit
    --version            output version information and exit
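The effect of -w can be verified without reading the output by eye. The following sketch wraps a made-up sentence to 20 columns and then measures the longest resulting line; the sentence and the width are arbitrary choices for the illustration.

```shell
# Hypothetical 12-word sentence on a single line.
long_line='the quick brown fox jumps over the lazy dog and keeps running'

# Re-wrap it so that no line is wider than 20 columns.
wrapped=$(printf '%s\n' "$long_line" | fmt -w 20)

# Measure the longest line that fmt produced.
longest=$(printf '%s\n' "$wrapped" | awk '{ if (length($0) > max) max = length($0) } END { print max }')

printf '%s\n' "$wrapped"
printf 'longest line: %s columns\n' "$longest"
```

fmt only moves the line breaks: the words themselves and their order are unchanged.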

sort and uniq

sort is a Linux program that prints the lines of its input files, concatenated together, in sorted order. By default it treats blank space as the field separator and uses the entire line as the sort key. It is important to notice that sort does not change the files themselves; it only prints the sorted output, unless you redirect that output to a file.
sort [OPTION]... [FILE]...
[[email protected] ~]# cat 1.txt
D 1
d 1
c 2
C 2
A 3
B 4
f 14
[[email protected] ~]# sort 1.txt
A 3
B 4
c 2
C 2
d 1
D 1
f 14
If a file has lines beginning with both upper case and lower case characters, the order in which sort places them depends on the locale in use. The -f option folds lower case to upper case for the comparison, so that case differences are ignored:
[[email protected] ~]# sort -f 1.txt
A 3
B 4
C 2
c 2
D 1
d 1
f 14
The -n option sorts the contents numerically. We can also sort a file based on its nth column with the -kn option:
[[email protected] ~]# sort -n -k2 1.txt
d 1
D 1
c 2
C 2
A 3
B 4
f 14
Use -r to reverse the result of comparisons. Other options of the sort command:
-b, --ignore-leading-blanks   ignore leading blanks
-d, --dictionary-order        consider only blanks and alphanumeric characters
-f, --ignore-case             fold lower case to upper case characters
-g, --general-numeric-sort    compare according to general numerical value
-i, --ignore-nonprinting      consider only printable characters
-M, --month-sort              compare (unknown) < 'JAN' < ... < 'DEC'
-h, --human-numeric-sort      compare human readable numbers (e.g., 2K 1G)
-n, --numeric-sort            compare according to string numerical value
-R, --random-sort             sort by random hash of keys
    --random-source=FILE      get random bytes from FILE
-r, --reverse                 reverse the result of comparisons
    --sort=WORD               sort according to WORD:
                              general-numeric -g, human-numeric -h, month -M,
                              numeric -n, random -R, version -V
-V, --version-sort            natural sort of (version) numbers within text
sort can also sort the combined contents of two files to standard output in one go: sort 1.txt 2.txt
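When the fields are not separated by blanks, the -t option names the separator. The sketch below is an illustration on a made-up colon-separated inventory; it sorts numerically on the second field only, in both directions.

```shell
# Hypothetical records of the form name:count.
inventory=$(printf '%s\n' 'ram:4' 'cpu:2' 'ssd:10' 'hdd:1')

# -t: sets the field separator; -k2,2n sorts numerically on field 2 alone.
by_count=$(printf '%s\n' "$inventory" | sort -t: -k2,2n)

# Adding r to the key reverses the order for that key.
by_count_desc=$(printf '%s\n' "$inventory" | sort -t: -k2,2nr)

printf '%s\n' "$by_count"
```

Without the n flag the counts would be compared as text, and 10 would sort before 2.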
The uniq command is used to report or omit repeated lines; it filters lines from standard input and writes the outcome to standard output. uniq only compares adjacent lines, so the input is usually sorted first.
[[email protected] ~]# cat assets.txt
motherboard
motherboard
cpu
cpu
ram
ram
ram
ram
monitor
monitor
hdd
ssd
mouse
keyboard
keyboard
[[email protected] ~]# uniq assets.txt
motherboard
cpu
ram
monitor
hdd
ssd
mouse
keyboard
Use -c to display the number of repetitions of each line:
[[email protected] ~]# uniq -c assets.txt
2 motherboard
2 cpu
4 ram
2 monitor
1 hdd
1 ssd
1 mouse
2 keyboard
-d displays only the repeated lines and, vice versa, -u shows only the unique ones:
[[email protected] ~]# uniq -d assets.txt
motherboard
cpu
ram
monitor
keyboard
[[email protected] ~]# uniq -u assets.txt
hdd
ssd
mouse
Try -D to see all duplicated lines. Other options from uniq --help:
-c, --count                   prefix lines by the number of occurrences
-d, --repeated                only print duplicate lines, one for each group
-D, --all-repeated[=METHOD]   print all duplicate lines
                              groups can be delimited with an empty line
                              METHOD={none(default),prepend,separate}
-f, --skip-fields=N           avoid comparing the first N fields
    --group[=METHOD]          show all items, separating groups with an empty line
                              METHOD={separate(default),prepend,append,both}
-i, --ignore-case             ignore differences in case when comparing
-s, --skip-chars=N            avoid comparing the first N characters
-u, --unique                  only print unique lines
-z, --zero-terminated         end lines with 0 byte, not newline
-w, --check-chars=N           compare no more than N characters in lines
    --help                    display this help and exit
    --version                 output version information and exit
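Because uniq compares each line only with the one before it, it is usually fed by sort. The following sketch shows the difference on a made-up, unsorted asset list and then builds the common "count and rank" pipeline.

```shell
# Hypothetical unsorted input: repeats exist, but never on adjacent lines.
assets=$(printf '%s\n' ram cpu ram hdd cpu ram)

# uniq alone removes nothing here, because no two neighbours are equal.
unsorted_unique=$(printf '%s\n' "$assets" | uniq | wc -l | tr -d ' ')

# After sorting, equal lines are adjacent and uniq can collapse them.
sorted_unique=$(printf '%s\n' "$assets" | sort | uniq | wc -l | tr -d ' ')

# Count each item, rank by count, and keep the name of the most frequent one.
most_common=$(printf '%s\n' "$assets" | sort | uniq -c | sort -rn | head -n 1 | awk '{print $2}')

printf 'uniq only: %s lines, sort | uniq: %s lines, most common: %s\n' \
    "$unsorted_unique" "$sorted_unique" "$most_common"
```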

split

The split command is used to split a file into pieces.
split [options] filename prefix
    Replace filename with the name of the large file you wish to split.
    Replace prefix with the name you wish to give the small output files.
    We can exclude [options], or replace it with either of the following:
    -l linenumber
    -b bytes
If we use the -l (a lowercase L) option, replace linenumber with the number of lines we'd like in each of the smaller files (the default is 1,000).
[[email protected] split]# ls
my7lines.txt
[[email protected] split]# cat my7lines.txt
this is 1st line.
this is 2nd line.
this is 3rd line.
this is 4th line
this is 5th line.
this is 6th line.
this is 7th line.

[[email protected] split]# split -l 2 my7lines.txt

[[email protected] split]# ls
my7lines.txt xaa xab xac xad

[[email protected] split]# echo "xaa";cat xaa;echo "xab";cat xab;echo "xac";cat xac;echo "xad"; cat xad
xaa
this is 1st line.
this is 2nd line.
xab
this is 3rd line.
this is 4th line
xac
this is 5th line.
this is 6th line.
xad
this is 7th line.
The split command will give each output file it creates the name prefix with an extension tacked to the end that indicates its order. By default, the split command adds aa to the first output file, proceeding through the alphabet to zz for subsequent files. If you do not specify a prefix, most systems use x.
If we use the -b option, replace bytes with the number of bytes you'd like in each of the smaller files.
[[email protected] split2]# du -ah
51M ./dsl-4.11.rc2.iso
51M .

[[email protected] split2]# split -b 10MB dsl-4.11.rc2.iso
[[email protected] split2]# ls
dsl-4.11.rc2.iso xaa xab xac xad xae xaf
[[email protected] split2]# du -ah
51M ./dsl-4.11.rc2.iso
9.6M ./xaa
9.6M ./xab
9.6M ./xac
9.6M ./xad
9.6M ./xae
2.7M ./xaf
101M .
Some other options are:
-a, --suffix-length=N           generate suffixes of length N (default 2)
    --additional-suffix=SUFFIX  append an additional SUFFIX to file names
-b, --bytes=SIZE                put SIZE bytes per output file
-C, --line-bytes=SIZE           put at most SIZE bytes of lines per output file
-d, --numeric-suffixes[=FROM]   use numeric suffixes instead of alphabetic;
                                FROM changes the start value (default 0)
-e, --elide-empty-files         do not generate empty output files with '-n'
    --filter=COMMAND            write to shell COMMAND; file name is $FILE
-l, --lines=NUMBER              put NUMBER lines per output file
-n, --number=CHUNKS             generate CHUNKS output files; see explanation below
-u, --unbuffered                immediately copy input to output with '-n r/...'
    --verbose                   print a diagnostic just before each
                                output file is opened
    --help                      display this help and exit
    --version                   output version information and exit

SIZE is an integer and optional unit (example: 10M is 10*1024*1024). Units
are K, M, G, T, P, E, Z, Y (powers of 1024) or KB, MB, ... (powers of 1000).
To join the split files back together, use cat x* > originalfile.
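The split-and-rejoin cycle can be tried safely in a scratch directory. This is a minimal sketch; the file name numbers.txt and the prefix part_ are made up for the illustration, and seq is assumed to be available to generate the sample lines.

```shell
# Work in a throwaway directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# A 7-line sample file, split into pieces of at most 2 lines each.
seq 1 7 > numbers.txt
split -l 2 numbers.txt part_

# Four pieces are expected: part_aa, part_ab, part_ac, part_ad.
pieces=$(ls part_* | wc -l | tr -d ' ')

# The shell expands part_* in alphabetical order, which is the original order.
cat part_* > rejoined.txt

# cmp is silent and returns success when the two files are identical.
if cmp numbers.txt rejoined.txt; then same=yes; else same=no; fi
printf 'pieces: %s, identical after rejoining: %s\n' "$pieces" "$same"
```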

wc

The wc (word count) command is used to find out the number of newlines, words, bytes and characters in a file.
wc [options] filenames
A basic example of the wc command:
[[email protected] ~]# wc /etc/inittab
17 80 511 /etc/inittab
The three numbers shown above are 17 (the number of lines), 80 (the number of words, delimited by white space by default) and 511 (the number of bytes) of the file.
Options:
-c, --bytes             print the byte counts
-m, --chars             print the character counts
-l, --lines             print the newline counts
    --files0-from=F     read input from the files specified by
                        NUL-terminated names in file F;
                        If F is - then read names from standard input
-L, --max-line-length   print the length of the longest line
-w, --words             print the word counts
    --help              display this help and exit
    --version           output version information and exit
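Each count can also be requested on its own, which is convenient inside scripts. The sketch below is an illustration on two made-up lines read from a pipe, so no file name appears in the output; tr only removes the padding that some wc implementations add.

```shell
# Hypothetical input: "one two" and "three", each ending in a newline.
lines=$(printf 'one two\nthree\n' | wc -l | tr -d ' ')   # newline count
words=$(printf 'one two\nthree\n' | wc -w | tr -d ' ')   # word count
bytes=$(printf 'one two\nthree\n' | wc -c | tr -d ' ')   # byte count

printf 'lines=%s words=%s bytes=%s\n' "$lines" "$words" "$bytes"
```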

head and tail commands:

head

The head command prints the first ten lines of any given file.
head [options] [file(s)]
For example, let's take a look at the /var/log/yum.log file:
[[email protected] ~]# head /var/log/yum.log
Aug 26 04:48:25 Updated: openldap-2.4.44-15.el7_5.x86_64
Aug 26 04:48:25 Installed: openldap-clients-2.4.44-15.el7_5.x86_64
Aug 26 04:48:27 Installed: openldap-servers-2.4.44-15.el7_5.x86_64
Oct 13 03:38:41 Installed: perl-Data-Dumper-2.145-3.el7.x86_64
Oct 13 03:38:41 Installed: perl-Net-Daemon-0.48-5.el7.noarch
Oct 13 03:38:41 Installed: perl-Digest-1.17-245.el7.noarch
Oct 13 03:38:41 Installed: perl-Digest-MD5-2.52-3.el7.x86_64
Oct 13 03:38:41 Installed: 7:squid-migration-script-3.5.20-12.el7.x86_64
Oct 13 03:38:41 Installed: 1:perl-Compress-Raw-Zlib-2.061-4.el7.x86_64
Oct 13 03:38:42 Installed: libecap-1.0.0-1.el7.x86_64
To retrieve a specific number of lines, use the -n NUMBER option or simply -NUMBER:
[[email protected] ~]# head -n 2 /var/log/yum.log
Aug 26 04:48:25 Updated: openldap-2.4.44-15.el7_5.x86_64
Aug 26 04:48:25 Installed: openldap-clients-2.4.44-15.el7_5.x86_64
[[email protected] ~]# head -2 /var/log/yum.log
Aug 26 04:48:25 Updated: openldap-2.4.44-15.el7_5.x86_64
Aug 26 04:48:25 Installed: openldap-clients-2.4.44-15.el7_5.x86_64
Options from head --help:
-c, --bytes=[-]K         print the first K bytes of each file;
                         with the leading '-', print all but the last
                         K bytes of each file
-n, --lines=[-]K         print the first K lines instead of the first 10;
                         with the leading '-', print all but the last
                         K lines of each file
-q, --quiet, --silent    never print headers giving file names
-v, --verbose            always print headers giving file names
    --help               display this help and exit
    --version            output version information and exit

K may have a multiplier suffix:
b 512, kB 1000, K 1024, MB 1000*1000, M 1024*1024,
GB 1000*1000*1000, G 1024*1024*1024, and so on for T, P, E, Z, Y.

tail

The tail command displays the last ten lines of any text file.
tail [options] [filenames]
Similar to the head command above, tail also supports the -n option for a number of lines and the -c option for a number of bytes.
[[email protected] ~]# tail -n 5 /var/log/yum.log
Dec 05 10:02:37 Updated: firefox-60.3.0-1.el7.centos.x86_64
Dec 05 10:02:37 Updated: nss-tools-3.36.0-7.el7_5.x86_64
Dec 05 10:02:37 Updated: 1:dbus-x11-1.10.24-7.el7.x86_64
Dec 08 11:54:14 Installed: zip-3.0-11.el7.x86_64
Dec 08 13:10:52 Installed: vsftpd-3.0.2-22.el7.x86_64
The -f option makes tail loop forever, checking for new data at the end of the file(s). When new data appears, it is printed. It works great with log files and lets us see what is going on:
[[email protected] ~]# tail -f /var/log/dmesg
[ 19.126805] AES CTR mode by8 optimization enabled
[ 19.129902] ppdev: user-space parallel port driver
[ 19.155269] alg: No test for __gcm-aes-aesni (__driver-gcm-aes-aesni)
[ 19.160414] Adding 4063228k swap on /dev/mapper/centos-swap. Priority:-1 extents:1 across:4063228k FS
[ 19.176446] alg: No test for crc32 (crc32-pclmul)
[ 19.220329] XFS (sda1): Mounting V5 Filesystem
[ 19.288818] XFS (sda1): Ending clean mount
[ 21.853831] floppy0: no floppy controllers found
[ 21.854029] work still pending
[ 22.082648] type=1305 audit(1543863215.542:3): audit_pid=681 old=0 auid=4294967295 ses=4294967295 res=1
Options:
-c, --bytes=K            output the last K bytes; or use -c +K to output
                         bytes starting with the Kth of each file
-f, --follow[={name|descriptor}]
                         output appended data as the file grows;
                         an absent option argument means 'descriptor'
-F                       same as --follow=name --retry
-n, --lines=K            output the last K lines, instead of the last 10;
                         or use -n +K to output starting with the Kth
    --max-unchanged-stats=N
                         with --follow=name, reopen a FILE which has not
                         changed size after N (default 5) iterations
                         to see if it has been unlinked or renamed
                         (this is the usual case of rotated log files);
                         with inotify, this option is rarely useful
    --pid=PID            with -f, terminate after process ID, PID dies
-q, --quiet, --silent    never output headers giving file names
    --retry              keep trying to open a file if it is inaccessible
-s, --sleep-interval=N   with -f, sleep for approximately N seconds
                         (default 1.0) between iterations;
                         with inotify and --pid=P, check process P at
                         least once every N seconds
-v, --verbose            always output headers giving file names
    --help               display this help and exit
    --version            output version information and exit
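head and tail combine naturally to cut an arbitrary range of lines out of a stream. The sketch below is an illustration using the numbers 1 to 10 from seq as stand-in input; paste -s only joins the result onto one line for display.

```shell
# First three lines.
first_three=$(seq 1 10 | head -n 3 | paste -s -d ' ' -)

# Last two lines.
last_two=$(seq 1 10 | tail -n 2 | paste -s -d ' ' -)

# Everything from line 8 onwards (note the + sign).
from_eighth=$(seq 1 10 | tail -n +8 | paste -s -d ' ' -)

# Lines 4 to 6: keep the first six, then the last three of those.
middle=$(seq 1 10 | head -n 6 | tail -n 3 | paste -s -d ' ' -)

printf '%s\n' "$first_three" "$last_two" "$from_eighth" "$middle"
```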

less

The less command lets you view the contents of a file and navigate through it.
[[email protected] ~]# dmesg |less
[ 0.000000] BIOS-e820: [mem 0x00000000bfeff000-0x00000000bfefffff] ACPI NVS
[ 0.000000] BIOS-e820: [mem 0x00000000bff00000-0x00000000bfffffff] usable
[ 0.000000] BIOS-e820: [mem 0x00000000f0000000-0x00000000f7ffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fec00000-0x00000000fec0ffff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fee00000-0x00000000fee00fff] reserved
[ 0.000000] BIOS-e820: [mem 0x00000000fffe0000-0x00000000ffffffff] reserved
[ 0.000000] BIOS-e820: [mem 0x0000000100000000-0x000000013fffffff] usable
:
By default the only way to exit less is to press the q key. To make less exit automatically at the end of the file, use the -e option (exit the second time end-of-file is reached) or the -E option (exit the first time):
less -e /var/log/auth.log
less -E /var/log/auth.log
    To open a file at the first occurrence of a pattern use the following syntax:
less +/sshd /var/log/auth.log
    To keep displaying content as it is appended to a file opened in less, press the Shift+f key combination or run less with the following syntax:
less +F /var/log/messages
This makes less run in follow (live) mode, displaying new content on the fly while it waits for new data to be written to the file. This behavior is similar to the tail -f command. To exit live mode, press Ctrl+c.
Tip: In combination with a pattern, you can watch the log file interactively with the Shift+f keystroke while matching a keyword.
less vs more
The less command is similar to more. The main difference between them is that less is faster, because it does not load the entire file at once, and that it allows navigation through the file in both directions using the Page Up/Page Down keys.
Whether you decide to use more or less, which is a personal choice, remember that less is more with more features.

cut

The cut command in UNIX is a command line utility for cutting sections from each line of files and writing the result to standard output. It can be used to cut parts of a line by byte position, character and delimiter.
cut OPTION... [FILE]...
cut by byte position:
[[email protected] ~]# echo "linux" | cut -b 1
l
[[email protected] ~]# echo "linux" | cut -b 1,5
lx
[[email protected] ~]# echo "linux" | cut -b 1-4
linu
cut by character:
[[email protected] ~]# echo '♣foobar' | cut -c 1,7
♣r
[[email protected] ~]# echo '♣foobar' | cut -c 5-7
bar
cut based on a delimiter:
To cut using a delimiter use the -d option. This is normally used in conjunction with the -f option to specify the field that should be cut. In the examples below, each line of 1.txt has the form 1:a,w:
[[email protected] ~]# cut 1.txt -d: -f1
1
2
3
4
[[email protected] ~]# cut 1.txt -d: -f2
a,w
b,x
c,y
d,z
[[email protected] ~]# cut 1.txt -d, -f1
1:a
2:b
3:c
4:d
[[email protected] ~]# cut 1.txt -d, -f2
w
x
y
z
cut has lots of options:
-b, --bytes=LIST                select only these bytes
-c, --characters=LIST           select only these characters
-d, --delimiter=DELIM           use DELIM instead of TAB for field delimiter
-f, --fields=LIST               select only these fields; also print any line
                                that contains no delimiter character, unless
                                the -s option is specified
-n                              with -b: don't split multibyte characters
    --complement                complement the set of selected bytes, characters
                                or fields
-s, --only-delimited            do not print lines not containing delimiters
    --output-delimiter=STRING   use STRING as the output delimiter
                                the default is to use the input delimiter
    --help                      display this help and exit
    --version                   output version information and exit
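Two cut commands can be chained when a line uses more than one delimiter. The sketch below is an illustration on made-up records shaped like the ones in the example above; it isolates the letter between the colon and the comma.

```shell
# Hypothetical records of the form number:letter,letter.
records=$(printf '%s\n' '1:a,w' '2:b,x' '3:c,y')

# First cut: take field 2 of the colon-separated line ("a,w").
# Second cut: take field 1 of the comma-separated remainder ("a").
letters=$(printf '%s\n' "$records" | cut -d: -f2 | cut -d, -f1 | paste -s -d ' ' -)

# Character ranges work on plain text as well.
first_two_chars=$(printf 'linux\n' | cut -c 1-2)

printf '%s\n' "$letters" "$first_two_chars"
```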

paste

The paste command displays the corresponding lines of multiple files side-by-side.
paste [OPTION]... [FILE]...
[[email protected] ~]# paste 1.txt 2.txt
a e
b f
c g
d h
paste writes lines consisting of the sequentially corresponding lines from each FILE, separated by tabs. To use a colon (:) as the delimiting character instead of a tab, use the -d option:
[[email protected] ~]# paste -d: 1.txt 2.txt
a:e
b:f
c:g
d:h
paste command options:
  -d, --delimiters=LIST   reuse characters from LIST instead of TABs
  -s, --serial            paste one file at a time instead of in parallel
      --help     display this help and exit
      --version  output version information and exit
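The -s option and a multi-character delimiter LIST are worth seeing in action. This is a small sketch using temporary files with made-up contents; the file names are hypothetical.

```shell
# Create three small hypothetical input files in a temporary directory.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/left.txt"
printf 'e\nf\n' > "$tmp/middle.txt"
printf 'x\ny\n' > "$tmp/right.txt"

# -s joins all lines of one file into a single output line.
serial=$(paste -s -d, "$tmp/left.txt")

# With several files, -s produces one output line per file (tab-separated).
serial_two=$(paste -s "$tmp/left.txt" "$tmp/middle.txt")

# A delimiter LIST is used in turn: ':' after column 1, ',' after column 2.
cycled=$(paste -d ':,' "$tmp/left.txt" "$tmp/middle.txt" "$tmp/right.txt")

printf '%s\n' "$serial" "$serial_two" "$cycled"
```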

join

join combines the lines of two files that share a common field of data.
Both input files must be sorted on the join field; otherwise join may warn that an input is not in sorted order and can miss matching lines.
join [OPTION]... FILE1 FILE2

[[email protected] ~]# join 1.txt 2.txt
1 a w
4 d z
5 e q
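Because join expects both inputs to be sorted on the join field, unsorted files are usually passed through sort first. The sketch below shows one way to do that; the file names and contents are made up, and LC_ALL=C is set as an assumption so that sort and join agree on the ordering.

```shell
# Use the same collation for sort and join.
export LC_ALL=C

# Two hypothetical, deliberately unsorted input files.
tmp=$(mktemp -d)
printf '4 d\n1 a\n2 b\n' > "$tmp/left.txt"
printf '3 y\n4 z\n1 w\n' > "$tmp/right.txt"

# Sort each file on its first field, which is the default join field.
sort -k1,1 "$tmp/left.txt" > "$tmp/left.sorted"
sort -k1,1 "$tmp/right.txt" > "$tmp/right.sorted"

# Only keys present in both files (1 and 4) are printed.
joined=$(join "$tmp/left.sorted" "$tmp/right.sorted")
printf '%s\n' "$joined"
```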
By default, join prints only pairable lines; unpairable lines are left out of the output. To include them, use the -a command line option, which takes a file number (1 or 2) so that join knows which file's unpairable lines to print.
[[email protected] ~]# join 1.txt 2.txt -a 1
1 a w
2 b
4 d z
5 e q
[[email protected] ~]# join 1.txt 2.txt -a 2
1 a w
3 y
4 d z
5 e q
To print only the unpaired lines (that is, to suppress the paired lines in the output), use the -v command line option. It takes a file number in the same way as -a.
[[email protected] ~]# join 1.txt 2.txt -v 1
2 b
[[email protected] ~]# join 1.txt 2.txt -v 2
3 y
join combines lines of files on a common field, which is the first field by default. We can specify a different field for each file with the -1 and -2 command line options. For example, join -1 2 -2 2 file1 file2 joins on the second field of each file.
join command options:
  -a FILENUM        also print unpairable lines from file FILENUM, where
                      FILENUM is 1 or 2, corresponding to FILE1 or FILE2
  -e EMPTY          replace missing input fields with EMPTY
  -i, --ignore-case  ignore differences in case when comparing fields
  -j FIELD          equivalent to '-1 FIELD -2 FIELD'
  -o FORMAT         obey FORMAT while constructing output line
  -t CHAR           use CHAR as input and output field separator
  -v FILENUM        like -a FILENUM, but suppress joined output lines
  -1 FIELD          join on this FIELD of file 1
  -2 FIELD          join on this FIELD of file 2
  --check-order     check that the input is correctly sorted, even
                      if all input lines are pairable
  --nocheck-order   do not check that the input is correctly sorted
  --header          treat the first line in each file as field headers,
                      print them without trying to pair them
  -z, --zero-terminated     end lines with 0 byte, not newline
      --help     display this help and exit
      --version  output version information and exit
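Several of these options are typically combined. The sketch below assumes GNU join and uses two made-up, colon-separated files in which the join field sits in a different column of each file. It first joins on that field with -t, -1, and -2, then prints unpairable lines from both files with -a, filling the gaps with -e and -o.

```shell
# Use a fixed collation so the inputs count as sorted.
export LC_ALL=C

# Hypothetical inputs, both already sorted on their id field.
tmp=$(mktemp -d)
printf 'a:1\nb:2\n' > "$tmp/names.txt"   # layout: name:id
printf '1:w\n3:y\n' > "$tmp/codes.txt"   # layout: id:code

# Join field 2 of the first file with field 1 of the second file.
by_id=$(join -t: -1 2 -2 1 "$tmp/names.txt" "$tmp/codes.txt")

# Keep unpairable lines from both files; -o fixes the output layout
# (0 is the join field) and -e supplies a placeholder for missing fields.
outer=$(join -t: -1 2 -2 1 -a 1 -a 2 -e NONE -o 0,1.1,2.2 \
        "$tmp/names.txt" "$tmp/codes.txt")

printf '%s\n' "$by_id" "$outer"
```

The output starts with the join field, followed by the remaining fields of the first file and then those of the second.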

sed

The name sed is short for stream editor. A stream editor performs basic text transformations on an input stream (a file or input from a pipeline). sed uses regular expressions, and its most basic (and popular) use is substituting text.
As an example, let's replace 'l' with 'L' in a sample text file:
[[email protected] ~]# cat sample.txt
there are different operating systems in our planet.
one of them is linux.
almost six hundred linux distributions exist.

[[email protected] ~]# sed 's/l/L/' sample.txt
there are different operating systems in our pLanet.
one of them is Linux.
aLmost six hundred linux distributions exist.
By default, sed replaces only the first match on each line. Append the g flag to the s command to replace every match on every line of the file.
[[email protected] ~]# sed 's/l/L/g' sample.txt
there are different operating systems in our pLanet.
one of them is Linux.
aLmost six hundred Linux distributions exist.
Additionally, we can use gi instead of g to ignore character case (the i flag is a GNU sed extension):
[[email protected] ~]# sed 's/linux/LINUX/gi' sample.txt
there are different operating systems in our planet.
one of them is LINUX.
almost six hundred LINUX distributions exist.
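The flags that follow the s command decide which matches are replaced. As a rough sketch, the block below compares no flag, a numeric flag, g, and the case-insensitive variant on single lines of text. The numeric flag is standard sed; the I flag is assumed to be available, as in GNU sed. The second sample string is made up.

```shell
# One line taken from sample.txt.
sample='almost six hundred linux distributions exist.'

first=$(printf '%s\n' "$sample" | sed 's/l/L/')    # first match only
second=$(printf '%s\n' "$sample" | sed 's/l/L/2')  # second match only
every=$(printf '%s\n' "$sample" | sed 's/l/L/g')   # all matches

# Case-insensitive matching (GNU extension) on a hypothetical line.
nocase=$(printf 'Linux LINUX linux\n' | sed 's/linux/X/gI')

printf '%s\n' "$first" "$second" "$every" "$nocase"
```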
Another example is replacing spaces with tabs:
[[email protected] ~]# cat sample.txt
there are different operating systems in our planet.
one of them is linux.
almost six hundred linux distributions exist.

[[email protected] ~]# sed 's/ /\t/g' sample.txt | cat
there	are	different	operating	systems	in	our	planet.
one	of	them	is	linux.
almost	six	hundred	linux	distributions	exist.
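Tabs and spaces can look alike on screen, so it helps to make the substitution visible. The following sketch assumes GNU sed (which understands \t in the replacement) and GNU cat, whose -A option shows each tab as ^I and marks each line end with $.

```shell
# Replace every space with a tab in one line taken from sample.txt.
tabbed=$(printf 'one of them is linux.\n' | sed 's/ /\t/g')

# Make the otherwise invisible tabs visible.
visible=$(printf '%s\n' "$tabbed" | cat -A)

printf '%s\n' "$visible"
```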
sed is extremely powerful, and the tasks it can accomplish are limited only by your imagination.