103.3. Perform basic file management
Weight: 4
Description: Candidates should be able to use the basic Linux commands to manage files and directories.
Key Knowledge Areas:
Copy, move and remove files and directories individually
Copy multiple files and directories recursively
Remove files and directories recursively
Use simple and advanced wildcard specifications in commands
Using find to locate and act on files based on type, size, or time
Usage of tar, cpio and dd
Terms and Utilities:
cp
find
mkdir
mv
ls
rm
rmdir
touch
tar
cpio
dd
file
gzip
gunzip
bzip2
xz
file globbing
As we said, Linux is a world of processes and files. In this section we start talking about file management in Linux, covering both files and directories (folders). The tools are pretty basic, and their names are abbreviations of what they do.
ls
ls lists files and directories, and their associated metadata, such as file size, ownership, and modification time.
We can use absolute paths (e.g. ls /home/user1/Music) or relative paths (e.g. ls Music/) with ls.
With no options, ls lists the files contained in the current directory, sorting them alphabetically.
The -1 option prints each entry on its own line.
The default output of the ls command shows only the names of the files, which is not very informative. So let's use ls with -l for the long listing format.
The first character represents the file type: "-" for a regular file, "d" for a directory, "l" for a symbolic link (we will see them later).
When the long listing format is used the ls command will display the following file information:
The file type
The file permissions
Number of hard links to the file
File owner
File group
File size
Date and Time
File name
Show Hidden Files
By default, the ls command does not show hidden files. In Linux, a hidden file is any file whose name begins with a dot (.). To display all files, including the hidden ones, use the -a option.
Some other useful options:
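As a rough sketch of the options discussed so far, plus a few other common ones (-h, -t, -R), the following can be run in a scratch directory. The file and directory names are made up for illustration, and GNU ls is assumed.

```shell
# Work in a throwaway directory so nothing real is touched.
demo=$(mktemp -d)
cd "$demo"
touch notes.txt .hidden      # hypothetical file names
mkdir Music
touch Music/song.mp3

ls          # names only; .hidden is not shown
ls -1       # one entry per line
ls -l       # long format: type/permissions, links, owner, group, size, time, name
ls -a       # also lists .hidden, plus the . and .. entries
ls -lh      # long format with human-readable sizes
ls -lt      # long format, newest first
ls -R       # recurse into Music/
```

Single-letter options can be combined, so ls -la gives a long listing that includes hidden files.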
Copying, Moving & Deleting
To make it easier, let's classify them into some groups:
Before going into how to use the touch command, let’s start by reviewing the file timestamps in Linux.
Linux Files Timestamps
In Linux, every file has timestamps that record its last access time, last modification time, and last change time. Whenever we create a new file, or access or modify an existing one, its timestamps are updated automatically.
A file in Linux has three timestamps:
atime (access time) - The last time the file was accessed/opened by some command or application such as cat, vim or grep.
mtime (modify time) - The last time the file’s content was modified.
ctime (change time) - The last time the file's attributes (metadata) or content were changed. The attributes include file permissions, file ownership, and file location (name).
To display the file status, including the timestamps, you can use the stat command (we use Ubuntu 16.04 here).
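A small sketch of inspecting the three timestamps with stat follows. It assumes GNU stat (for the -c format option); sample.txt is a hypothetical file created just for the demonstration.

```shell
demo=$(mktemp -d)
cd "$demo"
echo "hello" > sample.txt        # hypothetical file

stat sample.txt                  # full status, including Access/Modify/Change

# GNU stat format sequences: %x = atime, %y = mtime, %z = ctime
stat -c 'atime: %x' sample.txt
stat -c 'mtime: %y' sample.txt
stat -c 'ctime: %z' sample.txt

# A metadata-only change (here: permissions) updates ctime but leaves mtime alone.
mtime_before=$(stat -c %Y sample.txt)
chmod 600 sample.txt
mtime_after=$(stat -c %Y sample.txt)
echo "mtime before: $mtime_before, after: $mtime_after"
```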
touch
The touch command is a standard UNIX/Linux command used to create files and to change their timestamps.
With no options, touch creates an empty file if the file does not exist.
If the file exists, touch updates its timestamps to the current time.
creating an empty file:
A file created with touch has no content. This is useful when there is no data to store yet at the time the file is created.
example:
creating multiple empty files:
for example:
avoid creating new files:
The -c option tells touch not to create the file if it does not exist:
touch command timestamp options:
-a : change the access time only
-d : set the access and modification times from a date string
-m : change the modification time only
-r : use the access and modification times of another file as reference
-t : set the access and modification times from a timestamp in [[CC]YY]MMDDhhmm[.ss] format
Let's do some timestamp modifications.
Create a file using a specified time: touch -t YYMMDDhhmm fileName
Change the access time: touch -a fileName
It only updates the last access time.
Change the modification time: touch -m fileName
It only updates the last modification time.
Explicitly set the access and modification times: touch -c -t YYMMDDhhmm fileName
Update the access and modification times from a date string: touch -c -d "DATE_STRING" fileName
Use the timestamps of another file with the -r option: touch -r reference_file_name file_name
There are many more possibilities; see man touch.
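The touch variants above can be tried out with something like the following sketch. It assumes GNU touch, stat and date; all file names are hypothetical.

```shell
demo=$(mktemp -d)
cd "$demo"

touch file1                        # create one empty file
touch a.txt b.txt c.txt            # create several empty files at once
touch -c missing.txt               # -c: do not create missing.txt

touch -t 202001011230 file1        # CCYYMMDDhhmm: 2020-01-01 12:30 local time
touch -d '2021-06-15 08:00' file2  # date string (GNU); file2 is created
touch -r file1 file3               # file3 gets the timestamps of file1

# Show the resulting modification times.
stat -c '%y  %n' file1 file2 file3
```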
mkdir
The mkdir command allows the user to create directories (also referred to as folders in some operating systems).
This command can create multiple directories at once and can set the permissions for the directories. Note that the user executing this command must have permission to create a directory in the parent directory, or they may receive a 'permission denied' error.
Let's try some of the most useful switches. -p creates parent directories as necessary. If the directories already exist, no error is reported:
We can create a directory and set its permissions (discussed later) at the same time using the -m option:
The syntax of the mode is the same as for the chmod command (discussed later).
With the -v option, mkdir prints a message for each created directory.
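A minimal sketch of these mkdir switches, using hypothetical directory names and assuming GNU coreutils:

```shell
demo=$(mktemp -d)
cd "$demo"

mkdir dir1                        # a single directory
mkdir dir2 dir3                   # several directories at once
mkdir -p parent/child/grandchild  # create missing parents as needed
mkdir -p parent/child             # already exists: no error with -p
mkdir -m 700 private              # create with mode rwx------
mkdir -v verbose-dir              # report each directory created

ls -ld private parent/child/grandchild
```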
cp
The cp command is a command-line utility for copying files and directories. A copy is independent of the original file (unlike with the mv command, where the original no longer exists under its old name). The basic syntax of the cp command is:
cp can take one or more sources but just one destination.
It supports copying one or more files or directories with options for taking backups and preserving attributes.
Do not forget to consider source and destination types (files or directory) when using cp command.
If the target is an existing directory, then all sources are copied into the target.
If the target is a directory that does not exist, then the (single) source must also be a directory and a copy of the source directory and its contents is made with the target name as the new name.
If the target is a file, then the (single) source must also be a file and a copy of the source file is made with the target name as the new name, replacing any existing file of the same name.
Let's do some examples:
copy a file:
take a backup when copying a file
If a copy operation would overwrite a file, the -b flag can be used to back up the file being overwritten. cp copies the new file into place and keeps the old version as a backup file (by default with a ~ suffix).
To specify the suffix of the backup file, use the -S option. For example, cp -b -S .bak file1 file2 keeps the old file2 as file2.bak.
copy multiple files (into a directory)
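The cp cases described so far might look like this in practice. This is only a sketch with hypothetical file names; it assumes GNU cp and that the VERSION_CONTROL and SIMPLE_BACKUP_SUFFIX environment variables are not set.

```shell
demo=$(mktemp -d)
cd "$demo"
echo one > file1
echo two > file2

cp file1 copy1              # copy a file under a new name
cp -b file1 file2           # overwrite file2; its old content is kept as file2~
cp -b -S .bak file1 copy1   # overwrite copy1; the old version is kept as copy1.bak

mkdir dest
cp file1 file2 dest/        # several sources, one destination directory

ls -1 . dest
```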
Copying files recursively
We can use cp to copy entire directory structures from one place to another using the -R option to perform a recursive copy.
Let's say you are the user root and you have a directory, /root/test-space, which contains many files and subdirectories. You want to copy all those files, and all the subdirectories (and the files and subdirectories they contain), to a new location, /root/files-backup. You can copy all of them using the command:
cp -R ~/test-space ~/files-backup
If the directory files-backup already exists, the directory test-space will be placed inside it.
If files-backup does not already exist, it will be created and the contents of the test-space directory will be placed inside it.
copy a directory
By default the cp command will not copy directories. Attempting to copy a directory without -R results in an error.
To copy a directory, pass the -R flag. This recursively copies the directory and everything in it.
copy multiple directories
To copy multiple directories pass the path of the directories to be copied followed by the destination directory.
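The two outcomes of a recursive copy (destination missing versus destination already present) can be seen in this sketch, which reuses the test-space and files-backup names from above inside a scratch directory. The other names are hypothetical.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p test-space/sub
echo data > test-space/sub/file.txt

cp -R test-space files-backup   # files-backup is missing: it becomes a copy of test-space
cp -R test-space files-backup   # files-backup exists now: test-space is placed inside it

mkdir dirA dirB target
cp -R dirA dirB target/         # several directories into one destination

find files-backup target        # show what was created
```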
Now let's take a look at some cp command options:
The cp command has lots of options. Try man cp for more information.
mv
mv is used to move or rename one or more files or directories. In general, the names we can use follow the same rules as for copying with cp. If you move a file within the same file system, its inode does not change.
Rename a file (or directory) from source to destination:
Move source file(s) or directory(s) to a directory named destination:
Same as the previous syntax, but specifying the directory first, and the source file(s) (or directory(s)) last:
Let's try the mv command:
Renaming a file or a directory:
Moving files into a directory:
Moving a directory into another:
We could use mv -t dirA dir3 as well. The -t (--target-directory) option moves all sources into the given directory.
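A short sketch of these mv forms, with hypothetical names (dirA and dir3 are taken from the text above). It assumes GNU mv and stat, and also checks that a rename within one file system keeps the inode number.

```shell
demo=$(mktemp -d)
cd "$demo"
touch file1 file2 file3
mkdir dirA dir3 dir4

inode_before=$(stat -c %i file1)
mv file1 renamed                 # rename a file
inode_after=$(stat -c %i renamed)
echo "inode before: $inode_before, after: $inode_after"

mv file2 file3 dirA/             # move files into a directory
mv dir3 dirA/                    # move a directory into another
mv -t dirA dir4                  # same idea, target directory given first

ls -1 . dirA
```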
Useful mv command options:
cp vs mv: by default, both cp and mv replace an existing destination file without asking, although many distributions alias them to cp -i and mv -i so that they prompt first. Use -i to be asked before overwriting, -n to never overwrite an existing file, and -f to overwrite without prompting.
rm
rm stands for remove. rm removes files or directories.
Let's try:
Removing Files
File names starting with a dash: how do we remove them?
To remove a file whose name begins with a dash ("-"), you can specify a double dash ("--") separately before the file name. The double dash marks the end of the options, so that rm does not misinterpret the file name as an option: rm -- -file.txt
Or we can delete it by referring to it with a pathname: rm /home/hope/-file.txt
Some other options of the rm command:
Removing directories
By default, rm does not remove directories.
If the specified directory is empty, it can be removed with the -d (--dir) option.
What if the directory contains some files? With the -r (-R, --recursive) option, rm removes the directory together with all of its contents, so use it with care!
rm -d lets us remove a directory without specifying -r/-R/--recursive, provided that the directory is empty. In other words, rm -d is equivalent to using rmdir.
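The rm cases above can be exercised safely in a scratch directory. The following is a sketch with hypothetical names, assuming GNU rm (for -d).

```shell
demo=$(mktemp -d)
cd "$demo"
touch file1
touch -- -file.txt       # a file whose name starts with a dash
mkdir empty full
touch full/a full/b

rm file1                 # remove a regular file
rm -- -file.txt          # "--" ends option parsing; rm ./-file.txt also works
rm -d empty              # remove an empty directory
rm -r full               # remove a directory and everything inside it

ls -A                    # prints nothing: the scratch directory is empty again
```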
rmdir
Removing directories using the rmdir command is the opposite of creating them. We can remove a directory with rmdir only if it is empty, as there is no option to force removal.
Again, there is a -p option to remove parents as well. Let's try it:
Normally, when rmdir is instructed to remove a non-empty directory, it reports an error. The --ignore-fail-on-non-empty option suppresses those error messages (the directory is still not removed).
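A minimal sketch of rmdir behaviour, assuming GNU rmdir and hypothetical directory names:

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p a/b/c keep/sub
touch keep/sub/file

rmdir -p a/b/c       # removes c, then b, then a, since each becomes empty

# keep/sub is not empty, so plain rmdir fails with an error message.
rmdir keep/sub || echo "rmdir refused: directory not empty"

# Same attempt, but the failure is ignored silently; keep/sub still exists.
rmdir --ignore-fail-on-non-empty keep/sub

ls -R
```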
Handling multiple files and directories
Sometimes we need to work on more than one file. Let's review some recursive commands.
Recursive manipulation
Recursive listing
The ls command has a -R (note uppercase "R") option for listing a directory and all its subdirectories. The recursive option applies only to directory names; it does not search a directory tree for all files with a given name.
Recursive copy
We can use the -r (or -R or --recursive) option to cause the cp command to descend into source directories and copy the contents recursively. To prevent infinite recursion, cp refuses to copy a directory into itself.
Recursive deletion
As mentioned earlier, rmdir only removes empty directories. We can use the -r (or -R or --recursive) option to cause the rm command to remove both files and directories.
Wildcards and Globbing
File globbing is a feature provided by the UNIX/Linux shell to represent multiple filenames by using special characters called wildcards with a single file name. A wildcard is essentially a symbol which may be used to substitute for one or more characters. Therefore, we can use wildcards for generating the appropriate combination of file names as per our requirement.
? matches any single character. We can use '?' several times to match several characters.
* matches zero or more characters. It is useful when we know only part of a file name.
[ ] matches a single character from the set or range given inside the brackets. Some of the most commonly used range declarations are:
[A-Z] : all uppercase letters
[a-z] : all lowercase letters
[a-zA-Z0-9] : all uppercase letters, lowercase letters, and digits
The - character between two others represents a range that includes the two other characters and all characters between them in the collating sequence.
The ! character, as the first character inside the brackets, means NOT, so the pattern matches any single character except the ones listed.
{ } can be used to match file names against more than one pattern. The patterns are separated by ',' inside the curly brackets, without any spaces. Strictly speaking this is brace expansion, which the shell performs before globbing.
rm {*.doc,*.docx} : deletes all files whose extensions are 'doc' or 'docx'.
Finally, \ is used as an escape character to protect the special character that follows it. Example: \\ matches a literal backslash.
We can disable globbing using the set -f command (and enable it again with set +f).
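To see what each wildcard expands to, echo is a harmless stand-in for commands like rm. This sketch uses hypothetical file names; note that brace expansion is a feature of bash and similar shells and is not performed by every /bin/sh.

```shell
demo=$(mktemp -d)
cd "$demo"
touch file1 file2 file10 fileA notes.doc notes.docx readme.txt

echo file?          # exactly one character after "file": file1 file2 fileA
echo file*          # anything after "file", including nothing
echo file[0-9]      # one digit: file1 file2
echo file[!0-9]     # one character that is not a digit: fileA
echo *.{doc,docx}   # brace expansion (bash): notes.doc notes.docx

set -f              # disable globbing
echo *              # now prints a literal *
set +f              # enable globbing again
```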
Wildcard patterns vs Regular Expressions
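Wildcard patterns and regular expression patterns share some characteristics, but they are not the same. For example, * in a wildcard pattern matches any string, while in a regular expression it means "zero or more of the preceding item". Pay careful attention.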
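One way to see the difference is to apply a similar-looking pattern as a glob and as a regular expression. The sketch below uses three hypothetical file names and grep for the regular expression side.

```shell
demo=$(mktemp -d)
cd "$demo"
touch fil file1 file22

# Glob: "file" followed by anything. Matches file1 and file22.
ls file*

# Regular expression: "fil" followed by zero or more "e" characters,
# anywhere in the line. This also matches the name "fil".
ls | grep 'file*'

# A regular expression that behaves like the glob file*
ls | grep '^file.*'
```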
Now that we’ve covered the file and directory topic with the big recursive hammer that hits everything, and the globbing hammer that hits more selectively, let’s look at the find command, which can be more like a surgeon’s knife.
Finding Files
The find command is used to find files in one or more directory trees, based on criteria such as name, time stamp, or size.
find
The find command searches for files or directories using all or part of the name, or by other search criteria, such as size, type, file owner, modification date, or last access date.
The general form is: find [options] starting/path expression
The starting/path argument defines the top-level directory where find begins searching.
The options argument controls the behavior and optimization method of the find process.
The expression argument controls the tests that are applied to the directory hierarchy to produce output.
The most basic find is a search by name or part of a name:
The -name test searches for files based on their name; -iname does the same but is case-insensitive. A search by name matches directories as well as files.
Finding hidden files: if you want to find a file or directory whose name begins with a dot, such as .bashrc or the current directory (.), then you must specify a leading dot as part of the pattern. Otherwise, name searches ignore these files or directories. find . -name ".*"
Note: if you want to combine tests, you can use the -and and -or operators. -and is assumed if omitted. find . -name file1 -or -name file9
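The name-based searches can be tried on a small hypothetical tree. This sketch assumes GNU find (for -iname and -or) and a case-sensitive file system.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p proj/Docs
touch proj/file1 proj/File1 proj/file9 proj/.bashrc

find proj -name 'file1'                  # exact, case-sensitive name
find proj -iname 'file1'                 # case-insensitive: file1 and File1
find proj -name 'file*'                  # quote the pattern so the shell leaves it alone
find proj -name '.*'                     # hidden entries: proj/.bashrc
find proj -name file1 -or -name file9    # either name
```

Quoting the pattern matters: unquoted, the shell may expand file* itself before find ever sees it.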
Finding files by type
We can specify the type of file we want to find with the -type test. It works like this:
-type f : searches for regular files
-type d : searches for directories
-type l : searches for symbolic links
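A quick sketch of the three -type values on a hypothetical tree containing one regular file, one subdirectory and one symbolic link:

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p tree/subdir
touch tree/regular.txt
ln -s regular.txt tree/link      # symbolic link pointing at regular.txt

find tree -type f                # tree/regular.txt
find tree -type d                # tree and tree/subdir
find tree -type l                # tree/link
```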
Finding files by size
We can also search by file size, either for a specific size (n) or for files that are either larger (+n) or smaller than a given value (-n). By using both upper and lower size bounds, we can find files whose size is within a given range. By default, the -size option of find assumes a unit of ‘b’ for 512-byte blocks.
-size [+|-]n[b|c|w|k|M|G]
As an example, let's find files smaller than 1 kilobyte:
We can find all empty files using find . -size 0b or find . -empty (the latter also matches empty directories).
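One subtlety is worth a sketch: GNU find rounds file sizes up to whole units before comparing, so -size -1k only matches empty files, and a byte count such as -size -1024c is needed to find everything under 1 KiB. The file names below are hypothetical, and GNU find and head are assumed.

```shell
demo=$(mktemp -d)
cd "$demo"
: > empty.txt                         # 0 bytes
printf 'x' > tiny.txt                 # 1 byte
head -c 2048 /dev/zero > two-k.bin    # 2048 bytes

find . -type f -size -1024c    # fewer than 1024 bytes: empty.txt and tiny.txt
find . -type f -size -1k       # sizes are rounded up to 1k units: only empty.txt
find . -type f -size +1k       # more than one 1k unit: two-k.bin
find . -type f -empty          # empty regular files: empty.txt
```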
Finding files based on their time:
Linux stores time data about access times, modification times, and change times.
Access Time: Last time a file was read.
Modification Time: Last time the contents of the file were modified.
Change Time: Last time the file’s inode meta-data was changed.
We can use the time stamps described with the touch command to locate files having particular time stamps.
Again, the + and - signs can be used to specify a range.
Find files changed in the last 2 hours:
Note: adding the -daystart option to -mtime or -atime means that we want to consider days as calendar days, starting at midnight. So to list the regular files in your home directory that were modified yesterday, we can use find ~/ -daystart -type f -mtime 1.
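Time-based tests can be tried by backdating a file with touch. The sketch below assumes GNU find and GNU touch (for the relative date string) and uses hypothetical file names. It uses the modification time (-mmin, -mtime); -cmin and -ctime work the same way on the change time.

```shell
demo=$(mktemp -d)
cd "$demo"
touch recent.txt
touch -d '3 days ago' old.txt     # backdate the modification time

find . -type f -mmin -120         # modified less than 120 minutes ago: recent.txt
find . -type f -mtime +1          # age in whole days is more than 1: old.txt
find . -type f -mtime 0           # modified within the last 24 hours: recent.txt
find . -type f -newer old.txt     # modified more recently than old.txt: recent.txt
```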
We can also find files by owner and permissions, and limit the results by depth (discussed in a later section, 104.7).
Acting on files with two other switches:
As you can see, the find command has tons of options. See the man page for more details.
Executing and Combining Find Commands (-exec)
We can execute an arbitrary helper command on everything that find matches by using the -exec action, followed by the command to run.
The {} is used as a placeholder for each file that find matches. The \; is used so that find knows where the command ends.
For instance, this will remove all empty files:
We could remove all empty files in this directory and its subdirectories:
We could then change the directory permissions like this:
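A sketch of -exec on a hypothetical tree follows. The 750 mode is an arbitrary choice for illustration, and the + terminator in the last command is a standard variant that passes many file names to one invocation of the helper command.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p tree/sub
: > tree/empty1
: > tree/sub/empty2
echo data > tree/keep.txt

# Remove all empty regular files in tree and its subdirectories.
find tree -type f -empty -exec rm {} \;

# Change the permissions of every directory in the tree.
find tree -type d -exec chmod 750 {} \;

# With "+", ls is run once with all matching names appended.
find tree -type f -exec ls -l {} +
```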
Identifying files
File names often have a suffix such as gif, jpeg, or html that gives a hint of what the file might contain. Linux does not require such suffixes and generally does not use them to identify a file type. Knowing what type of file you are dealing with helps you know what program to use to display or manipulate it.
file
The file command tells us something about the type of data in one or more files.
-b, --brief : display just the file type, without the file name.
-z : try to look inside compressed files.
-i : show the MIME type of the file.
A MIME type is a label used to identify a type of data. It is used so software can know how to handle the data. It serves the same purpose on the Internet that file extensions do on Microsoft Windows.
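The options above can be tried on a couple of small files. This sketch assumes the file and gzip utilities are installed; the file names are hypothetical, and the exact wording of the output varies between versions of file.

```shell
demo=$(mktemp -d)
cd "$demo"
echo 'hello world' > plain.txt
printf '#!/bin/sh\necho hi\n' > script.sh
gzip -c plain.txt > plain.txt.gz     # compressed copy, original kept

file plain.txt script.sh plain.txt.gz   # one line per file: "name: description"
file -b plain.txt                       # description only, no file name
file -i plain.txt                       # MIME type and character set
file -z plain.txt.gz                    # also describes the data inside the gzip file
```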
Archiving and Compressing files
Archiving and compressing are two different things:
tar only archives files and does not do any compression by default.
zip does both archiving and compression.
gzip and bzip2 are used just for compression.
zip
Zip is one of the most popular archive file formats out there. With zip, you can compress multiple files into one file. This not only saves disk space, it also saves network bandwidth. This is why you'll encounter zip files almost all the time.
Let's take a look:
Use the -r option with the zip command to recursively zip the files in a directory. This option zips all the files present in the specified directory and its subdirectories:
unzip
A separate companion program, unzip, unpacks and uncompresses zip archives.
Compressing files
gzip
Gzip (GNU zip) is a compression tool used to reduce file size.
By default, gzip replaces the original file with a compressed file ending with the extension .gz. gzip keeps the original file mode, ownership, and timestamp.
gunzip
To decompress a file we can use the gunzip command, which restores the original file.
We can also use the -d option of the gzip command to decompress a file, e.g. gzip -d mydoc.gz
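A minimal round trip with gzip and gunzip, using the mydoc name from the text on a generated file. Writing to standard output with -c is one way to keep the original file.

```shell
demo=$(mktemp -d)
cd "$demo"
seq 1 1000 > mydoc          # some sample data

gzip mydoc                  # mydoc is replaced by mydoc.gz
ls
gunzip mydoc.gz             # mydoc is back, mydoc.gz is gone

gzip mydoc                  # compress again
gzip -d mydoc.gz            # -d does the same as gunzip

gzip -c mydoc > mydoc.gz    # write to stdout instead: the original is kept
ls -l mydoc mydoc.gz
```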
bzip2
Like gzip, the bzip2 command is used to compress and decompress files. It uses a different compression algorithm that usually compresses better than gzip, but it is slower and uses more memory.
bunzip2
bunzip2 is used for decompressing bzip2 files:
The -d option of bzip2 can also be used for decompression.
bzip2 has no option for compressing a directory, so use tar together with it. How? Read the tar section.
xz
xz is a newer general-purpose, command-line data compression utility, similar to gzip and bzip2. It can be used to compress or decompress a file according to the selected operation mode. It supports various formats to compress or decompress files.
Selecting a compression utility depends mainly on two factors: the compression speed and the compression ratio of a given tool. Compared with gzip and bzip2, xz typically offers the best compression ratio, at the cost of slower compression.
Running xz file compresses the file (xz -z file does the same explicitly). -d is used for decompression:
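A short xz round trip might look like the following sketch. It assumes the xz utility is installed; data.txt is a hypothetical file. The -k option keeps the input file.

```shell
demo=$(mktemp -d)
cd "$demo"
seq 1 1000 > data.txt

xz data.txt            # data.txt is replaced by data.txt.xz
xz -d data.txt.xz      # decompress: data.txt is back

xz -z -k data.txt      # -z: compress (the default mode), -k: keep data.txt
xz -l data.txt.xz      # list compressed and uncompressed sizes
```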
Archiving files
What is an Archive file?
An Archive file is a file that is composed of one or more files along with metadata. Archive files are used to collect multiple data files together into a single file for easier portability and storage, or simply to compress files to use less storage space.
tar
tar stands for tape archive. It is used to create and extract archive files, and it is one of the most important commands for archiving in Linux.
Here a few common use cases of the tar command:
Backup of Servers and Desktops.
Document archiving.
Software Distribution.
Let's start:
Create tar Archive File
We can include more than one directory, and it is also possible to exclude paths with the --exclude option.
Note: by default tar extracts files into the current directory, which can cause problems (such as overwriting existing files). To avoid that, use tar -xvf backupfile -C /restoreDir. -C means change to the given directory before extracting the backup.
Untar tar Archive File
We can use Linux tar command to create compressed or uncompressed Archive files and also maintain and modify them.
To extract a gzip-compressed archive use tar -xzvf file.tar.gz, and use tar -xjvf file.tar.bz2 for bzip2-compressed archives.
Note: the -r option cannot append files to a compressed archive.
We usually combine tar options to get what we want:
Use tar -cJf file.tar.xz followed by the files to archive to create an xz-compressed archive, and tar -xJf file.tar.xz to extract it.
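The common tar operations can be sketched on a small hypothetical project directory. GNU tar is assumed, and only gzip compression (-z) is used here so that the example does not depend on bzip2 or xz being installed.

```shell
demo=$(mktemp -d)
cd "$demo"
mkdir -p project/src project/tmp restore
echo code > project/src/main.c
echo junk > project/tmp/cache
echo extra > extra.txt

tar -cvf backup.tar project           # c: create, v: verbose, f: archive file name
tar -tvf backup.tar                   # t: list the contents
tar -rvf backup.tar extra.txt         # r: append (uncompressed archives only)

# Create a gzip-compressed archive, leaving out project/tmp.
tar -czvf backup.tar.gz --exclude='project/tmp' project

# Extract into another directory instead of the current one.
tar -xzvf backup.tar.gz -C restore
```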
dd
dd is often said to stand for "convert and copy"; the story goes that it is not called cc because that name was already used by the C compiler. Many people jokingly call it Disk Destroyer, because dd does not care about file systems at all and works directly on the raw data of block devices!
The command line syntax of dd differs from many other Unix programs, in that it uses the syntax option=value for its command line options, rather than the more standard -option value or --option=value formats. By default, dd reads from stdin and writes to stdout, but these can be changed by using the if (input file) and of (output file) options.
dd reads the input file (e.g. /dev/sda) and writes it to the output file we specify (e.g. /dev/sdb). bs is the block size, that is, how much data is read and written at a time; it is not required and can be omitted.
Backup the entire hard disk:
We can use dd to copy any kind of block device, and because dd works on the block device itself, it does not matter which partitions or file systems the device contains.
If there are any errors, the above command will fail. If you give the parameter conv=noerror, dd will continue to copy after read errors.
dd if=/dev/sda of=/dev/sdb conv=noerror
Input and output must be specified very carefully. If you swap the source device and the target device, you may lose all your data.
To copy one hard drive to another, use the dd command given below. The sync conversion pads every input block with zeros up to the block size, so that data following a read error stays at the correct offset.
dd if=/dev/sda of=/dev/sdb conv=noerror,sync
conv can do many other things, such as converting a file to uppercase or vice versa.
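Because mistakes with real devices are destructive, dd is easier to explore on ordinary files first. The sketch below uses hypothetical image files in a scratch directory instead of /dev/sda and /dev/sdb, and assumes GNU dd (for the 1M block size suffix).

```shell
demo=$(mktemp -d)
cd "$demo"

# Create a 4 MiB "disk image" filled with zeros: 4 blocks of 1 MiB each.
dd if=/dev/zero of=disk.img bs=1M count=4

# Copy it the same way one would copy a device, continuing past read errors.
dd if=disk.img of=copy.img bs=1M conv=noerror,sync

# conv can also transform the data, for example to uppercase.
printf 'hello dd\n' > lower.txt
dd if=lower.txt of=upper.txt conv=ucase
cat upper.txt
```

dd prints its transfer statistics on standard error, so the cat at the end shows only the converted text.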
cpio
cpio stands for "copy in, copy out". It is used for processing archive files such as .cpio or .tar. This command can copy files to and from archives.
Copy-out Mode: Copy files named in name-list to the archive
command | cpio -o > archive
Copy-in Mode: Extract files from the archive
cpio -i < archive
Copy-pass Mode: Copy files named in name-list to destination-directory
cpio -p destination-directory < name-list
Linux Devices (beyond the scope of the LPIC-1 exam)
In Linux, various special files can be found under the directory /dev. These files are called device files and behave unlike ordinary files.
The columns are as follows from left to right:
Permissions
Owner
Group
Major Device Number
Minor Device Number
Timestamp
Device Name
Remember that in the output of ls -l, the first character of each line shows the type of file. Device files are denoted as follows: c for a character device, b for a block device, p for a pipe, and s for a socket.
The most common types of device files are for block devices and character devices. These files are an interface to the actual driver (part of the Linux kernel) which in turn accesses the hardware. Another, less common, type of device file is the named pipe:
Character Device: These devices transfer data one character at a time. You'll see a lot of pseudo devices (such as /dev/null) as character devices. These devices aren't physically connected to the machine, but they give the operating system greater functionality.
Block Device: These devices transfer data in large fixed-size blocks. We'll most commonly see devices that store data in blocks, such as hard drives, as block devices.
Pipe Device: Named pipes allow two or more processes to communicate with each other. They are similar to character devices, but instead of having output sent to a device, it's sent to another process.
Socket Device: Socket devices facilitate communication between processes, similar to pipe devices, but they can communicate with many processes at once.
Device Characterization (Major Device Number & Minor Device Number)
Devices are characterized using two numbers, major device number and minor device number. We can see these numbers in the above ls example, they are separated by a comma. For example, let's say a device had the device numbers: 8, 0:
The major device number represents the device driver that is used, in this case 8, which is often the major number for sd block devices. The minor number tells the kernel which unique device it is in this driver class, in this case 0 is used to represent the first device (a).
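Device types and numbers can be inspected without touching any real disk. The sketch below looks at /dev/null, which on Linux is a character device with major number 1 and minor number 3, and creates a named pipe in a scratch directory. GNU stat is assumed, and mypipe is a hypothetical name.

```shell
# The first character of each line is "c": character device.
# The "1, 3" and "1, 5" columns are the major and minor device numbers.
ls -l /dev/null /dev/zero

# GNU stat: %F = file type, %t/%T = major/minor device number in hex.
stat -c '%n: %F, major=%t minor=%T' /dev/null

demo=$(mktemp -d)
cd "$demo"
mkfifo mypipe        # create a named pipe
ls -l mypipe         # the first character of the line is "p"
```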
With special thanks to Shawn Powers.