Copyright © 2002 Paul Sheer.
All of UNIX is case sensitive. A command with even a single letter's capitalization altered is considered to be a completely different command. The same goes for files, directories, configuration file formats, and the syntax of all native programming languages.
In addition to directories and ordinary text files, there are other types of files, although all files contain the same kind of data (i.e., a list of bytes). The hidden file is a file that will not ordinarily appear when you type the command ls to list the contents of a directory. To see a hidden file you must use the command ls -a. The -a option means to list all files, including hidden ones. Another variant is ls -l, which lists the contents in long format. The - is used in this way to indicate variations on a command. These are called command-line options or command-line arguments, and most UNIX commands can take a number of them. They can be strung together in any way that is convenient [Commands under the GNU free software license are superior in this way: they have a greater number of options than traditional UNIX commands and are therefore more flexible.], for example, ls -a -l, ls -l -a, or ls -al; any of these will list all files in long format.
Most GNU commands take the additional argument --help, and some also accept -h. You can type a command with just this option on the command line and get a usage summary. This is brief help that summarizes options you may have forgotten if you are already familiar with the command; it will never be an exhaustive description of the usage. See the later explanation about man pages.
The difference between a hidden file and an ordinary file is merely that the file name of a hidden file starts with a period. Hiding files in this way is not for security, but for convenience.
The option ls -l is somewhat cryptic for the novice. Its more explanatory version is ls --format=long. Similarly, the all option can be given as ls --all, and means the same thing as ls -a.
Although commands usually do not display a message when they execute successfully [The computer accepted and processed the command.], commands do report errors in a consistent format. The format varies from one command to another but often appears as follows: command-name : what was attempted : error message. For example, the command ls -l qwerty gives an error ls: qwerty: No such file or directory. What actually happened was that the command ls attempted to read the file qwerty. Since this file does not exist, an error code 2 arose. This error code corresponds to a situation where a file or directory cannot be found. The error code is automatically translated into the sentence No such file or directory. It is important to understand the distinction between an explanatory message that a command gives (such as the messages reported by the passwd command in the previous chapter) and an error code that was just translated into a sentence. The reason is that a lot of different kinds of problems can result in an identical error code (there are only about a hundred different error codes). Experience will teach you that error messages do not tell you what to do, only what went wrong, and should not be taken as gospel.
The file /usr/include/asm/errno.h contains a complete list of basic error codes. In addition to these, several other header files [Files ending in .h] might define their own error codes. Under UNIX, however, these are 99% of all the errors you are ever likely to get. Most of them will be meaningless to you at the moment but are included in Table 4.1 as a reference.
ls can produce a lot of output if there are a large number of files in a directory. Now say that we are only interested in files that ended with the letters tter. To list only these files, you can use ls *tter. The * matches any number of any other characters. So, for example, the files Tina.letter, Mary_Jones.letter and the file splatter, would all be listed if they were present, whereas a file Harlette would not be listed. While the * matches any length of characters, the ? matches only one character. For example, the command ls ?ar* would list the files Mary_Jones.letter and Harlette.
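The two wildcards can be tried out safely as follows. This is only a sketch: it recreates the example files in a scratch directory made with mktemp (an assumption of this example, not something the text above relies on) and stores the two listings in the shell variables star_matches and qmark_matches, names chosen purely for illustration.

```shell
# Work in a throwaway directory so that no real files are touched.
scratch=$(mktemp -d)
touch "$scratch/Tina.letter" "$scratch/Mary_Jones.letter" \
      "$scratch/splatter" "$scratch/Harlette"

# * matches any number of characters: every name ending in "tter".
star_matches=$(cd "$scratch" && ls *tter)
echo "$star_matches"

# ? matches exactly one character: any first letter, then "ar", then anything.
qmark_matches=$(cd "$scratch" && ls ?ar*)
echo "$qmark_matches"

# Clean up the scratch directory.
rm -R "$scratch"
```

The first listing should contain the two letters and splatter but not Harlette; the second should contain only Mary_Jones.letter and Harlette.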
When naming files, it is a good idea to choose names that group files of the same type together. You do this by adding an extension to the file name that describes the type of file it is. We have already demonstrated this by calling a file Mary_Jones.letter instead of just Mary_Jones. If you keep this convention, you will be able to easily list all the files that are letters by entering ls *.letter. The file name Mary_Jones.letter is then said to be composed of two parts: the name, Mary_Jones, and the extension, letter.
Some common UNIX extensions you may see are:
In addition, files that have no extension and a capitalized descriptive name are usually plain English text and meant for your reading. They come bundled with packages and are for documentation purposes. You will see them hanging around all over the place.
Some full file names you may see are:
There is a way to restrict file listings to within the ranges of certain characters. If you only want to list the files that begin with A through M, you can run ls [A-M]*. Here the brackets have a special meaning--they match a single character like a ?, but only those given by the range. You can use this feature in a variety of ways, for example, [a-dJW-Y]* matches all files beginning with a, b, c, d, J, W, X or Y; and *[a-d]id matches all files ending with aid, bid, cid or did; and *.{cpp,c,cxx} matches all files ending in .cpp, .c or .cxx. This way of specifying a file name is called a glob expression. Glob expressions are used in many different contexts, as you will see later.
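A short experiment with bracket ranges might look like the following. The file names are invented for the example, and the locale is forced to C on the assumption that your system may otherwise sort upper- and lowercase letters together, which can change what a range such as [A-M] matches.

```shell
# Use the traditional C locale so that [A-M] means exactly the uppercase
# letters A through M.
LC_ALL=C
export LC_ALL

scratch=$(mktemp -d)
touch "$scratch/Alice" "$scratch/Mary" "$scratch/Zed" \
      "$scratch/braid" "$scratch/acid" "$scratch/grid"

# Names beginning with A through M.
range_matches=$(cd "$scratch" && ls [A-M]*)
echo "$range_matches"

# Names ending in aid, bid, cid or did.
suffix_matches=$(cd "$scratch" && ls *[a-d]id)
echo "$suffix_matches"

rm -R "$scratch"
```

Here Zed falls outside the first range, and grid is skipped by the second pattern because r is not in the range a through d.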
The command cp stands for copy. It duplicates one or more files. The format is
cp <file> <newfile>
cp <file> [<file> ...] <dir>
or
cp file newfile
cp file [file ...] dir
The above lines are called a usage summary. The < and > signs mean that you don't actually type out these characters but replace <file> with a file name of your own. These are also sometimes written in italics like, cp file newfile. In rare cases they are written in capitals like, cp FILE NEWFILE. <file> and <dir> are called parameters. Sometimes they are obviously numeric, like a command that takes <ioport>. [Anyone emailing me to ask why typing in literal, <, i, o, p, o, r, t and > characters did not work will get a rude reply.] These are common conventions used to specify the usage of a command. The [ and ] brackets are also not actually typed but mean that the contents between them are optional. The ellipses ... mean that <file> can be given repeatedly, and these also are never actually typed. From now on you will be expected to substitute your own parameters by interpreting the usage summary. You can see that the second of the above lines is actually just saying that one or more file names can be listed with a directory name last.
From the above usage summary it is obvious that there are two ways to use the cp command. If the last name is not a directory, then cp copies that file and renames it to the file name given. If the last name is a directory, then cp copies all the files listed into that directory.
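The two forms can be seen side by side in a small sketch like this one. The scratch directory, the backup directory, and the file contents are all made up for the example.

```shell
scratch=$(mktemp -d)
echo "Dear Mary" > "$scratch/Mary_Jones.letter"
echo "Dear Tina" > "$scratch/Tina.letter"
mkdir "$scratch/backup"

# Form 1: the last name is not a directory, so cp makes a copy under that name.
cp "$scratch/Mary_Jones.letter" "$scratch/Mary_Jones.copy"

# Form 2: the last name is a directory, so every listed file is copied into it.
cp "$scratch/Mary_Jones.letter" "$scratch/Tina.letter" "$scratch/backup"

copy_text=$(cat "$scratch/Mary_Jones.copy")
backup_listing=$(ls "$scratch/backup")
echo "$copy_text"
echo "$backup_listing"

rm -R "$scratch"
```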
The usage summary of the ls command is as follows:
ls [-l, --format=long] [-a, --all] <file> <file> ...
ls -al
where the comma indicates that either option is valid. Similarly, with the passwd command:
passwd [<username>]
You should practice using the cp command now by copying some of your files from place to place.
The cd command is used to take you to different directories. Create a directory new with mkdir new. You could create a directory one by doing cd new and then mkdir one, but there is a more direct way of doing this with mkdir new/one. You can then change directly to the one directory with cd new/one. Similarly, you can get back to where you were with cd ../.. (each .. moves up one directory). In this way, the / is used to represent directories within directories. The directory one is called a subdirectory of new.
The command pwd stands for present working directory (also called the current directory) and tells what directory you are currently in. Entering pwd gives some output like /home/<username>. Experiment by changing to the root directory (with cd /) and then back into the directory /home/<username> (with cd /home/<username>). The directory /home/<username> is called your home directory, and is where all your personal files are kept. It can be used at any time with the abbreviation ~. In other words, entering cd /home/<username> is the same as entering cd ~. The process whereby a ~ is substituted for your home directory is called tilde expansion.
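The directory commands above can be strung together as in the following sketch. It works inside a scratch directory from mktemp rather than your real home directory, which is an assumption made only to keep the experiment harmless; the variable names are likewise invented.

```shell
scratch=$(mktemp -d)
cd "$scratch"

mkdir new
mkdir new/one          # create the subdirectory without first doing cd new
cd new/one
inside=$(pwd)          # a path ending in /new/one
cd ../..               # two levels up, back to where we started
back=$(pwd)

# Tilde expansion: the shell substitutes your home directory for ~.
home_via_tilde=~

echo "$inside"
echo "$back"
echo "$home_via_tilde"

cd /
rm -R "$scratch"
```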
To remove (i.e., erase or delete) a file, use the command rm <filename>. To remove a directory, use the command rmdir <dir>. Practice using these two commands. Note that you cannot remove a directory unless it is empty. To remove a directory as well as any contents it might contain, use the command rm -R <dir>. The -R option specifies to dive into any subdirectories of <dir> and delete their contents. The process whereby a command dives into subdirectories of subdirectories of ... is called recursion. -R stands for recursively. This is a very dangerous command. Although you may be used to ``undeleting'' files on other systems, on UNIX a deleted file is, at best, extremely difficult to recover.
The cp command also takes the -R option, allowing it to copy whole directories. The mv command is used to move files and directories. It really just renames a file to a different directory. Note that with cp you should use the option -p and -d with -R to preserve all attributes of a file and properly reproduce symlinks (discussed later). Hence, always use cp -dpR <dir> <newdir> instead of cp -R <dir> <newdir>.
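As a hedged illustration of recursive copying and removal, the sketch below builds a small directory tree and then copies, renames, and deletes it. The names project, docs, and archive are hypothetical. It uses cp -pR rather than cp -dpR only because -d is a GNU option that some other UNIX systems lack.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/project/docs"
echo "notes" > "$scratch/project/docs/README"

# Copy the whole tree; -p preserves attributes such as modification times.
cp -pR "$scratch/project" "$scratch/project.bak"

# mv simply renames the copy; its contents are untouched.
mv "$scratch/project.bak" "$scratch/archive"

# rmdir refuses to remove a directory that still has contents...
if rmdir "$scratch/project" 2>/dev/null; then
    rmdir_result="removed"
else
    rmdir_result="refused"
fi

# ...whereas rm -R deletes the directory and everything beneath it.
rm -R "$scratch/project"
if [ -e "$scratch/project" ]; then project_left="yes"; else project_left="no"; fi

copied_text=$(cat "$scratch/archive/docs/README")
echo "rmdir: $rmdir_result; project still there: $project_left; copy says: $copied_text"

rm -R "$scratch"
```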
Commands can be given file name arguments in two ways. If you are in the same directory as the file (i.e., the file is in the current directory), then you can just enter the file name on its own (e.g., cp my_file new_file). Otherwise, you can enter the full path name, like cp /home/jack/my_file /home/jack/new_file. Very often administrators use the notation ./my_file to be clear about the distinction, for instance, cp ./my_file ./new_file. The leading ./ makes it clear that both files are relative to the current directory. File names not starting with a / are called relative path names, and otherwise, absolute path names.
(See Chapter 16 for a complete overview of all documentation on the system, and also how to print manual pages in a properly typeset format.)
The command man [<section>|-a] <command> displays help on a particular topic and stands for manual. Every command on the entire system is documented in so-named man pages. In the past few years a new format of documentation, called info, has evolved. This is considered the modern way to document commands, but most system documentation is still available only through man. Very few packages are not documented in man however.
Man pages are the authoritative reference on how a command works because they are usually written by the very programmer who created the command. Under UNIX, any printed documentation should be considered as being second-hand information. Man pages, however, will often not contain the underlying concepts needed for understanding the context in which a command is used. Hence, it is not possible for a person to learn about UNIX purely from man pages. However, once you have the necessary background for a command, then its man page becomes an indispensable source of information and you can discard other introductory material.
Now, man pages are divided into sections, numbered 1 through 9. Section 1 contains all man pages for system commands like the ones you have been using. Sections 2-7 contain information for programmers and the like, which you will probably not have to refer to just yet. Section 8 contains pages specifically for system administration commands. There are some additional sections labeled with letters; other than these, there are no manual pages besides the sections 1 through 9. The sections are
... /man1 | User programs |
... /man2 | System calls |
... /man3 | Library calls |
... /man4 | Special files |
... /man5 | File formats |
... /man6 | Games |
... /man7 | Miscellaneous |
... /man8 | System administration |
... /man9 | Kernel documentation |
You should now use the man command to look up the manual pages for all the commands that you have learned. Type man cp, man mv, man rm, man mkdir, man rmdir, man passwd, man cd, man pwd, and of course man man. Much of the information might be incomprehensible to you at this stage. Skim through the pages to get an idea of how they are structured and what headings they usually contain. Man pages are referenced with notation like cp(1), for the cp command in Section 1, which can be read with man 1 cp. This notation will be used from here on.
info pages contain some excellent reference and tutorial information in hypertext linked format. Type info on its own to go to the top-level menu of the entire info hierarchy. You can also type info <command> for help on many basic commands. Some packages will, however, not have info pages, and other UNIX systems do not support info at all.
info is an interactive program with keys to navigate and search documentation. Inside info, typing h will invoke the help screen from where you can learn more commands.
You should practice using each of these commands.
LESS=-Q
export LESS
Those who come from the DOS world may remember the famous Norton Commander file manager. The GNU project has a Free clone called the Midnight Commander, mc. It is essential to at least try out this package--it allows you to move around files and directories extremely rapidly, giving a wide-angle picture of the file system. This will drastically reduce the number of tedious commands you will have to type by hand.
You should practice using each of these commands if you have your sound card configured. [I don't want to give the impression that LINUX does not have graphical applications to do all the functions in this section, but you should be aware that for every graphical application, there is a text-mode one that works better and consumes fewer resources.] You may also find that some of these packages are not installed, in which case you can come back to this later.
You usually use Ctrl-C to stop an application or command that runs continuously. You must type this at the same prompt where you entered the command. If this doesn't work, the section on processes (Section 9.5) will explain about signalling a running application to quit.
Files typically contain a lot of data that one can imagine might be represented with a smaller number of bytes. Take for example the letter you typed out. The word ``the'' was probably repeated many times. You were probably also using lowercase letters most of the time. The file was far from a completely random set of bytes, and it repeatedly used spaces as well as using some letters more than others. [English text in fact contains, on average, only about 1.3 useful bits (there are eight bits in a byte) of data per byte.] Because of this the file can be compressed to take up less space. Compression involves representing the same data by using a smaller number of bytes, in such a way that the original data can be reconstructed exactly. This usually involves finding patterns in the data. The command to compress a file is gzip <filename>, which stands for GNU zip. Run gzip on a file in your home directory and then run ls to see what happened. Now, use more to view the compressed file. To uncompress the file use gzip -d <filename>. Now, use more to view the file again. Many files on the system are stored in compressed format. For example, man pages are often stored compressed and are uncompressed automatically when you read them.
You previously used the command cat to view a file. You can use the command zcat to do the same thing with a compressed file. Gzip a file and then type zcat <filename>. You will see that the contents of the file are written to the screen. Generally, when commands and files have a z in them they have something to do with compression; the letter z stands for zip. You can use zcat <filename> | less to page through a compressed file properly. You can also use the command zless <filename>, which does the same as zcat <filename> | less. (Note that your less may actually have the functionality of zless combined.)
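To see the effect of compression on repetitive text, something like the following can be tried. The sample file and its contents are invented for the example, and gzip -dc is used in place of zcat on the assumption that it behaves the same way on your system (on GNU systems zcat is equivalent to gzip -dc).

```shell
scratch=$(mktemp -d)
letter="$scratch/sample.letter"

# Repetitive text compresses well: write the same line 200 times.
i=0
while [ "$i" -lt 200 ]; do
    echo "the quick brown fox jumps over the lazy dog"
    i=$((i + 1))
done > "$letter"

size_before=$(wc -c < "$letter" | tr -d ' ')
gzip "$letter"                      # replaces the file with sample.letter.gz
size_after=$(wc -c < "$letter.gz" | tr -d ' ')
echo "before: $size_before bytes, after: $size_after bytes"

# Write the uncompressed data to the screen, leaving the .gz file in place.
first_line=$(gzip -dc "$letter.gz" | head -n 1)
echo "$first_line"

gzip -d "$letter.gz"                # restores sample.letter exactly
size_restored=$(wc -c < "$letter" | tr -d ' ')

rm -R "$scratch"
```

The compressed size should be a small fraction of the original, and the restored file should be exactly as long as the file you started with.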
A new addition to the arsenal is bzip2. This is a compression program very much like gzip, except that it is slower and compresses 20%-30% better. It is useful for compressing files that will be downloaded from the Internet (to reduce the transfer volume). Files that are compressed with bzip2 have an extension .bz2. Note that the improvement in compression depends very much on the type of data being compressed. Sometimes there will be negligible size reduction at the expense of a huge speed penalty, while occasionally it is well worth it. Files that are frequently compressed and uncompressed should never use bzip2.
You can use the command find to search for files. Change to the root directory, and enter find. It will spew out all the files it can see by recursively descending [Goes into each subdirectory and all its subdirectories, and repeats the command find. ] into all subdirectories. In other words, find, when executed from the root directory, prints all the files on the system. find will work for a long time if you enter it as you have; press Ctrl-C to stop it.
Now change back to your home directory and type find again. You will see all your personal files. You can specify a number of options to find to look for specific files.
find /usr -type f -exec ls '-al' '{}' ';'
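A few of the more common find options can be explored on a small tree first. The sketch below is illustrative only: the directory layout is invented, and -name and -type d are shown in addition to the -type f and -exec options used above.

```shell
scratch=$(mktemp -d)
mkdir -p "$scratch/letters/old"
touch "$scratch/letters/Tina.letter" \
      "$scratch/letters/old/Mary_Jones.letter" \
      "$scratch/notes.txt"

# With no options, find prints every file and directory below the start point.
everything=$(cd "$scratch" && find .)
echo "$everything"

# -type f keeps only ordinary files; -name takes a glob expression, quoted so
# that find (and not the shell) does the matching.
letters=$(cd "$scratch" && find . -type f -name '*.letter')
echo "$letters"

# -exec runs a command on each match; {} is replaced by the matched name.
(cd "$scratch" && find . -type d -exec ls -ld '{}' ';')

rm -R "$scratch"
```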
find has the deficiency of actively reading directories to find files. This process is slow, especially when you start from the root directory. An alternative command is locate <filename>. This searches through a previously created database of all the files on the system and hence finds files instantaneously. Its counterpart updatedb updates the database of files used by locate. On some systems, updatedb runs automatically every day at 04h00.
Try these ( updatedb will take several minutes):
updatedb
locate rpm
locate deb
locate passwd
locate HOWTO
locate README
Very often you will want to search through a number of files to find a particular word or phrase, for example, when a number of files contain lists of telephone numbers with people's names and addresses. The command grep does a line-by-line search through a file and prints only those lines that contain a word that you have specified. grep has the command summary:
grep [options] <pattern> <filename> [<filename> ...]
[The words word, string, or pattern are used synonymously in this context, basically meaning a short length of letters and/or numbers that you are trying to find matches for. A pattern can also be a string with kinds of wildcards in it that match different characters, as we shall see later.]
Run grep for the word ``the'' to display all lines containing it: grep 'the' Mary_Jones.letter. Now try grep 'the' *.letter.
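If you have no letters to hand, the two commands above can be tried on sample files, as in this sketch. The contents of both letters are made up for the example.

```shell
scratch=$(mktemp -d)
cat > "$scratch/Mary_Jones.letter" <<'EOF'
Dear Mary,
Thank you for the flowers.
See you soon.
EOF
cat > "$scratch/Tina.letter" <<'EOF'
Dear Tina,
I have sent the parcel.
EOF

# One file: only the matching lines are printed.
one_file=$(cd "$scratch" && grep 'the' Mary_Jones.letter)
echo "$one_file"

# Several files: each matching line is prefixed with the name of its file.
many_files=$(cd "$scratch" && grep 'the' *.letter)
echo "$many_files"

rm -R "$scratch"
```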
A package, called the mtools package, enables reading and writing to MS-DOS/Windows floppy disks. These are not standard UNIX commands but are packaged with most LINUX distributions. The commands support Windows ``long file name'' floppy disks. Put an MS-DOS disk in your A: drive. Try
mdir A:
touch myfile
mcopy myfile A:
mdir A:
Note that there is no such thing as an A: disk under LINUX. Only the mtools package understands A: in order to retain familiarity for MS-DOS users. The complete list of commands is
floppyd     mcopy     mformat     mmount      mshowfat
mattrib     mdel      minfo       mmove       mtoolstest
mbadblocks  mdeltree  mkmanifest  mpartition  mtype
mcat        mdir      mlabel      mrd         mzip
mcd         mdu       mmd         mren        xcopy
Entering info mtools will give detailed help. In general, any MS-DOS command, put into lower case with an m prefixed to it, gives the corresponding LINUX command.
Never begin any work before you have a fail-safe method of backing it up.
One of the primary activities of a system administrator is to make backups. It is essential never to underestimate the volatility [Ability to evaporate or become chaotic. ] of information in a computer. Backups of data are therefore continually made. A backup is a duplicate of your files that can be used as a replacement should any or all of the computer be destroyed. The idea is that all of the data in a directory [As usual, meaning a directory and all its subdirectories and all the files in those subdirectories, etc. ] are stored in a separate place--often compressed--and can be retrieved in case of an emergency. When we want to store a number of files in this way, it is useful to be able to pack many files into one file so that we can perform operations on that single file only. When many files are packed together into one, this packed file is called an archive. Usually archives have the extension .tar, which stands for tape archive.
To create an archive of a directory, use the tar command:
tar -c -f <filename> <directory>
Create a directory with a few files in it, and run the tar command to back it up. A file named <filename> will be created. Take careful note of any error messages that tar reports. List the file and check that its size is appropriate for the size of the directory you are archiving. You can also use the verify option (see the man page) of the tar command to check the integrity of <filename>. Now remove the directory, and then restore it with the extract option of the tar command:
tar -x -f <filename>
You should see your directory recreated with all its files intact. A nice option to give to tar is -v. This option lists all the files that are being added to or extracted from the archive as they are processed, and is useful for monitoring the progress of archiving. You can call your archive anything you like; however, the common practice is to call it <directory>.tar, which makes it clear to all exactly what it is. Another important option is -p, which preserves detailed attribute information of files.
Once you have your .tar file, you would probably want to compress it with gzip. This will create a file <directory>.tar.gz, which is sometimes called <directory>.tgz for brevity.
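The whole cycle of archiving, restoring, and compressing might be rehearsed as follows. The directory mydir and its files are hypothetical, and the -t option (which lists an archive without extracting it) is an addition not mentioned in the text above.

```shell
scratch=$(mktemp -d)
cd "$scratch"

mkdir mydir
echo "first"  > mydir/one.txt
echo "second" > mydir/two.txt

tar -c -f mydir.tar mydir         # pack the directory into one archive file
listing=$(tar -t -f mydir.tar)    # -t lists the contents of the archive
echo "$listing"

rm -R mydir                       # simulate losing the original
tar -x -f mydir.tar               # restore it from the archive
restored=$(cat mydir/one.txt)
echo "$restored"

gzip mydir.tar                    # replaces mydir.tar with mydir.tar.gz
archive_name=$(ls mydir.tar*)
echo "$archive_name"

cd /
rm -R "$scratch"
```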
A second kind of archiving utility is cpio. cpio is actually more powerful than tar, but is considered to be more cryptic to use. The principles of cpio are quite similar and its use is left as an exercise.
When you type a command at the shell prompt, it has to be read off disk out of one or other directory. On UNIX, all such executable commands are located in one of about four directories. A file is located in the directory tree according to its type, rather than according to what software package it belongs to. For example, a word processor may have its actual executable stored in a directory with all other executables, while its font files are stored in a directory with other fonts from all other packages.
The shell has a procedure for searching for executables when you type them in. If you type in a command with slashes, like /bin/cp, then the shell tries to run the named program, cp, out of the /bin directory. If you just type cp on its own, then it tries to find the cp command in each of the directories listed in your PATH. To see what your PATH is, just type
echo $PATH
You will see a colon-separated list of four or more directories. Note that the current directory . is not listed. It is important that the current directory not be listed for reasons of security. Hence, to execute a command in the current directory, we always type ./<command>.
To append, for example, a new directory /opt/gnome/bin to your PATH, do
PATH="$PATH:/opt/gnome/bin"
export PATH
LINUX supports the convenience of doing this in one line:
export PATH="$PATH:/opt/gnome/bin"
There is a further command, which, to check whether a command is locatable from the PATH. Sometimes there are two commands of the same name in different directories of the PATH. [This is more often true of Solaris systems than LINUX.] Typing which <command> locates the one that your shell would execute. Try:
which ls
which cp mv rm
which which
which cranzgots
which is also useful in shell scripts to tell if there is a command at all, and hence check whether a particular package is installed, for example, which netscape.
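A script that needs to test for a command could be sketched along these lines. Note the substitution: the example uses the shell built-in command -v instead of which, on the assumption that which itself may not be installed everywhere. The program hello_rute is a made-up script created only for the demonstration.

```shell
scratch=$(mktemp -d)

# A tiny executable script standing in for a newly installed program.
cat > "$scratch/hello_rute" <<'EOF'
#!/bin/sh
echo "hello from a directory on the PATH"
EOF
chmod +x "$scratch/hello_rute"

# Append the new directory to the PATH, remembering the old value.
old_path=$PATH
PATH="$PATH:$scratch"
export PATH

# command -v prints the file the shell would run, or fails if there is none.
found=$(command -v hello_rute)
echo "$found"
greeting=$(hello_rute)            # now runnable by its bare name
echo "$greeting"

if command -v cranzgots >/dev/null 2>&1; then
    cranzgots_status="installed"
else
    cranzgots_status="not found"
fi
echo "cranzgots: $cranzgots_status"

# Put things back as they were.
PATH=$old_path
export PATH
rm -R "$scratch"
```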
If a file name happens to begin with a - then it would be impossible to use that file name as an argument to a command, because it would be read as an option. To overcome this circumstance, most commands take an option --. This option specifies that no more options follow on the command line; everything else must be treated as a literal file name. For instance
touch -- -stupid_file_name
rm -- -stupid_file_name