3.5. Shell scripting

The generic term shell refers to a program that serves as an interface between the user and the kernel of the GNU/Linux system. In this section, we will focus on interactive text shells, which are what we find as users once we have logged into the system.

Important

The shell is a system utility that allows users to interact with the kernel by interpreting commands that the user enters on the command line or that are read from shell script files.

The shell is what the users see of the system. The rest of the operating system remains mostly hidden from them. The shell is written in the same way as a user process (program); it does not form part of the kernel, but rather is run like just another user program.

When our GNU/Linux system starts up, it offers users an interface with a particular appearance, which may be either text or graphical. Which one we get depends on the boot mode (or level) of the system: one of the text console modes, or a mode that starts directly in the X Window graphical environment.

In graphical start-up modes, the interface consists of an access manager that handles the user login procedure through a graphical screen asking for the corresponding information: user identification and password. Several access managers are common in GNU/Linux: xdm (belonging to X Window), gdm (Gnome) and kdm (KDE), as well as a few others associated with different window managers. Once we have logged in, we will find ourselves in the X Window graphical interface with a window manager or desktop environment such as Gnome or KDE. To work with an interactive shell, all we need to do is open one of the available terminal emulation programs.

If our access is in console mode (text), once logged in, we will obtain direct access to the interactive shell.

Another way of obtaining an interactive shell is through remote access to the machine, whether through text-based tools such as telnet, rlogin or ssh, or graphical ones such as X Window emulators.

3.5.1. Interactive shells

Having initiated the interactive shell [Qui01], the user is shown a prompt, indicating that a command line may be entered. After entering it, the shell becomes responsible for validating it and starting to run the required processes, in a number of phases:

Normally, command lines will be ways of running the system's commands, interactive shell commands, starting up applications or shell scripts.

Important

Shell scripts are text files that contain command sequences of the system, plus a series of internal commands of the interactive shell, plus the necessary control structures for processing the program flow (of the type while, for etc.).

The system can run script files directly under the name given to the file. To run them, we either invoke the shell with the file name as an argument, or give the script file execution permission and call it by name.

To some extent, we can see shell script as the code of an interpreted language that is executed on the corresponding interactive shell. For the administrator, shell scripts are very important, basically for two reasons:

1) The system's configuration and most of the services are provided through tools in the form of shell scripts.

2) The main way of automating administration processes is creating shell scripts.

All the programs that are invoked by a shell possess three predefined files, specified by the corresponding file handles. By default, these files are:

1) standard input: normally assigned to the terminal's keyboard (console); uses file handle number 0 (in UNIX the files use whole number file handles).

2) standard output: normally assigned to the terminal's screen; uses file handle 1.

3) standard error: normally assigned to the terminal's screen; uses file handle 2.

This tells us that any program run from the shell by default will have the input associated to the terminal's keyboard, the output associated to the screen, and that it will also send errors to the screen.

Also, the shells tend to provide the three following mechanisms:

1) Redirection: given that I/O devices and files are treated the same way in UNIX, the shell simply handles them all as files. From the user's point of view, the file handles can be reassigned so that the data flow of one file handle goes to any other file handle; this is called redirection. For example, we refer to redirecting file handles 0 or 1 as redirecting standard I/O.

2) Pipes: a program's standard output can be used as another's standard input by means of pipes. Various programs can be connected to each other using pipes to create what is called a pipeline.

3) Concurrency of user programs: users can run several programs simultaneously, indicating whether each will run in the background or in the foreground, where it has exclusive control of the screen. A common arrangement is to leave long jobs running in the background while interacting with the shell and with other programs in the foreground.
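
As a small illustration of the last two mechanisms, the following sketch connects three commands in a pipeline and then runs a job in the background. The sample data and the use of sleep as a stand-in for a long job are assumptions made for the example.

```shell
#!/bin/bash
# Pipeline: printf writes three lines, sort orders them, head keeps the first.
first=$(printf 'pear\napple\nfig\n' | sort | head -n 1)
echo "first in alphabetical order: $first"

# Background job: the trailing & returns control to the shell at once.
sleep 1 &
job_pid=$!          # $! holds the process ID of the last background job
echo "waiting for background job $job_pid"
wait "$job_pid"     # block until the background job finishes
job_status=$?
echo "background job finished with status $job_status"
```

In an interactive session, the jobs and fg commands of the shell list background jobs and bring one back to the foreground.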

In practice, in UNIX/Linux shells these mechanisms are used as follows:

Example 3-5. Redirection

A redirection is written as:

command op file

where op may be:

  • < : receive input from file.

  • > : send output to file.

  • >> : append the output to the end of file (by default, with > the file is created again and any previous content is lost).
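
The three operators, together with the separate redirection of file handle 2, can be tried out with a sketch along these lines. The scratch directory, the file names and the deliberately missing path /nonexistent-example-path are assumptions chosen for the demonstration.

```shell
#!/bin/bash
workdir=$(mktemp -d)                          # scratch directory for the example

echo "first line"  >  "$workdir/out.txt"      # > creates (or truncates) the file
echo "second line" >> "$workdir/out.txt"      # >> appends to the existing file
line_count=$(wc -l < "$workdir/out.txt")      # < feeds the file to wc as standard input
echo "lines in out.txt: $line_count"

# Standard output (handle 1) and standard error (handle 2) go to different files.
ls /nonexistent-example-path > "$workdir/listing.txt" 2> "$workdir/errors.txt"
echo "error message captured:"
cat "$workdir/errors.txt"
```

Because ls fails here, listing.txt stays empty and the error message ends up in errors.txt instead of on the screen.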

3.5.2. Shells

The shell's independence from the operating system's kernel (the shell is just an interface layer) allows us to have several of them on the system [Qui01]. Some of the most common ones are:

a) Bash (short for Bourne-again shell). The default GNU/Linux shell.

b) The Bourne shell (sh). Created by Stephen Bourne at AT&T at the end of the seventies, this has always been the standard UNIX shell, and the one that all UNIX systems have in some version. Normally, it is the default shell of the administrator (root). In GNU/Linux its place tends to be taken by Bash, an improved version of the Bourne shell. The default prompt tends to be a '$' (for root, a '#').

c) The Korn shell (ksh). It is a superset of the Bourne shell (some compatibility is maintained), written at AT&T by David Korn in the mid eighties, which combines some functionalities of the Bourne and C shells with some additions of its own. The default prompt is '$'.

d) The C shell (csh). It was developed at the University of California, Berkeley by Bill Joy towards the end of the seventies and has a few interesting additions to Bourne, such as a command history, aliases, arithmetic from the command line, file name completion and control of background jobs. The default prompt for users is '%'. UNIX users tend to prefer this shell for interaction, but UNIX administrators prefer to use Bourne, because the scripts tend to be more compact and to execute faster. At the same time, an advantage of scripts in the C shell is that, as the name indicates, the syntax is based on the C language (although it is not the same).

e) Others, such as restricted or specialised versions of the above.

The Bash (Bourne again shell) [Bas] [Coo] has grown in importance since it was included in GNU/Linux systems as the default shell. This shell forms part of the GNU software project. It is an attempt to combine the three preceding shells (Bourne, C and Korn), maintaining the syntax of the original Bourne shell. This is the one we will focus on in our subsequent examples.

A quick way of finding out which shell we have been assigned as users (our login shell) is to consult the variable $SHELL from a command line with the instruction:

echo $SHELL

We will find that some aspects are common to all shells:

• They all allow shell scripts to be written, which are then interpreted either by running them by name (if the file has execution permission) or by passing the file as a parameter to the shell command.

• System users have a default shell associated with them. This information is provided when the users' accounts are created. The administrator will assign a shell to each user, or otherwise the default shell will be assigned (bash in GNU/Linux). This information is saved in the passwords file, /etc/passwd, and can be changed with the chsh command. The same command with the option -l lists the system's available shells (see also /etc/shells).

• Every shell is actually an executable command, normally present in the /bin directories in GNU/Linux (or /usr/bin).

• Shell scripts can be written in any of them, but adjusting to each one's syntax, which is normally different (sometimes the differences are minor). The construction syntax, as well as the internal commands, are documented in every shell's man page (man bash for example).

• Every shell has some associated start up files (initialisation files), and every user can adjust them to their needs, including code, variables, paths...

• The power of shell programming lies in the combination of each shell's syntax (its constructions) with the internal commands of each shell and a series of UNIX commands that are commonly used in scripts, such as cut, sort, cat, more, echo, grep, wc, awk, sed, mv, ls, cp...

Example 3-6. Note

To program in a shell it is advisable to have a good knowledge of these UNIX commands and of their different options.

• If as users we are using a specific shell, nothing prevents us from starting up a new copy of the shell (we call it a subshell), whether it is the same one or a different one. We simply invoke it through the name of the executable, whether sh, bash, csh or ksh. Also when we run a shell script a subshell is launched with the corresponding shell for executing the requested script.
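
As an illustration of where each user's default shell is recorded, the sketch below extracts the shell field (the seventh, colon-separated) from a file in /etc/passwd format. The sample entries and the helper name login_shell are hypothetical; on a real system the same awk command could be pointed at /etc/passwd.

```shell
#!/bin/bash
# Hypothetical sample in /etc/passwd format (fields separated by ':').
passwd_sample=$(mktemp)
cat > "$passwd_sample" <<'EOF'
root:x:0:0:root:/root:/bin/bash
juan:x:1000:1000:Juan:/home/juan:/bin/bash
guest:x:1001:1001:Guest:/home/guest:/bin/sh
EOF

# Print the login shell (field 7) of the user given as first argument.
login_shell() {
    awk -F: -v user="$1" '$1 == user { print $7 }' "$passwd_sample"
}

login_shell guest      # prints /bin/sh
login_shell juan       # prints /bin/bash
```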

Some basic differences between them [Qui01]:

a) Bash is the default shell in GNU/Linux (unless otherwise specified in creating the user account). In other UNIX systems it tends to be the Bourne shell (sh). Bash is compatible with sh and also incorporates some features of the other shells, csh and ksh.

b) Start-up files: sh and ksh use .profile (located in the user's account and executed at login), and ksh also tends to have a .kshrc, which is executed next. csh uses .login (run once, when the user's login session starts), .logout (run before leaving the user's session) and .cshrc (similar to .profile, but run in each C subshell that is started). Bash uses .bashrc and .bash_profile. In addition, the administrator can place common variables and paths in the /etc/profile file, which is executed before each user's own files. The shell start-up files are placed in the user's account when it is created (normally they are copied from the /etc/skel directory, where the administrator can leave skeletons of the prepared files).

c) The system or service configuration scripts are usually written in Bourne shell (sh), since most UNIX systems used them this way. In GNU/Linux we can also find some in Bash and also in other script languages not associated to the shell such as Perl or Python.

d) We can identify which shell a script is written for using the file command, for example file <scriptname>, or by examining the first line of the script, which tends to be #!/bin/name, where name is bash, sh, csh, ksh... This line tells the system, at the moment of running the script, which shell needs to be used to interpret it (in other words, which subshell needs to be launched in order to run it). It is important for all scripts to contain it, since otherwise the system will try to run them with the default shell (Bash in our case) and the syntax may not be the right one, causing many syntax errors during execution.
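
Examining the first line can itself be scripted. The following sketch prints the interpreter named after #! in a file; the sample script and the helper name script_interpreter are hypothetical.

```shell
#!/bin/bash
# Hypothetical script created only for the demonstration.
demo_script=$(mktemp)
printf '#!/bin/sh\necho hello\n' > "$demo_script"

# Print the interpreter named on the first line, or nothing if there is none.
script_interpreter() {
    local first_line
    IFS= read -r first_line < "$1"
    case $first_line in
        '#!'*) echo "${first_line:2}" ;;   # drop the two leading characters #!
    esac
}

script_interpreter "$demo_script"    # prints /bin/sh
```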

3.5.3. System variables

Some useful system variables (we can see them using the echo command for example), which can be consulted in the command line or within the programming of the shell scripts are:

The different variables of the environment can be seen using the env command. For example:

$ env
SSH_AGENT_PID=598
MM_CHARSET=ISO-8859-15
TERM=xterm
DESKTOP_STARTUP_ID=
SHELL=/bin/bash
WINDOWID=20975847
LC_ALL=es_ES@euro
USER=juan
LS_COLORS=no=00:fi=00:di=01;34:ln=01;
SSH_AUTH_SOCK=/tmp/ssh-wJzVY570/agent.570
SESSION_MANAGER=local/aopcjj:/tmp/.ICE-unix/570
USERNAME=juan
PATH=/soft/jdk/bin:/usr/local/bin:/usr/bin:/bin:/usr/bin/X11:/usr/games
MAIL=/var/mail/juan
PWD=/etc/skel
JAVA_HOME=/soft/jdk
LANG=es_ES@euro
GDMSESSION=Gnome
JDK_HOME=/soft/jdk
SHLVL=1
HOME=/home/juan
GNOME_DESKTOP_SESSION_ID=Default
LOGNAME=juan
DISPLAY=:0.0
COLORTERM=gnome-terminal
XAUTHORITY=/home/juan/.Xauthority
_=/usr/bin/env
OLDPWD=/etc
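
Individual variables from such a listing can be consulted in several ways. The sketch below shows three of them; the values printed depend on the system and the user running it.

```shell
#!/bin/bash
# Print a single variable by expanding it...
echo "Home directory: $HOME"

# ...or filter the env listing for one name (here PATH).
env | grep '^PATH='

# Split PATH on ':' to list the command search directories one per line.
path_dirs=$(echo "$PATH" | tr ':' '\n')
echo "$path_dirs"
```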

3.5.4. Programming scripts in Bash

Here we will look at some basic concepts of shell scripts in Bash; further reading in [Bas] [Coo] is advised.

All Bash scripts have to start with the line:

#!/bin/bash

This line tells the shell in use by the user (the one active at the time) which shell is needed to run the script that follows.

The script can be run in two different ways:

1) By running directly from the command line, on condition it has an execution permission. If this is not the case, we can establish the permission with: chmod +x script.

2) By running through the shell, we call on the shell explicitly: /bin/bash script.

We should take into account that, irrespective of the method of execution, we are always creating a subshell where our script will be run.
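
Both methods, and the fact that the script runs in its own shell process, can be observed with a sketch like the following. The script name hello.sh and its contents are hypothetical, and the example assumes the temporary directory permits execution.

```shell
#!/bin/bash
# Hypothetical script written to a scratch directory for the demonstration.
workdir=$(mktemp -d)
cat > "$workdir/hello.sh" <<'EOF'
#!/bin/bash
greeting="set inside the script"
echo "hello from $0"
EOF

# Method 2: pass the script to the shell; no execution permission is needed.
/bin/bash "$workdir/hello.sh"

# Method 1: grant execution permission, then run the script by its path.
chmod +x "$workdir/hello.sh"
"$workdir/hello.sh"

# The script ran in its own process, so its variable is not set here.
echo "greeting in the calling shell: '${greeting:-}'"
```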

3.5.4.1. Variables in Bash

The assignment of variables is done as follows (with no spaces around the = sign):

variable=value

The value of the variable can be seen with:

echo $variable

where '$' refers us to the variable's value.

By default, the variable is only visible in the script (or in the shell) where it is assigned. If the variable needs to be visible outside the script, at the level of any subshell that is generated afterwards, we will need to "export" it as well as assign it. We can do this in two ways:

  • Assign first and export after:

var=value
export var

  • Export during assignment:

export var=value
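
The effect of export can be checked by starting a child shell and asking it what it sees. This is a minimal sketch; the variable names are chosen only for the example.

```shell
#!/bin/bash
local_var="only here"          # assigned but not exported
export shared_var="inherited"  # assigned and exported in one step

# bash -c starts a child shell; single quotes delay expansion until then.
child_view=$(bash -c 'echo "local=[$local_var] shared=[$shared_var]"')
echo "$child_view"             # prints local=[] shared=[inherited]
```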

In Bash scripts we have some accessible predetermined variables:

  • $1-$N: They hold the arguments passed as parameters to the script from the command line.

  • $0: It holds the script name; it would be parameter 0 of the command line.

  • $*: It holds all parameters from 1 to N in a single variable.

  • $@: The same as $*, but with double inverted commas (" ") around each parameter.

  • $?: "Status": it saves the value returned by the most recent executed command. Useful for checking error conditions, since UNIX tends to return 0 if the execution was correct, and a different value as an error code.
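
The difference between $* and $@ is easiest to see with an argument that contains a space. In the sketch below the two hypothetical helpers count what they receive; inside a function, $1 to $N, $* and $@ refer to the function's own arguments, which lets us show them without a separate script.

```shell
#!/bin/bash
count_words() {
    local n=0 word
    for word in $*;   do n=$((n + 1)); done   # unquoted $*: split on spaces
    echo "$n"
}
count_args() {
    local n=0 arg
    for arg in "$@"; do n=$((n + 1)); done    # "$@": one word per argument
    echo "$n"
}

count_words "two words" single     # prints 3
count_args  "two words" single     # prints 2

false                              # a command that always fails
echo "status of false: $?"         # prints status of false: 1
```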

Another important issue regarding assignments is the use of inverted commas:

  • Double inverted commas allow everything to be considered as a unit.

  • Single inverted commas are similar, but ignore the special characters inside them.

  • Backward-leaning inverted commas, or backquotes (`command`), are used for evaluating what is inside them: first the content is executed, and then it is replaced by the result of the execution. For example: var=`ls` saves the listing of the directory in $var.
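
The three kinds of quoting can be compared side by side in a short sketch. The variable names are arbitrary, and the $(...) form is shown as the equivalent alternative to backquotes that Bash also accepts.

```shell
#!/bin/bash
name="world"
double="hello $name"       # double quotes: one unit, $name is expanded
single='hello $name'       # single quotes: one unit, $name is left as typed
today=`date +%Y`           # backquotes: replaced by the command's output
same_today=$(date +%Y)     # $(...) is the equivalent form, easier to nest

echo "$double"             # prints hello world
echo "$single"             # prints hello $name
echo "$today $same_today"
```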

3.5.4.2. Comparisons

For conditions, the command test expression is normally used, or its equivalent form [ expression ] (the spaces inside the brackets are required). The available conditions can be grouped into:

  • Numerical comparison: -eq, -ge, -gt, -le, -lt, -ne, corresponding to: equal to, greater than or equal to (ge), greater than, less than or equal to (le), less than, not equal to.

  • String comparison: =, !=, -n, -z, corresponding to character strings that are: equal, different, of length greater than 0, of length zero (empty).

  • File comparison: -d, -f, -r, -s, -w, -x. The file is: a directory, an ordinary file, readable, not empty, writable, executable.

  • Booleans between expressions: !, -a, -o, conditions of not, and, and or.
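
A few of these conditions in use, as a minimal sketch. The values are arbitrary, and /etc is assumed to exist as a directory, as it does on GNU/Linux systems.

```shell
#!/bin/bash
count=3
name="juan"
target=/etc            # assumed to exist, as on any GNU/Linux system

[ "$count" -gt 2 ]   && echo "count is greater than 2"
[ "$name" = "juan" ] && echo "name is juan"
[ -n "$name" ]       && echo "name is not empty"
[ -d "$target" ]     && echo "$target is a directory"

# Booleans: ! negates, -a is "and", -o is "or".
[ ! -f "$target" -a "$count" -le 5 ] && echo "not an ordinary file, and count <= 5"

test "$count" -eq 3
echo "exit status of test: $?"     # status 0 means the condition was true
```

Quoting the variables inside the brackets keeps the test well formed even when a value is empty or contains spaces.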

3.5.4.3. Control structures

Regarding the script's internal programming, we need to think that we are basically going to find:

  • Commands of the operating system itself.

  • Internal commands of the Bash (see: man bash).

  • Programming control structures (for, while...), with the syntax of Bash.

The basic syntax of control structures is as follows:

a) Structure if...then: evaluates the expression and, if it is true (exit status 0), the commands are executed.

if [ expression ]
	then
		commands
fi

b) Structure if...then...else: evaluates the expression and, if it is true, commands1 are executed; otherwise commands2 are executed:

if [ expression ]
	then
		commands1 
	else
		commands2
fi

c) Structure if...then...elif...else: same as above, with additional conditions.

if [ expression ]
	then
		commands
	elif [ expression2 ]
	then 
		commands
	else
		commands
fi
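
A concrete instance of the if...elif...else structure, as a sketch: the helper name describe_path is hypothetical, and /nonexistent-example is assumed not to exist.

```shell
#!/bin/bash
# Classify a file system path given as first argument.
describe_path() {
    if [ -d "$1" ]
    then
        echo "directory"
    elif [ -f "$1" ]
    then
        echo "regular file"
    else
        echo "missing or special"
    fi
}

describe_path /                       # prints directory
describe_path /nonexistent-example    # prints missing or special
```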

d) Structure case: a multiple selection structure that runs the commands of the first pattern matching the value given after case.

Example 3-7. Note

Shells such as Bash offer a wide set of control structures that make them comparable to any other language.

case string1 in 
	str1)
		commands;; 
	str2)
		commands;;
	*)
		commands;;
esac
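
As an illustration of case, the sketch below chooses a description from a file name. The helper name file_kind and the categories are hypothetical; note that the patterns may contain wildcards and that | separates alternatives.

```shell
#!/bin/bash
# Map a file name to a description according to its extension.
file_kind() {
    case $1 in
        *.sh)
            echo "shell script";;
        *.txt|*.md)
            echo "text";;
        *)
            echo "unknown";;
    esac
}

file_kind backup.sh     # prints shell script
file_kind notes.md      # prints text
file_kind photo.png     # prints unknown
```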

e) Loop for: the variable takes the value of each element of the list in turn:

for var1 in list 
do
		commands 
done
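
Two small for loops, as a sketch with arbitrary lists: the first adds up numbers and the second builds a string from shell names.

```shell
#!/bin/bash
total=0
for n in 1 2 3 4
do
    total=$((total + n))          # arithmetic expansion adds each element
done
echo "total: $total"              # prints total: 10

shells=""
for name in sh bash csh ksh
do
    shells="$shells/bin/$name "   # build a space-separated list of paths
done
echo "$shells"
```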

f) Loop while: repeats the commands while the expression is true:

while [ expression ]
do
		commands 
done

g) Loop until, until the expression is fulfilled:

until [ expression ] 
do
		commands 
done
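
The two loops differ only in the sense of the test, as this sketch with simple counters shows.

```shell
#!/bin/bash
i=1
while [ "$i" -le 3 ]          # repeat while the condition holds
do
    echo "while: i=$i"
    i=$((i + 1))
done

j=3
until [ "$j" -eq 0 ]          # repeat until the condition becomes true
do
    echo "until: j=$j"
    j=$((j - 1))
done
```

Each loop prints three lines; afterwards i is 4 and j is 0.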

h) Declaration of functions:

fname() { 
		commands
}

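Bash functions do not declare their parameters in the header; a function that takes arguments is declared in the same way and reads them inside its body as $1, $2 ... $N:

fname2() {
		commands using $1, $2 ... $N
}

Functions are called with fname, or with fname2 p1 p2 p3 ... pN when parameters are passed.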
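
A minimal sketch of a function that receives parameters, which it reads inside its body as $1 and $2. The function name max_of is hypothetical.

```shell
#!/bin/bash
# Print the larger of two integers passed as arguments.
max_of() {
    if [ "$1" -ge "$2" ]
    then
        echo "$1"
    else
        echo "$2"
    fi
}

max_of 7 12                   # prints 12
largest=$(max_of 30 4)        # capture the function's output in a variable
echo "largest: $largest"      # prints largest: 30
```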