SCL


1 Introduction

General shell operations

  1. The shell reads its input from a file, from the -c option, or from system() or popen()
  2. Breaks input into tokens
  3. Parses input into simple commands and compound commands
  4. Performs word expansions
  5. Performs redirection
  6. Executes functions, scripts, builtins or executable files, indexing its positional parameters from 1 to n, with the name of the script as index 0
  7. Waits for the command to complete and collects its exit status
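
A minimal sketch of the indexing described in step 6, using the -c option. The operands demo, first and second are arbitrary placeholders chosen for this illustration.

```shell
# With -c, the first operand after the command string becomes $0,
# and the following operands become $1, $2, ...
indexed=$(sh -c 'printf "%s|%s|%s|%s" "$0" "$1" "$2" "$#"' demo first second)
printf '%s\n' "$indexed"
```

$# counts only the parameters from 1 to n, so the name at index 0 is not included.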

2 Quoting

Quoting is used to remove the special meaning of certain characters or words to the shell.

These need to be quoted to keep their literal meaning :

| & ; < > ( ) $ ` \ " ' <space> <tab> <newline>

And these ones too, under certain circumstances :

* ? [ # ~ = %

Backslash : Escape character

An unquoted backslash preserves the literal value of the following character

Exception for <newline> : a backslash followed by <newline> is interpreted as a line continuation -> both are removed, so it cannot serve as a token separator.

$ echo toto\
> titi
tototiti

Single-Quotes

Single-quotes preserve the literal value of all characters contained inside them.
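
A short sketch of the difference between single-quotes and a backslash. The variable names literal and escaped are arbitrary and exist only for this illustration.

```shell
# Inside single-quotes, $, ` and \ lose their special meaning.
literal='$HOME `date` \n'
printf '%s\n' "$literal"

# An unquoted backslash protects only the character that follows it.
escaped=\$HOME
printf '%s\n' "$escaped"
```

Neither assignment triggers an expansion: both values still contain the $ character.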

Double-quotes

$

Keeps its special meaning for parameter expansion, command substitution and arithmetic expansion.

Characters between $( and ) are not affected by the double-quotes and only define the command whose output replaces the $(…) (Token recognition rules are applied to find the matching ‘)’)

The characters between ${ and the matching } may contain an even number of unescaped double-quotes or single-quotes. A backslash can escape a ‘{’ or a ‘}’

`

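Retains its special meaning. These cases have undefined results : a. A single-quoted or double-quoted string that begins, but does not end, within the `…` sequence b. A `…` sequence that begins, but does not end, within the same double-quoted string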

\

Retains its special meaning only when followed by these characters : $ ` " \ <newline>

3 Token Recognition

Lines are parsed using two modes : ordinary token recognition and processing of here-documents

When an io_here is recognized, subsequent lines are parsed according to rules of Here Document.

Else :

Token Recognition rules

  1. If the end of input is reached, the current token is delimited. If there is no current token, the end-of-input indicator is returned as a token.

  2. If the previous character was used as part of an operator and the current character is not quoted and can be used with the current characters to form an operator, it is used as part of that operator token.

  3. If the previous character was used as part of an operator and the current character cannot be used with the current characters to form an operator, the operator token containing the previous character is delimited.

  4. If the current character is a ', " or \ and is not quoted, it affects quoting for the following characters up to the end of the quoted text.

  5. If the current character is a $ or a ` and is not quoted, the shell identifies the start of a parameter expansion, command substitution or arithmetic expansion.

  6. If the current character is not quoted and can be the beginning of a new operator, the current token is delimited and the current character is used as the beginning of a new operator token.

  7. If the current character is an unquoted <newline>, the current token is delimited.

  8. If the current character is an unquoted <blank>, the token containing the previous character is delimited and the current character is discarded.

  9. If the previous character was part of a word, the current character is appended to that word.

  10. If the current character is a ‘#’, it and all following characters up to (but excluding) the next <newline> are discarded as a comment.

  11. The current character is used as the start of a new word.
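
The order of rules 9 and 10 can be observed with a one-line sketch. The variable name token is arbitrary.

```shell
# The first '#' follows a word character, so rule 9 appends it to the word.
# The second '#' starts a new token, so rule 10 discards the rest of the line.
token=foo#bar # this text is discarded
printf '%s\n' "$token"
```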

Alias substitution

After a token is delimited, but before applying the Shell Grammar, a word that is identified as a command name is checked to see whether it is a valid alias name.

To avoid recursive aliasing : if the shell is not currently processing an alias of the same name, the word is replaced by the value of the alias; otherwise it is not replaced

4 Reserved Words

! { } case do done elif else esac fi for if in then until while

Reserved words are recognized only when not quoted and when they are used as :

These are recognized as reserved words on specific implementations

[[ ]] function select

Words that are the concatenation of a name and a ‘:’ are reserved.

5 Parameters and Variables

Parameter : A name, a number or one special character

Variable : A parameter denoted by a name

Positional Parameters

Special Parameters

The special parameters and the values to which they expand.

@

*

A NULL IFS is valid : the parameters are then concatenated without separators

#

?

-

$

0
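
A hedged sketch of a few of these special parameters at work. show_params is a hypothetical helper written only for this illustration, and the default IFS is assumed.

```shell
# show_params is a hypothetical helper used only for this illustration.
show_params() {
    count=$#                 # number of positional parameters
    joined="$*"              # one field, joined with the first character of IFS
    fields=0
    for arg in "$@"; do      # one field per parameter, inner spaces preserved
        fields=$((fields + 1))
    done
}

show_params "a b" c
printf 'count=%s joined=%s fields=%s\n' "$count" "$joined" "$fields"

status=0
( exit 3 ) || status=$?      # $? holds the exit status of the most recent pipeline
printf 'status=%s pid=%s name=%s\n' "$status" "$$" "$0"
```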

Shell Variables

Environment variables

ENV

HOME

IFS

LANG

LC_ALL

LC_COLLATE

LC_CTYPE

LC_MESSAGES

LINENO

NLSPATH

PATH

PPID

PS1

PS2

PS4

PWD

6 Word expansion

Order of word expansion

  1. Tilde expansions, parameter expansions, command substitutions, arithmetic expansions (beginning to end)
  2. Field splitting on the fields generated by 1.
  3. Pathname expansion
  4. Quote removal (always performed last)

If a complete expansion results in an empty field, that field is removed from the list, unless it was expanded within double-quotes
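
The removal of empty fields can be checked by counting arguments. count_fields is a hypothetical helper, not part of the shell.

```shell
count_fields() { nfields=$#; }   # hypothetical helper: records its argument count

empty=
count_fields $empty              # unquoted empty expansion: the field is removed
unquoted=$nfields
count_fields "$empty"            # quoted: the empty field is kept
quoted=$nfields
printf 'unquoted=%s quoted=%s\n' "$unquoted" "$quoted"
```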

Tilde expansion

Tilde prefix : consists of the characters from an unquoted ~ at the beginning of a word up to the first unquoted ‘/’ (or to the end of the word)

~/foo -> $HOME/foo

~login_x/foo -> The subdirectory foo of the home directory of the user login_x
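
A small sketch of the first form. The value /tmp/demo_home is a made-up directory name, assigned to HOME inside a subshell so that the current environment is left untouched.

```shell
# /tmp/demo_home is a hypothetical value used only for this illustration.
tilde_path=$(HOME=/tmp/demo_home; printf '%s' ~/foo)      # unquoted ~ : expanded
quoted_path=$(HOME=/tmp/demo_home; printf '%s' "~/foo")   # quoted ~ : kept literally
printf '%s\n%s\n' "$tilde_path" "$quoted_path"
```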

Parameter expansion

${expression}

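expression : All characters up to the matching ‘}’; an escaped or quoted ‘}’ does not count as the matching one

In the simplest form (${parameter}), braces are optional except for positional parameters with more than one digit, or when parameter is a name followed by a character that could be interpreted as part of the name

When no braces enclose the parameter, the expansion uses the longest valid name, whether that variable exists or not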

Parameter expansion within double quotes

Special formats

${parameter:-word}

If foo is set

$ foo=baz
$ echo ${foo-bar}xyz}
bazxyz}

If foo is unset

$ unset foo
$ echo ${foo-bar}xyz}
barxyz}

${parameter:=word}

$ unset X
$ echo ${X:=abc}
abc

${parameter:?word}

$ unset posix
$ echo ${posix:?}
sh: posix: parameter null or not set

${parameter:+word}

With a colon : test if the parameter is unset or NULL

Without colon : test if parameter is only unset

${#parameter}
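
The formats above can be compared side by side. This is only a sketch: demo is an arbitrary variable name, and the words fallback, alternate and assigned are placeholders.

```shell
unset demo                       # demo is a hypothetical variable name
a=${demo:-fallback}              # unset -> "fallback", demo stays unset
b=${demo:+alternate}             # unset -> empty
demo=
c=${demo-fallback}               # set but NULL, no colon -> empty
d=${demo:-fallback}              # set but NULL, with colon -> "fallback"
e=${demo:=assigned}              # NULL -> "assigned" is also stored in demo
f=${demo:+alternate}             # set and not NULL -> "alternate"
g=${#demo}                       # length of the value of demo
printf '%s|%s|%s|%s|%s|%s|%s\n' "$a" "$b" "$c" "$d" "$e" "$f" "$g"
```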

Pattern matching on parameters

Where word is expanded to produce a pattern

${parameter%word}

$ x=file.c
$ echo ${x%.c}.o
file.o

${parameter%%word}

$ x=posix/src/std
$ echo ${x%%/*}
posix

${parameter#word}

$ x=$HOME/src/cmd
$ echo ${x#$HOME}
/src/cmd

${parameter##word}
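
A sketch applying the four forms to the same value, to contrast smallest and largest matches. The path is invented for the illustration.

```shell
path=/usr/local/lib/libdemo.so.1     # hypothetical path
shortest_suffix=${path%.*}           # removes the smallest suffix matching .*
longest_suffix=${path%%.*}           # removes the largest suffix matching .*
shortest_prefix=${path#*/}           # removes the smallest prefix matching */
longest_prefix=${path##*/}           # removes the largest prefix matching */
printf '%s\n' "$shortest_suffix" "$longest_suffix" "$shortest_prefix" "$longest_prefix"
```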


Command substitution

Command substitution : The output of a command is substituted in place of the command itself

Format : $(command) or `command`

Within backquoted command substitution, \ keeps its literal meaning except before a \, a backquote or a $
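
A sketch of the two formats with one level of nesting. The path /a/b/c is arbitrary and does not need to exist, since dirname and basename only manipulate strings.

```shell
# $(...) nests without any extra escaping.
parent=$(basename "$(dirname /a/b/c)")
# The backquoted form needs \` for the inner substitution.
parent_bq=`basename \`dirname /a/b/c\``
printf '%s %s\n' "$parent" "$parent_bq"
```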

Arithmetic expansion

Arithmetic expansion : Mechanism to evaluate an arithmetic expression and substitute its value

Format : $((expression))

Changes to a variable in an arithmetic expression remain in effect after the expansion, as with "${x=value}"

If shell variable x contains an integer constant, "$((x))" and "$(($x))" return the same value
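
A minimal sketch of both remarks. The variable names are arbitrary.

```shell
x=5
by_name=$((x))           # the variable is referenced by name
by_value=$(($x))         # the variable is expanded before evaluation
: $((x = x + 1))         # the assignment persists after the expansion
doubled=$((x * 2))
printf 'by_name=%s by_value=%s x=%s doubled=%s\n' "$by_name" "$by_value" "$x" "$doubled"
```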

Field Splitting

Occurs after parameter expansion, command substitution and arithmetic expansion

The shell uses each character of IFS as a delimiter and splits the results of parameter expansion and command substitution into fields

  1. IFS is NULL : No field splitting
  2. IFS is unset, or is <space><tab><newline> : any sequence of these characters is ignored at the beginning and end of input, and delimits fields within the input

<newline><space><tab>foo<tab><tab>bar<space> Delimits two fields : foo and bar

  3. Otherwise,
     a. IFS white space is ignored at the beginning and end of input
     b. Each occurrence of an IFS character that is not IFS white space, along with any adjacent IFS white space, delimits a field
     c. Non-zero-length IFS white space delimits a field

    IFS white space : any sequence of white space characters that are in the IFS value (<space>, <tab>, etc)
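
The three cases can be sketched by counting the fields produced. count_fields is a hypothetical helper and the sample strings are invented; the original IFS is restored at the end.

```shell
count_fields() { nfields=$#; }       # hypothetical helper: records its argument count
saved_ifs=$IFS

line='  foo   bar '
count_fields $line
default_count=$nfields               # default IFS: fields foo and bar

IFS=:
record='a::b'
count_fields $record
colon_count=$nfields                 # each ':' delimits a field: a, an empty field, b

IFS=
count_fields $line
null_count=$nfields                  # NULL IFS: no splitting, a single field

IFS=$saved_ifs
printf '%s %s %s\n' "$default_count" "$colon_count" "$null_count"
```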

Pathname expansion

Occurs after field splitting, unless set -f is in effect

Fields in command line are expanded using algorithm in Pattern Matching Notation (13)
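
A sketch of the effect of set -f, assuming the widely available (but non-POSIX) mktemp utility to create a scratch directory. count_fields and the file names are hypothetical.

```shell
count_fields() { nfields=$#; }       # hypothetical helper: records its argument count
workdir=$(mktemp -d)                 # scratch directory (mktemp is assumed to be available)
: > "$workdir/a.txt"
: > "$workdir/b.txt"

count_fields "$workdir"/*.txt
expanded=$nfields                    # the pattern is replaced by the matching pathnames
set -f
count_fields "$workdir"/*.txt
disabled=$nfields                    # the pattern is left as a single literal field
set +f

rm -r "$workdir"
printf 'expanded=%s disabled=%s\n' "$expanded" "$disabled"
```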

Quote removal

The quote characters \, ' and " are removed unless they are themselves quoted


7 Redirection

Redirection format : [n]redir-op word

With n the optional file descriptor

n should not be quoted

echo \2>a : writes "2" into file a
echo 2\>a : writes 2>a to stdout (a quoted redir-op is not recognized as a redirection)

0: stdin
1: stdout
2: stderr

Redirecting Input

[n]<word

The file named by the expansion of word is opened for reading on file descriptor n, or on stdin if n is not specified

Redirecting Output

[n]>word
[n]>|word

The file named by the expansion of word is created and opened for output on file descriptor n, or on stdout if n is not specified

The file is truncated if it already exists (>| does so even when the noclobber option, set -C, is in effect)

Appending redirected Output

[n]>>word

The file named by word is opened for output as if open() were called with the O_APPEND flag
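
A sketch contrasting truncation and appending, assuming the widely available (but non-POSIX) mktemp utility for the scratch file.

```shell
logfile=$(mktemp)                    # scratch file (mktemp is assumed to be available)
printf 'first\n'  > "$logfile"       # truncates the file, then writes
printf 'second\n' > "$logfile"       # truncates again: "first" is lost
printf 'third\n' >> "$logfile"       # appends after "second"
contents=$(cat "$logfile")
rm -f "$logfile"
printf '%s\n' "$contents"
```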

Here-Documents

<< and <<- allow redirection of lines in “here-documents” to the input of a command

[n]<<word
    here-document
delimiter

n represents a file descriptor, or stdin if no number is specified

<<- : leading <tab>s are stripped from input lines and from the line containing the delimiter

$ cat <<eof1; cat <<eof2
heredoc> Hi,
heredoc> eof1
heredoc> Helene.
heredoc> eof2
Hi,
Helene.

Duplication of Input File Descriptor

[n]<&word

Duplication of Output File Descriptor

[n]>&word

Same as above, but with stdout as default file descriptor when n is not specified
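
A sketch of output duplication with the common 2>&1 form. emit is a hypothetical helper that writes one line to each stream.

```shell
# emit is a hypothetical helper that writes one line to stdout and one to stderr.
emit() {
    printf 'to stdout\n'
    printf 'to stderr\n' >&2
}

only_out=$(emit 2>/dev/null)         # stderr is discarded
both=$(emit 2>&1)                    # fd 2 becomes a copy of fd 1
# Redirections are processed left to right: fd 2 copies the original stdout
# before fd 1 is sent to /dev/null.
only_err=$(emit 2>&1 >/dev/null)
printf '%s\n' "$only_out" "$both" "$only_err"
```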

Open fd for Reading or Writing

[n]<>word

File ‘word’ opened for reading and writing on fd n (or stdin if n not specified)

File created if it does not exist

8 Exit Status and Errors

Consequences of Shell Errors

For a non-interactive shell, an error encountered by a special builtin or by another utility causes the shell to write an error message to stderr and to exit as shown below

Error                                            Special builtins   Other utilities
Shell language syntax error                      Shall exit         Shall exit
Utility syntax error (option or operand error)   Shall exit         Shall not exit
Redirection error                                Shall exit         Shall not exit
Variable assignment error                        Shall exit         Shall not exit
Expansion error                                  Shall exit         Shall exit
Command not found                                N/A                May exit
Dot script not found                             Shall exit         N/A

A "Shall exit" or "May exit" error in a subshell causes the subshell to exit with a non-zero status, but does not cause the script containing the subshell to exit
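
A sketch of the subshell case, using an expansion error as the trigger. The variable name demo is arbitrary; the exact non-zero status differs between shells, so only its being non-zero is relied upon.

```shell
sub_status=0
# The expansion error makes the subshell exit before it reaches "exit 0".
( unset demo; : "${demo:?not set}"; exit 0 ) 2>/dev/null || sub_status=$?
after=reached                        # the enclosing script keeps running
printf 'subshell status=%s, script: %s\n' "$sub_status" "$after"
```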

Exit status for Commands

9 Shell Commands

Commands are :

Exit status of a command = exit status of last simple command executed by the command

Simple commands

Sequence of (optional) variable assignments and redirections, followed by (optional) words and redirections. Terminated by a control operator (‘;’ or <newline>)

Before the simple command is executed, the following are performed :

  1. Variable assignments and redirections are saved for steps 3 and 4
  2. The other words are expanded. The first field is the command name and the others are its arguments
  3. Redirections are performed
  4. Each variable assignment is expanded before the value is assigned

Steps 3 and 4 may be reversed for special builtins
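
A sketch of how a variable assignment behaves with and without a command name. GREETING is a hypothetical variable name.

```shell
unset GREETING                        # GREETING is a hypothetical variable name
# With a command name, the assignment is placed in the environment of that command only.
seen=$(GREETING=hello sh -c 'printf "%s" "$GREETING"')
after=${GREETING-unset}               # still unset in the current shell
# With no command name, the assignment affects the current environment.
GREETING=hi
now=$GREETING
printf 'seen=%s after=%s now=%s\n' "$seen" "$after" "$now"
```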

No command name :

Command error and Execution

Operations when a simple command results in a command name and an (optional) list of arguments

  1. No slashes in command name
     a. Command name is a special builtin : the builtin is called
     b. Command name is a function known to the shell : the function is called
     c. Otherwise, the command is searched for using PATH
        i. Search successful (the path may be saved) : a regular builtin is called, any other utility is executed in a separate environment (as if with execve())
        ii. Search unsuccessful : exit status 127
  2. At least one slash in name
     a. Executed in a separate environment (as if with execve())

If execve() fails with ENOEXEC, a shell is invoked with the command name as its first operand
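
The unsuccessful search can be observed from a child shell. The command name no_such_command_for_demo is made up and is assumed not to exist on PATH.

```shell
status=0
# no_such_command_for_demo is a hypothetical name assumed to be absent from PATH.
sh -c 'no_such_command_for_demo' 2>/dev/null || status=$?
printf 'status=%s\n' "$status"
```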

Pipelines

[!] command1 [ | command2 …]

stdout of first command becomes stdin of second command

If there is a ‘!’, exit status is the logical NOT of exit status of last command
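
A short sketch of both points, assuming the pipefail option found in some shells is not enabled. The variable names are arbitrary.

```shell
# stdout of each command becomes stdin of the next one.
first=$(printf 'b\na\nc\n' | sort | head -n 1)

plain=0
false | true || plain=$?             # status of the last command: stays 0
negated=0
! false | true || negated=$?         # logical NOT of that status
printf 'first=%s plain=%s negated=%s\n' "$first" "$plain" "$negated"
```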

Lists

AND-OR list : sequence of one or more pipelines separated by && or ||

list : sequence of one or more AND-OR lists separated by ‘;’ or ‘&’

&& and || have equal precedence (left associativity)

compound list : sequence of lists separated by <newline>s, and optionally preceded and followed by <newline>s

Asynchronous lists

command1 & [command2 & … ]

Asynchronous : shell does not wait for the command to finish before executing next command.

The PID of the last command in the asynchronous list remains known to the shell until :

  1. The command terminates and the script waits for its PID
  2. Another asynchronous list is invoked before $! is expanded in the current execution environment

Exit status : 0
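
A sketch of an asynchronous list whose status is collected later. It relies on the wait utility, which is not described in these notes, and the exit value 7 is arbitrary.

```shell
( exit 7 ) &                         # asynchronous list: the shell does not wait
launch_status=$?                     # exit status of the asynchronous list itself
bg_pid=$!                            # PID of the last asynchronous command
wait_status=0
wait "$bg_pid" || wait_status=$?     # wait reports the status of the command
printf 'launch=%s wait=%s\n' "$launch_status" "$wait_status"
```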

Sequential lists

command1 [; command2] …

AND lists

command1 [ && command2] …

command1 is first executed. If its exit status is zero, command2 is executed, etc until a command has a non-zero exit or there are no more commands in the list

OR lists

command1 [ || command2] …

command1 executed. If its exit status is non-zero, command2 is executed etc until a command returns a zero exit status
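
A sketch of AND-OR evaluation, including the equal precedence of && and ||. step is a hypothetical helper that records its name and returns the status it is given.

```shell
# step is a hypothetical helper: it appends its name to trace and returns $2.
step() { trace="$trace$1"; return "$2"; }

trace=
step a 0 && step b 1 && step c 0 || step d 0
and_or=$trace                        # b fails: c is skipped, d runs

trace=
step e 0 || step f 0 && step g 0
left_assoc=$trace                    # (e || f) && g : f is skipped, g still runs
printf '%s %s\n' "$and_or" "$left_assoc"
```

If && had higher precedence than ||, g would not have run in the second list.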

Compound commands

Compound commands can be followed by redirections -> each one applies to all commands within the compound command

Grouping commands

(compound-list) : executed in a subshell environment

{ compound-list;} : executed in the current environment

; is a delimiter to the } reserved word

The exit status of a grouping command is that of its compound-list
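
A sketch of the practical difference between the two grouping forms. The variable names are arbitrary.

```shell
value=outer
( value=inner )                      # parentheses: the assignment happens in a subshell
after_parens=$value
{ value=inner; }                     # braces: the assignment happens in the current shell
after_braces=$value
printf 'after_parens=%s after_braces=%s\n' "$after_parens" "$after_braces"
```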

For loop

for name [ in [word ... ]]
do
    compound-list
done

Omitting ‘in word…’ is equivalent to ‘in "$@"’

The exit status of the for loop is that of the last command it executes, or 0 if there are no items.
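
A sketch of the form without ‘in word…’. join_args is a hypothetical function written for this illustration.

```shell
# join_args is a hypothetical helper; its loop iterates over "$@" implicitly.
join_args() {
    joined=
    for arg
    do
        joined="$joined[$arg]"
    done
}

join_args "a b" c
printf '%s\n' "$joined"
```

The parameter "a b" stays a single item, as it would with an explicit in "$@".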

Case condition

case word in
    [(]pattern1) compound-list;;
    [[(]pattern[ | pattern] ... ) compound-list;;] ...
    [[(]pattern[ | pattern] ... ) compound-list]
esac

;; is optional for the last one

Executes the compound-list corresponding to the first pattern matched by ‘word’

Exit status is zero if no patterns matched, otherwise that of the last command executed
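
A sketch covering the optional ‘(’, the ‘|’ alternative and the optional final ;;. classify and the file names are hypothetical.

```shell
# classify is a hypothetical helper used only for this illustration.
classify() {
    case $1 in
        *.c | *.h) kind=source ;;
        (*.o)      kind=object ;;
        *)         kind=other
    esac
}

classify main.c; first=$kind
classify main.o; second=$kind
classify README; third=$kind
printf '%s %s %s\n' "$first" "$second" "$third"
```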

If condition

if compound-list
then
    compound-list
[elif compound-list
then
    compound-list] ...
[else
    compound-list]
fi

Its exit status is that of the last then or else compound-list executed, or 0 if none was executed
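
A sketch of the three branches and of the zero exit status when no branch runs. sign_of is a hypothetical function.

```shell
# sign_of is a hypothetical helper used only for this illustration.
sign_of() {
    if [ "$1" -lt 0 ]
    then
        sign=negative
    elif [ "$1" -eq 0 ]
    then
        sign=zero
    else
        sign=positive
    fi
}

sign_of -3; neg=$sign
sign_of 0;  zero=$sign
sign_of 8;  pos=$sign

if false; then :; fi
none_status=$?                       # no then or else compound-list was executed
printf '%s %s %s %s\n' "$neg" "$zero" "$pos" "$none_status"
```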

While loop

while compound-list-1
do
    compound-list-2
done

Exit status is that of the last compound-list-2 executed, or 0 if none was executed

Until loop

until compound-list-1
do
    compound-list-2
done

Exit status of the until loop is that of the last compound-list-2 executed, or 0 if none was executed
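
A sketch of the while and until loops as mirror images of each other. The counters n and m are arbitrary.

```shell
n=0
while [ "$n" -lt 3 ]                 # loops as long as the condition succeeds
do
    n=$((n + 1))
done

m=3
until [ "$m" -eq 0 ]                 # loops as long as the condition fails
do
    m=$((m - 1))
done

until true; do :; done
skipped_status=$?                    # the body never ran
printf 'n=%s m=%s skipped_status=%s\n' "$n" "$m" "$skipped_status"
```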

Function Definition Command

Function : user-defined name used as simple command to call compound command with new parameters

Format of a function definition command :

fname() compound-command[io-redirect …]

Exit status is zero if function successfully declared, or >0 otherwise
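
A sketch of a definition and a call. greet and caller are hypothetical functions; the return value 3 is arbitrary.

```shell
greet() { greeting="hello $1"; return 3; }
def_status=$?                        # status of the definition command itself

caller() {
    call_status=0
    greet world || call_status=$?    # inside greet, $1 is "world"
    restored=$1                      # back in caller, $1 is caller's own parameter
}
caller original
printf '%s|%s|%s|%s\n' "$def_status" "$greeting" "$call_status" "$restored"
```

The positional parameters of the caller are untouched once the function returns.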

10 Shell Grammar

Shell grammar lexical conventions

Input first recognized at character level

  1. A <newline> returns token NEWLINE
  2. An operator returns the token of that operator
  3. A string of digits followed by the delimiter ‘<’ or ‘>’ returns token IO_NUMBER
  4. Otherwise, TOKEN is returned

When more than one rule can apply, the highest-numbered one is applied

WORD tokens have to be expanded before execution

Shell grammar rules

  1. [Command name]
  2. [Redirection to/from filename]
  3. [Redirection from here-documents]
  4. [Case statement termination]
  5. [NAME in for]
  6. [Third word of for and case]
     a. case only - TOKEN is exactly in, the token for in is returned
     b. for only - TOKEN is exactly in or do, the token for that reserved word is returned

     A linebreak has to precede in and do

  7. [Assignment preceding command name]
     a. When first word
        - TOKEN does not contain ‘=’, Rule 1 is applied
        - Else 7b applies
     b. Not the first word
        - TOKEN begins with ‘=’, WORD is returned
        - Chars before ‘=’ form a valid name, ASSIGNMENT_WORD is returned
        - Otherwise, unspecified
  8. [NAME in function]
  9. [Body of function]

11 Signals and Error Handling

Command in asynchronous list : SIGINT and SIGQUIT from the keyboard are ignored, so they cannot stop the command (when job control is disabled)

12 Shell execution environment

Shell execution environment :

Utilities other than builtins :

Subshell : duplicate of the shell environment (except that traps that are not being ignored are reset to the default action)

Command substitution, commands grouped with parentheses and asynchronous lists -> executed in subshell environments

13 Pattern matching notation

14 Special built-in Utilities

built-in : Shell can execute the utility directly (no search)

In special built-ins :