This lesson is being piloted (Beta version)

BASH Programming for Workflow Management: Glossary

Key Points

Dates, Scheduling, and Downloading Files
  • Using date for your date and time calculations will save you a lot of hassle

  • crontab can be used for scheduling regular tasks

  • wget can be used for scripting your data download process

  • Tool syntax is not always consistent across different Unix flavours
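
As a rough illustration of the first and last points, the sketch below adds days to a date using whichever date syntax is available. The helper name add_days is hypothetical and not part of the lesson. The GNU branch assumes a Linux-style date with -d; the other branch assumes BSD/macOS date with -j, -f, and -v. Other systems (for example BusyBox) may need different options.

```bash
#!/usr/bin/env bash
# add_days is a hypothetical helper used only for illustration.
# Usage: add_days YYYY-MM-DD N   (prints the date N days later)
add_days() {
    local start=$1 n=$2
    if date --version >/dev/null 2>&1; then
        # GNU date: -d parses a free-form date string.
        date -u -d "$start +$n days" +%Y-%m-%d
    else
        # BSD/macOS date: -j leaves the clock alone, -f gives the input
        # format, -v adjusts the parsed date.
        date -u -j -f "%Y-%m-%d" -v "+${n}d" "$start" +%Y-%m-%d
    fi
}

# 2020 is a leap year, so this crosses 29 February.
add_days 2020-02-28 2
```

Letting date do the calendar arithmetic avoids hand-written rules for month lengths and leap years.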

Variables and Arrays
  • BASH variables can store a single piece of information

  • BASH arrays can store an indexed list of information

  • {} denotes a code block, and is essential for referencing arrays

  • [@] denotes all of an array, while [X] denotes the value at position X

  • ${#VAR} returns the length of the string

  • ${#ARRAY[@]} returns the number of items in the array
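
The points above can be tried out with a short sketch like the following. The variable names and values are made up for illustration.

```bash
#!/usr/bin/env bash
# All names and values here are illustrative.
station="Manchester"            # a variable holds a single value
years=(2018 2019 2020 2021)     # an array holds an indexed list

echo "${years[0]}"      # value at position 0 (indices start at 0)
echo "${years[@]}"      # all of the array
echo "${#station}"      # length of the string
echo "${#years[@]}"     # number of items in the array

# Without the braces, bash expands $years (the first element) and then
# appends the literal text "[1]".
echo "$years[1]"
```

The last line is the reason the braces matter: it prints the first element followed by the text [1], not the second element.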

Subshells and Functions
  • environment variables are accessible by all programs run from that shell

  • export turns a (private) shell variable into an environment variable

  • () creates a subshell

  • {} creates a code block within the current shell

  • $() allows a subshell to be used for command substitution, for saving the output as a variable

  • <() allows a subshell to be used for process substitution, for passing the output to another program

  • read var1 var2 <<<$() can be used to save more than one output from a command substitution

  • function NAME() {} creates a function code block

  • code inside a function is not executed on creation, and can be used repeatedly after creation

  • . script.sh and source script.sh enable the importing of code and variables from other scripts
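
A minimal sketch of several of these points follows. Every variable and function name in it is invented for illustration, and it assumes bash is available on the PATH for the child-process check.

```bash
#!/usr/bin/env bash
# All names here are illustrative.

# Shell variable vs environment variable: only the exported one reaches
# a child process.
lesson_private="hidden"
export lesson_shared="visible"
child=$(bash -c 'echo "${lesson_private:-unset} $lesson_shared"')

# () runs in a subshell, so the assignment is lost; {} runs in this shell.
count=1
( count=2 )
after_subshell=$count
{ count=3; }
after_block=$count

# Command substitution saves output; read splits it into several variables.
read -r first second <<< "$(echo "alpha beta")"

# Process substitution passes output to a program that expects files.
common=$(comm -12 <(printf 'a\nb\nc\n') <(printf 'b\nc\nd\n'))

# The body runs only when the function is called, as often as needed.
function greet() {
    echo "Hello, $1"
}
greeting=$(greet "world")

echo "$child | $after_subshell $after_block | $first $second | $greeting"
echo "$common"
```

Sourcing a file with . or source would make a function such as greet available in the calling shell in the same way.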

BASH Logic and Maths
  • (( )) is math context, and enables the use of C’s integer arithmetic operators

  • $(( )) can be used in the same way as a command substitution

  • bash interprets numbers with a leading zero (such as 08) as base 8; prefix strings or variables with 10# to force base 10

  • if statements can use both (( )) and [[ ]] commands for expression testing. These use different syntax, so be careful to check your code!
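
The following sketch puts these points side by side. The variable names and values are illustrative only.

```bash
#!/usr/bin/env bash
# All names and values here are illustrative.
a=7
b=2
(( total = a + b ))         # math context: no $ needed on variable names
quotient=$(( a / b ))       # integer division discards the remainder
remainder=$(( a % b ))

# A leading zero means base 8, so "08" is invalid unless base 10 is forced.
month="08"
next_month=$(( 10#$month + 1 ))

# (( )) uses C-style operators; [[ ]] uses test-style ones such as -gt and ==.
if (( total > 8 )); then size="big"; else size="small"; fi
if [[ $month == "08" && $total -gt 8 ]]; then label="August, big"; fi

echo "$total $quotient $remainder $next_month $size $label"
```

Note that < and > inside [[ ]] compare strings, not numbers, which is one of the syntax differences worth checking.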

Advanced Loops
  • seq [first] [increment] last creates a sequence of (real) numbers

  • for loops can be controlled using seq or C-style notation

  • ${#array[@]} is useful for setting these sequences

  • while loops use conditional statements, and aren’t fixed in length like for loops

  • while loops can be used for process control
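
One possible sketch of these loop styles is shown below. The file names are hypothetical, and the background sleep merely stands in for a long-running job. Polling with kill -0 is one common way to wait on a process, not necessarily the approach used in the lesson.

```bash
#!/usr/bin/env bash
files=(jan.csv feb.csv mar.csv)     # hypothetical file names

# seq first increment last
evens=( $(seq 2 2 8) )

# C-style loop, bounded by the array length.
listing=""
for (( i = 0; i < ${#files[@]}; i++ )); do
    listing+="$i:${files[i]} "
done

# The same indices generated with seq.
for i in $(seq 0 $(( ${#files[@]} - 1 ))); do
    echo "file $i is ${files[i]}"
done

# Process control: poll until a background job has finished.
sleep 2 &                           # stand-in for a long-running job
pid=$!
polls=0
while kill -0 "$pid" 2>/dev/null; do
    (( polls += 1 ))
    sleep 1
done
wait "$pid"                         # collect the job's exit status
echo "job finished after $polls polls"
```

The for loops run a number of times fixed by the array length, while the while loop runs for as long as the job happens to take.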

Sed and regular expressions
  • sed performs basic text transformations on an input stream

  • The basic usage is sed -e 's/pattern/replacement/' input.txt

  • Multiple scripts can be chained, by using additional -e 's/pattern/replacement/' declarations

  • By default only the first instance of the pattern on each line is replaced; use s/pattern/replacement/g to replace every match

  • Extended regular expressions can be enabled with the -E flag

  • Specify character ranges using [A-Z0-9]

  • Repeat single characters or ranges by appending *, +, ?, or {RANGE}

  • Match the start and end of lines using ^ and $, respectively

  • Special characters can be matched if they are escaped by prepending \

  • Capture subexpressions with ( ), and back-reference these in your pattern or replacement text using \1-\9

  • Regex can be used in logic tests, with the =~ operator

  • Regex are easier to write than to read. Document yours well!
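
The sketch below works through several of these points on made-up file names. It assumes a sed that accepts -E, as current GNU and BSD versions do.

```bash
#!/usr/bin/env bash
line="temp_2020-03-15.csv temp_2021-07-01.csv"   # hypothetical file names

# Without g, only the first match on the line is replaced.
first_only=$(echo "$line" | sed -e 's/temp/rain/')
with_g=$(echo "$line" | sed -e 's/temp/rain/g')

# Chain scripts with repeated -e; the dot is escaped to match literally.
chained=$(echo "$line" | sed -e 's/temp/rain/g' -e 's/\.csv/.txt/g')

# -E: ranges, {N} repeats, ^ and $ anchors, ( ) captures, \1-\3 references.
# Reorder YYYY-MM-DD as DD/MM/YYYY, using | as the delimiter so that the
# slashes in the replacement need no escaping.
reordered=$(echo "2020-03-15" |
    sed -E 's|^([0-9]{4})-([0-9]{2})-([0-9]{2})$|\3/\2/\1|')

# =~ tests against an extended regex; captures are stored in BASH_REMATCH.
pattern='^([a-z]+)_([0-9]{4})'    # prefix, underscore, four-digit year
if [[ "temp_2020-03-15.csv" =~ $pattern ]]; then
    prefix=${BASH_REMATCH[1]}
    year=${BASH_REMATCH[2]}
fi

echo "$first_only"
echo "$with_g"
echo "$chained"
echo "$reordered $prefix $year"
```

Keeping the pattern in a commented variable, as done for =~ here, is one way to follow the advice about documenting regular expressions.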

Symbolic Links
  • Symbolic links to objects (files or directories) can be created using ln -s

  • These are links to the object, not its contents, so the contents can change or be deleted

  • Symbolic links can cross physical disks, and so are useful in networked filesystems

  • Caution must be exercised when following .. paths across symbolic links

  • They are most useful for linking to, and/or renaming, input and configuration files or directories
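
A small sketch of this behaviour, run in a scratch directory, is given below. All paths and file names are hypothetical, and it assumes mktemp -d is available.

```bash
#!/usr/bin/env bash
# Work in a scratch directory; all paths below are hypothetical.
scratch=$(mktemp -d) || exit 1
mkdir -p "$scratch/data/run01" "$scratch/work"
echo "v1" > "$scratch/data/run01/config.txt"

# ln -s TARGET LINK_NAME: link (and rename) a file, then link a directory.
ln -s "$scratch/data/run01/config.txt" "$scratch/work/settings.txt"
ln -s "$scratch/data/run01" "$scratch/work/current"

# The link points at the object, so changed contents show through it.
echo "v2" > "$scratch/data/run01/config.txt"
via_link=$(cat "$scratch/work/settings.txt")

# ".." across a link: cd follows the logical path unless given -P.
cd "$scratch/work/current" || exit 1
logical_parent=$(cd .. && pwd)
physical_parent=$(cd -P .. && pwd -P)

# Deleting the target leaves a dangling link behind.
rm "$scratch/data/run01/config.txt"
dangling=no
if [[ -L "$scratch/work/settings.txt" && ! -e "$scratch/work/settings.txt" ]]; then
    dangling=yes
fi

echo "$via_link ${logical_parent##*/} ${physical_parent##*/} $dangling"

# Tidy up the scratch directory.
cd / && rm -rf "$scratch"
```

The two parent directories differ because the shell remembers the path used to enter the link, whereas the -P options resolve the real location on disk.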

Glossary

Stuff in a glossary.