Saturday, March 22, 2014

Bash Redirection and Piping Shortcuts

Redirecting both stdout and stderr

In order to redirect both standard output and standard error to a file, you would traditionally do this:

my_command > file 2>&1

A shorter way to write the same thing is by using &> (or >&) as shown below:

my_command &> file

Similarly, to append both standard output and standard error to a file, use &>> (available in bash 4 and later):

my_command &>> file

Piping both stdout and stderr

To pipe both standard output and standard error, you would traditionally do this:

my_command 2>&1 | another_command

A shorter way (also bash 4 and later) is to use |& as shown below:

my_command |& another_command
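
To see both shortcuts in action, here is a small self-contained example (the both function is just a throwaway helper for this demo):

# a throwaway helper that writes one line to stdout and one to stderr
both() { echo "to stdout"; echo "to stderr" >&2; }

both &> both.log      # both.log now contains both lines
both |& grep stderr   # the pipe sees both streams; prints "to stderr"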

Other posts you might like:
Shell Scripting - Best Practices
All posts with label: bash

Saturday, March 08, 2014

Use sqsh, not isql!

Sqsh is a SQL shell and a far superior alternative to the isql program supplied by Sybase. Its main advantage is that it allows you to combine SQL and unix shell commands! Here are a few reasons why I love it:

1. Pipe data to other programs
You can create a pipeline to pass SQL results to an external unix program such as less, grep or head. Here are a few examples:

# pipe to less to browse data
1> select * from data; | less

# a more complex pipeline which gzips data containing a specific word
2> select * from data; | grep -i foo | gzip -c > /tmp/foo.gz

# this example shows the use of command substitution
3> sp_who; | grep `hostname`

2. Redirect output to file
Just like in a standard unix shell, you can redirect the output of a SQL command to a file:

# write the output to file
1> sp_helptext my_proc; > /tmp/my_proc.txt
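
Since sqsh's redirection follows the unix shell's conventions, appending with >> should also work (a sketch based on the behaviour above; my_other_proc is a made-up name):

# append the output to the same file
2> sp_helptext my_other_proc; >> /tmp/my_proc.txt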

3. Functions and aliases
You can define aliases and functions in your ~/.sqshrc file for code that you run frequently. Some of mine are shown below. (Visit my GitHub dotfiles repository to see my full .sqshrc.)

\alias h='\history'

# shortcut for select * from
\func -x sf
    \if [ $# -eq 0 ]
        \echo 'usage: sf "[table [where ...]]"'
        \return 1
    \fi
    select * from $*; | less -F
\done

# count rows in a table
\func -x count
    \if [ $# -eq 0 ]
        \echo 'usage: count "[table [where ...]]"'
        \return 1
    \fi
    select count(*) from $*;
\done

You can invoke them like this:

# select * from data table
1> sf "data where date='20140306'"

# count the rows in the employee table
2> count employee

# list aliases
3> \alias

4. History and reverse search
You can rerun a previous command by using the \history command or by invoking reverse search with Ctrl+r:

1> \history
(1) sp_who
(2) select count(*) from data
(3) select top 10 * from data

# invoke the second command from history
2> !2

# invoke the previous command
3> !!

# reverse search
4> <Ctrl+r>
(reverse-i-search)`sp': sp_who
4> sp_who

5. Customisable prompt
The default prompt is ${lineno}>, but it can be customised to include your username and database, and it even supports colours. It would be nice if there were a way to change the colour based on which database you were connected to (for example, red for a production database), but I haven't been able to figure out if this is possible yet. Here is my prompt, set in my ~/.sqshrc:

\set prompt_color='{1;33}' # yellow
\set text_color='{0;37}'   # white
\set prompt='${prompt_color}[$histnum][$username@$DSQUERY.$database] $lineno >$text_color '

6. Different result display styles
sqsh supports a number of different output styles which you can switch to on a per-command basis using the -m flag. The ones I frequently use are csv, html and vert (vertical). Here is an example:

1> select * from employee; -m csv
123,"Joe","Bloggs"

2> select * from employee; -m vert
id:        123
firstName: Joe
lastName:  Bloggs
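
If you use one style most of the time, you should be able to make it the default by setting the style variable in your ~/.sqshrc (a sketch based on sqsh's variable support; check the man page for your version):

# in ~/.sqshrc: make csv the default display style
\set style=csv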

7. For-loops
A for-loop allows you to iterate over a range of values and execute some code. For example, if you want to delete data, in batches, over a range of dates, you can use a for-loop like this:

\for i in 1 2 3 4 5
    \loop -e "delete from data where date = '2014020$i';"
    \echo "Deleted 2014020$i"
\done

8. Backgrounding long-running commands
If you have a long-running command, you can run it in the background by putting an & at the end of the command. You can then continue running other commands, whilst this one runs in the background. You will see a message when the background command completes and you can use \show to see the results. Here is an example:

# run a command in the background
1> select * from data; &
Job #1 running [6266]

Job #1 complete (output pending)

# show the results of the backgrounded command
3> \show 1

Further information:
You can download sqsh here and then read the man page for more information.
You can take a look at my .sqshrc in my GitHub dotfiles repository.

Sunday, February 23, 2014

Using "lockfile" to Prevent Multiple Instances of a Script from Running

This post describes how you can ensure that only one instance of a script is running at a time, which is useful if your script:

  • uses significant CPU or IO and running multiple instances at the same time would risk overloading the system, or
  • writes to a file or other shared resource and running multiple instances at the same time would risk corrupting the resource

In order to prevent multiple instances of a script from running, your script must first acquire a "lock" and hold on to that lock until the script completes. If the script cannot acquire the lock, it must wait until the lock becomes available. So, how do you acquire a lock? There are different ways, but the simplest is to use the lockfile command to create a "semaphore file". This is shown in the snippet below:

#!/bin/bash
set -e

# waits until a lock is acquired and
# deletes the lock on exit.
# prevents multiple instances of the script from running
acquire_lock() {
    lock_file=/var/tmp/foo.lock
    echo "Acquiring lock ${lock_file}..."
    lockfile "${lock_file}"
    trap "rm -f ${lock_file} && echo Released lock ${lock_file}" INT TERM EXIT
    echo "Acquired lock"
}

acquire_lock
# do stuff

The acquire_lock function first invokes the lockfile command in order to create a file. If lockfile cannot create the file, it will keep trying forever until it does. You can use the -r option if you only want to retry a certain number of times. Once the file has been created, we need to ensure that it is deleted once the script completes or is terminated. This is done using the trap command, which deletes the file when the script completes or when the shell receives an interrupt or terminate signal. I also like to use set -e in all my scripts, which makes the script exit if any command fails. In this case, if lockfile fails, the script will exit and the trap will not be set.
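
For example, here is how the -r option could be used so that the script fails fast instead of waiting forever (the retry count of 5 is just an illustration):

# give up after 5 retries instead of waiting forever
lockfile -r 5 "${lock_file}" || { echo "Could not acquire lock"; exit 1; }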

lockfile can be used in other ways as well. For example, instead of preventing multiple instances of the entire script from running, you may want to use a more granular approach and use locks only around those parts of your script which are not safe to run concurrently.
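
Here is a minimal sketch of that idea (the file names and helper functions are hypothetical): only the code that writes to the shared file takes the lock, and everything else runs without it.

#!/bin/bash
set -e

# safe to run concurrently: no lock needed
generate_report() { echo "$(date): report for $1"; }

# not safe to run concurrently: serialise writes to the shared file
append_to_shared_log() {
    local lock_file=/var/tmp/shared_log.lock
    lockfile "${lock_file}"
    echo "$1" >> /var/tmp/shared.log
    rm -f "${lock_file}"
}

append_to_shared_log "$(generate_report foo)"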

Note that if you cannot use lockfile, there are other alternatives such as using mkdir or flock as described in BashFAQ/045.
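
For example, here is a minimal flock-based sketch (assuming the flock utility from util-linux is installed):

#!/bin/bash
set -e

# open a file descriptor on the lock file and take an exclusive lock;
# -n makes flock fail immediately if another instance already holds it
exec 200>/var/tmp/foo.lock
flock -n 200 || { echo "Another instance is already running"; exit 1; }

# do stuff - the lock is released when the script exits and the fd is closed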

Other posts you might like:
Shell Scripting - Best Practices
Retrying Commands in Shell Scripts
Executing a Shell Command with a Timeout

Saturday, February 08, 2014

Retrying Commands in Shell Scripts

There are many cases in which you may wish to retry a failed command a certain number of times. Examples are database failures, network communication failures or file IO problems.

The snippet below shows a simple method of retrying commands in bash:

#!/bin/bash

MAX_ATTEMPTS=5
attempt_num=1
# "command" is a placeholder for the command that you want to retry
until command || (( attempt_num == MAX_ATTEMPTS ))
do
    echo "Attempt $attempt_num failed! Trying again in $attempt_num seconds..."
    sleep $(( attempt_num++ ))
done

In this example, the command is attempted a maximum of five times and the interval between attempts is increased incrementally whenever the command fails. The time between the first and second attempt is 1 second, that between the second and third is 2 seconds and so on. If you want, you can change this to a constant interval or random exponential backoff instead.
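
For example, just the sleep line could be rewritten like this for exponential backoff with a little random jitter (a sketch only; the echo message above it would need updating to match):

# wait 2, 4, 8, ... seconds, plus up to 2 random seconds of jitter
sleep $(( (2 ** attempt_num++) + (RANDOM % 3) ))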

I have created a useful retry function (shown below) which allows me to retry commands from different places in my script without duplicating the retry logic. This function returns a non-zero exit code when all attempts have been exhausted.

#!/bin/bash

# Retries a command on failure.
# $1 - the max number of attempts
# $2... - the command to run
retry() {
    local -r -i max_attempts="$1"; shift
    local -r -a cmd=("$@")   # store the command as an array to preserve quoted arguments
    local -i attempt_num=1

    until "${cmd[@]}"
    do
        if (( attempt_num == max_attempts ))
        then
            echo "Attempt $attempt_num failed and there are no more attempts left!"
            return 1
        else
            echo "Attempt $attempt_num failed! Trying again in $attempt_num seconds..."
            sleep $(( attempt_num++ ))
        fi
    done
}

# example usage:
retry 5 ls -ltr foo

Related Posts:
Executing a Shell Command with a Timeout
Retrying Operations in Java

Saturday, January 25, 2014

Coursera class: Principles of Reactive Programming

A few weeks ago, I completed the "Principles of Reactive Programming" class led by Martin Odersky, Erik Meijer and Roland Kuhn. This Coursera class started in November 2013 and ran for around 7 weeks. It was a great class in which we learnt how to write reactive programs in Scala. The course mainly covered Futures, Promises, Observables, Rx streams and Akka Actors. It was quite challenging, but the assignments were very enjoyable: we wrote a virus simulation and a Wikipedia suggestions app!

Related posts:
Coursera class: Functional Programming Principles in Scala
Stanford's Online Courses: ml-class, ai-class and db-class