This simple awk cheat sheet is far from complete and started out as a note to myself. Maybe it also helps you if you only need awk from time to time and can’t remember all of its usage. I use only one-liners on this page, although awk also supports larger scripts, which can be run with awk -f SCRIPT.

Please note that this page is not finished yet.
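
For completeness, here is a minimal sketch of the awk -f form. The temporary script file and the three inline sample lines are assumptions made only for this illustration:

```shell
# Write a tiny awk program to a temporary file (the file name is arbitrary).
script=$(mktemp)
cat > "$script" <<'EOF'
# Sum the second field of every line that contains "gold".
/gold/ { sum += $2 }
END    { print sum }
EOF

# Run the script against a few inline sample lines instead of a real file.
result=$(printf 'gold 1 1986 USA American Eagle\nsilver 10 1981 USA ingot\ngold 0.5 1981 RSA Krugerrand\n' | awk -f "$script")
echo "$result"
rm -f "$script"
```

The program text is the same as what you would put between the single quotes of a one-liner.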

Working material
For playing around with awk and reproducing the examples mentioned below you will need some material to work with. In some cases we will use your local /etc/passwd file, otherwise we will have a look at these two text files below.

File called “coins” (stolen from http://www.vectorsite.net/tsawk_1.html#m2):

gold 1 1986 USA American Eagle
gold 1 1908 Austria-Hungary Franz Josef 100 Korona
silver 10 1981 USA ingot
gold 1 1984 Switzerland ingot
gold 1 1979 RSA Krugerrand
gold 0.5 1981 RSA Krugerrand
gold 0.1 1986 PRC Panda
silver 1 1986 USA Liberty dollar
gold 0.25 1986 USA Liberty 5-dollar piece
silver 0.5 1986 USA Liberty 50-cent piece
silver 1 1987 USA Constitution dollar
gold 0.25 1987 USA Constitution 5-dollar piece
gold 1 1988 Canada Maple Leaf

File called “ldif”:

dn: cn=Robert Smith,ou=people,dc=example,dc=com
objectclass: inetOrgPerson
cn: Robert Smith
cn: Robert J Smith
cn: bob smith
sn: smith
uid: rjsmith
userpassword: rJsmitH
carlicense: HISCAR 123
homephone: 555-111-2222
mail: r.smith@example.com
alias: rsmith@example.com
alias: bob.smith@example.com
description: nice hair
ou: Human Resources

dn: cn=John Doe,ou=people,dc=example,dc=com
objectclass: inetOrgPerson
cn: John Doe
cn: John Walker Doe
cn: Johnny
sn: Doe
uid: jdoe
userpassword: topsecret
carlicense: AKAHH 123
homephone: 123-458-362
mail: j.doe@example.com
alias: jdoe@example.com
alias: john.doe@example.com
description: cool guy
ou: Sales

Searching with awk
Search the input for a specific string:

awk '/gold/' coins
gold 1 1986 USA American Eagle
gold 1 1908 Austria-Hungary Franz Josef 100 Korona
gold 1 1984 Switzerland ingot
gold 1 1979 RSA Krugerrand
gold 0.5 1981 RSA Krugerrand
gold 0.1 1986 PRC Panda
gold 0.25 1986 USA Liberty 5-dollar piece
gold 0.25 1987 USA Constitution 5-dollar piece
gold 1 1988 Canada Maple Leaf

Match on lines beginning with a specific word:

awk '/^mail:/' ldif
mail: r.smith@example.com
mail: j.doe@example.com

Find lines beginning with a specific word and containing a predefined string:

awk '/^mail:.*doe.*/' ldif
mail: j.doe@example.com
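
The same kind of match can probably be written more readably by combining two patterns with &&. A minimal sketch on a few inline lines taken from the ldif file:

```shell
# Three sample lines; only one starts with "mail:" AND contains "doe".
sample='mail: r.smith@example.com
mail: j.doe@example.com
alias: john.doe@example.com'

# Both patterns must match for the line to be printed.
result=$(printf '%s\n' "$sample" | awk '/^mail:/ && /doe/')
echo "$result"
```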

Print selected fields only when a condition holds, e.g. for coins from later than 1985:

awk '{if ($3 > 1985) print "Type: " $1 " Year: " $3}' coins
Type: gold Year: 1986
Type: gold Year: 1986
Type: silver Year: 1986
Type: gold Year: 1986
Type: silver Year: 1986
Type: silver Year: 1987
Type: gold Year: 1987
Type: gold Year: 1988
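
A condition can also be used directly as the pattern, which saves the if. A small sketch on three inline sample lines from the coins file:

```shell
# The comparison in front of the braces selects the lines;
# the action runs only for lines where it is true.
result=$(printf 'gold 1 1986 USA American Eagle\nsilver 10 1981 USA ingot\ngold 1 1988 Canada Maple Leaf\n' |
  awk '$3 > 1985 { print "Type: " $1 " Year: " $3 }')
echo "$result"
```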

Match only a specific field, e.g. show only the coins which were made in the USA:

awk '$4 ~ /USA/' coins
gold 1 1986 USA American Eagle
silver 10 1981 USA ingot
silver 1 1986 USA Liberty dollar
gold 0.25 1986 USA Liberty 5-dollar piece
silver 0.5 1986 USA Liberty 50-cent piece
silver 1 1987 USA Constitution dollar
gold 0.25 1987 USA Constitution 5-dollar piece
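
Note that ~ matches a regular expression anywhere in the field, so it would also hit a field that merely contains "USA". If you want an exact match, a string comparison with == should do. In this sketch the "USAX" line is made up purely to show the difference:

```shell
# Sample data; the second line is a hypothetical entry, not part of the coins file.
sample='gold 1 1986 USA American Eagle
gold 1 1990 USAX made-up entry
gold 1 1988 Canada Maple Leaf'

# Regex match: counts every line whose fourth field contains "USA".
regex_hits=$(printf '%s\n' "$sample" | awk '$4 ~ /USA/ { n++ } END { print n + 0 }')

# String comparison: counts only lines whose fourth field is exactly "USA".
exact_hits=$(printf '%s\n' "$sample" | awk '$4 == "USA" { n++ } END { print n + 0 }')

echo "regex: $regex_hits, exact: $exact_hits"
```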

Print only some fields of the matching lines. The matching itself could also be done with grep or sed, but awk makes it easy to pick individual fields:

awk '/silver/ {print "Type:", $1, "Country:", $4}' coins
Type: silver Country: USA
Type: silver Country: USA
Type: silver Country: USA
Type: silver Country: USA

Calculations with awk
It is possible to count the lines of the text input:

awk 'END {print NR, "coins"}' coins
13 coins

Simple addition over a specific field: first match the lines containing “gold”, then sum up the values of their second field.

awk '/gold/ {sum += $2} END {print sum}' coins
6.1

Another approach is matching a specific field and counting how often it appears:

awk 'BEGIN { count=0;} $4 ~ /USA/ { count ++; } END {print "Number of coins from the USA =",count;}' coins
Number of coins from the USA = 7
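
If you want a count for every country at once, an associative array indexed by the field value is one way to do it. A minimal sketch on five inline sample lines; the output is piped through sort because awk does not guarantee any order for the for (c in count) loop:

```shell
# count[$4]++ creates one counter per distinct value of the fourth field.
result=$(printf 'gold 1 1986 USA American Eagle\ngold 1 1979 RSA Krugerrand\nsilver 10 1981 USA ingot\ngold 0.5 1981 RSA Krugerrand\ngold 1 1988 Canada Maple Leaf\n' |
  awk '{ count[$4]++ } END { for (c in count) print c, count[c] }' | LC_ALL=C sort)
echo "$result"
```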

Display the total number of fields in the whole file:

awk '{total +=NF }; END {print total}' coins
78

You get a different result for counting the fields of a file when you define a field separator. First count the fields in your /etc/passwd file the “normal” way (split on whitespace), then use:

awk -F: '{total +=NF }; END {print total}' /etc/passwd
252
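
To see the effect of the field separator on a single line, here is a small sketch. The passwd-style line is a typical example and not taken from any real system:

```shell
# One sample line in /etc/passwd format (assumed for illustration).
line='root:x:0:0:root:/root:/bin/bash'

# Default separator (whitespace): the whole line is a single field.
default_nf=$(printf '%s\n' "$line" | awk '{ print NF }')

# Colon as separator: the line splits into its seven fields.
colon_nf=$(printf '%s\n' "$line" | awk -F: '{ print NF }')

echo "default: $default_nf, with -F: $colon_nf"
```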

Another way to count the matches (here: count the number of users who have /bin/bash as their default shell):

awk -F: '$NF ~ /\/bin\/bash/ { n++ }; END { print n }' /etc/passwd
3
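
One detail to keep in mind: if no line matches, n is never set and print n outputs an empty line. Adding 0 forces a numeric result. A sketch with a single sample line that does not use bash:

```shell
# No user in this sample has /bin/bash, so n stays unset;
# "n + 0" turns the unset variable into the number 0.
result=$(printf 'root:x:0:0:root:/root:/bin/sh\n' |
  awk -F: '$NF == "/bin/bash" { n++ } END { print n + 0 }')
echo "$result"
```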

Print the length of the longest input line:

awk ' { if (length($0) > max) max = length($0) } END {print max}' ldif
47

Print all lines which are longer than 40 characters:

awk 'length($0) > 40' ldif
dn: cn=Robert Smith,ou=people,dc=example,dc=com
dn: cn=John Doe,ou=people,dc=example,dc=com

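Print five (not so) random numbers with awk. Without a call to srand(), many awk implementations (gawk, for example) produce the same sequence on every run: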

awk 'BEGIN { for (i = 1; i <= 5; i++) print int(101 * rand()) }'
24
29
85
15
59
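
To get different numbers on each run, the generator can be seeded with srand(), which uses the time of day when called without an argument. A minimal sketch; because the seed is based on the clock, two runs within the same second may still print the same numbers:

```shell
# srand() seeds the generator; rand() returns a value in [0, 1),
# so int(101 * rand()) yields an integer between 0 and 100.
result=$(awk 'BEGIN { srand(); for (i = 1; i <= 5; i++) print int(101 * rand()) }')
echo "$result"
```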

Variables in awk
How to store a specific value from the input in a variable and print it once per matching line:

awk '/^mail:/ { mail = $2; print mail };' ldif
r.smith@example.com
j.doe@example.com

Interaction of awk with the shell
Using the values of shell variables in awk:

valentin@mintbox ~/awk $ echo $TERM
xterm
valentin@mintbox ~/awk $ awk -v myvar="$TERM" 'BEGIN {print "The value of the selected shell variable is: " myvar}'
The value of the selected shell variable is: xterm
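
Exported variables can also be read inside awk through the ENVIRON array, without passing them via -v. A small sketch; GREETING is a hypothetical variable defined only for this example:

```shell
# Export a variable so that awk can see it in its environment.
export GREETING='hello world'   # hypothetical variable name

# ENVIRON is indexed by the name of the environment variable.
result=$(awk 'BEGIN { print "The value is: " ENVIRON["GREETING"] }')
echo "$result"
```

Unlike -v, which interprets backslash escape sequences in the assigned value, ENVIRON hands the value over unchanged.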

Creating shell variables with awk:

valentin@mintbox ~/awk $ eval $(awk 'BEGIN { print "myvar=great" }')
valentin@mintbox ~/awk $ echo $myvar
great
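
When only a single value is needed, plain command substitution is probably the simpler choice, since the shell does not have to evaluate awk's output as code:

```shell
# awk prints just the value; the shell does the assignment.
myvar=$(awk 'BEGIN { print "great" }')
echo "$myvar"
```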

Other text operations with awk
Strip the first column off (note that each output line ends with a trailing space):

awk '{for (i=2; i<=NF; i++) printf "%s ", $i; printf "\n"; }' coins
1 1986 USA American Eagle
1 1908 Austria-Hungary Franz Josef 100 Korona
10 1981 USA ingot
1 1984 Switzerland ingot
1 1979 RSA Krugerrand
0.5 1981 RSA Krugerrand
0.1 1986 PRC Panda
1 1986 USA Liberty dollar
0.25 1986 USA Liberty 5-dollar piece
0.5 1986 USA Liberty 50-cent piece
1 1987 USA Constitution dollar
0.25 1987 USA Constitution 5-dollar piece
1 1988 Canada Maple Leaf
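
If the trailing space is a problem, one possible variant is to build the line in a variable and only put a space between fields. A sketch on two inline sample lines:

```shell
# Start with the second field, then append each further field
# preceded by a single space, so nothing trails at the end.
result=$(printf 'gold 1 1986 USA American Eagle\nsilver 10 1981 USA ingot\n' |
  awk '{ out = $2; for (i = 3; i <= NF; i++) out = out " " $i; print out }')
echo "$result"
```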

Concatenate fields/strings. awk has no concatenation operator; expressions written next to each other are simply joined:

awk '{print $1 $4}' coins
goldUSA
goldAustria-Hungary
silverUSA
goldSwitzerland
goldRSA
goldRSA
goldPRC
silverUSA
goldUSA
silverUSA
silverUSA
goldUSA
goldCanada
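
Since strings and fields are joined just by writing them next to each other, a separator can be placed in between as a string literal. A minimal sketch on one inline sample line:

```shell
# "-" is an ordinary string that ends up between the two fields.
result=$(printf 'gold 1 1986 USA American Eagle\n' | awk '{ print $1 "-" $4 }')
echo "$result"
```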

Remove empty lines from the input:

awk 'NF > 0' ldif
 -- output was removed --

Some real life examples with awk
Print only selected attributes of each entry in the ldif:

awk '/^dn:/ {print $0}; /^uid:/ {print $0}; /^mail:/ {print $0; print " "};' ldif
dn: cn=Robert Smith,ou=people,dc=example,dc=com
uid: rjsmith
mail: r.smith@example.com
 
dn: cn=John Doe,ou=people,dc=example,dc=com
uid: jdoe
mail: j.doe@example.com
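
Because the entries in the ldif are separated by empty lines, awk's paragraph mode (RS set to the empty string) might be handy here: each entry becomes one record, and with FS set to a newline each attribute line becomes one field. A sketch on a shortened inline copy of the two entries:

```shell
sample='dn: cn=Robert Smith,ou=people,dc=example,dc=com
uid: rjsmith
mail: r.smith@example.com

dn: cn=John Doe,ou=people,dc=example,dc=com
uid: jdoe
mail: j.doe@example.com'

# RS = "" reads blank-line separated blocks as records;
# FS = "\n" makes every line of a block a separate field.
result=$(printf '%s\n' "$sample" | awk 'BEGIN { RS = ""; FS = "\n" }
  {
    uid = ""; mail = ""
    for (i = 1; i <= NF; i++) {
      if ($i ~ /^uid: /)  uid  = substr($i, 6)   # text after "uid: "
      if ($i ~ /^mail: /) mail = substr($i, 7)   # text after "mail: "
    }
    print uid, mail
  }')
echo "$result"
```

This prints one line per entry, which is convenient for further processing in the shell.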

Modifying the ldif and creating output for ldapmodify (here we will change the e-mail address with awk). The entries are separated by an empty line, because a line starting with a space counts as a continuation line in LDIF:

awk '/^dn:/ {print $0; print "changetype: modify"; print "replace: mail"}; /^mail:/ {sub(/example\.com/, "domain.tld"); print $0; print ""};' ldif
dn: cn=Robert Smith,ou=people,dc=example,dc=com
changetype: modify
replace: mail
mail: r.smith@domain.tld

dn: cn=John Doe,ou=people,dc=example,dc=com
changetype: modify
replace: mail
mail: j.doe@domain.tld

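Calculating the total size (in bytes) of all files in the current directory which were last modified in April. This relies on the column layout of ls -l in the C locale, where the fifth field is the size and the sixth the month: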

LC_ALL=C ls -l |awk '$6 == "Apr" {sum += $5 } END {print sum}'
2591

Helpful sources
http://www.vectorsite.net/tsawk_1.html#m2
http://www.thegeekstuff.com/2010/01/awk-introduction-tutorial-7-awk-print-examples
http://www.hcs.harvard.edu/~dholland/computers/awk.html
http://www.cyberciti.biz/faq/linux-unix-appleosx-bsd-bash-passing-variables-to-awk/
http://www.gnu.org/software/gawk/manual/gawk.html#Foreword