AWK is not just a text tool — it is a complete programming language designed specifically for processing structured text. Unlike grep (which filters) or sed (which transforms), AWK processes data field by field, record by record, with full arithmetic, string functions, arrays, and control flow. Understanding AWK deeply is what separates advanced shell scripters from the rest.
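The filter/transform/compute distinction can be seen on one tiny input (the names and numbers below are made up for illustration):

```shell
# Three tools, same two-line input (hypothetical data)
printf 'alice 42\nbob 7\n' | grep 'alice'                            # filters:  alice 42
printf 'alice 42\nbob 7\n' | sed 's/alice/ALICE/'                    # rewrites: ALICE 42 / bob 7
printf 'alice 42\nbob 7\n' | awk '{ sum += $2 } END { print sum }'   # computes: 49
```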
1. AWK execution model — pattern { action }
# AWK processes input one record (line) at a time
# Each record is split into fields: $1, $2, ... $NF
# $0 = entire record, NR = record number, NF = number of fields
# FS = field separator (default: whitespace)
# ── Structure: [pattern] { action } ──────────────────────
awk '{ print $1 }' # action only (runs for every line)
awk '/ERROR/ { print $0 }' # pattern + action (regex match)
awk 'NR == 1 { print "Header:", $0 }' # expression pattern
awk '/START/,/END/ { print }' # range pattern
# ── BEGIN and END blocks ──────────────────────────────────
awk '
BEGIN {
    FS = ","                  # set field separator before any input
    print "Processing CSV..."
    count = 0
}
NR > 1 {                      # skip header row
    count++
    total += $3               # sum column 3
}
END {
    print "Records:", count
    print "Total:", total
    print "Average:", (count > 0 ? total/count : 0)
}' data.csv
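The same BEGIN/body/END shape can be exercised without a real data.csv by feeding a here-document (column names and values below are invented):

```shell
awk '
BEGIN  { FS = "," }
NR > 1 { count++; total += $3 }
END    { printf "avg=%.1f\n", (count > 0 ? total/count : 0) }
' <<'EOF'
host,region,cpu
web-1,us,40
db-1,eu,90
EOF
# avg=65.0
```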
# ── Multiple rules match the same line ───────────────────
awk '
/ERROR/ { errors++ }
/WARN/ { warns++ }
END { printf "Errors: %d, Warnings: %d\n", errors, warns }
' app.log
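Piping a few fabricated log lines through a two-rule program of this shape confirms each rule fires independently on every record:

```shell
printf 'ERROR disk full\nWARN slow query\nERROR timeout\n' |
    awk '/ERROR/ { e++ } /WARN/ { w++ } END { printf "Errors: %d, Warnings: %d\n", e, w }'
# Errors: 2, Warnings: 1
```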
2. Built-in variables — the AWK environment
# ── Input variables ───────────────────────────────────────
# NR current record number (total lines read)
# FNR record number within current file
# NF number of fields in current record
# $0 full current record
# $1..$NF individual fields
# FILENAME current input filename
# ── Separator variables ───────────────────────────────────
# FS field separator (default: whitespace; set in BEGIN)
# OFS output field separator (default: space)
# RS record separator (default: newline)
# ORS output record separator (default: newline)
# ── Examples ──────────────────────────────────────────────
awk '{ print NR, NF, $1, $NF }' file.txt
# Process CSV — comma separator
awk -F',' '{ print $2 }' data.csv
awk 'BEGIN{FS=","} { print $2 }' data.csv # same
# Output with custom separator
awk 'BEGIN{FS=":"; OFS="\t"} { print $1, $3, $7 }' /etc/passwd
# Multi-character delimiter
awk -F'  +' '{ print $1 }' # 2+ spaces as delimiter
awk -F'[,;|]' '{ print $2 }' # regex delimiter
# Process records separated by blank lines (paragraphs)
awk 'BEGIN{RS=""} { print NR, $0 }' file.txt
# $NF — last field (very useful)
ls -la | awk 'NR>1 { print $NF }' # filename only (breaks on names with spaces)
df -h | awk 'NR>1 { print $NF, $5 }' # mount point + use%
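The NR/FNR distinction only shows up with multiple input files; a sketch using two throwaway files under /tmp (paths are arbitrary):

```shell
printf 'a\nb\n' > /tmp/awk_f1
printf 'c\n'    > /tmp/awk_f2
awk '{ print FILENAME, NR, FNR }' /tmp/awk_f1 /tmp/awk_f2
# /tmp/awk_f1 1 1    <- NR keeps counting across files
# /tmp/awk_f1 2 2
# /tmp/awk_f2 3 1    <- FNR resets at each new file
```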
3. String functions — the AWK stdlib
# ── String functions ──────────────────────────────────────
awk '{ print length($0) }' # line length
awk '{ print length($1) }' # field 1 length
awk '{ print toupper($1) }' # uppercase
awk '{ print tolower($1) }' # lowercase
awk '{ print substr($0, 1, 10) }' # first 10 chars
awk '{ print substr($1, 5) }' # from position 5 to end
awk '{ print index($0, "ERROR") }' # position of substring (0=not found)
awk '{ if (split($0, arr, ":") > 2) print arr[1], arr[3] }'
# split("str", array, sep) → fills array, returns count
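One concrete split() call, on a made-up date string, shows both the return value and the filled array:

```shell
echo '2024-01-15' | awk '{ n = split($0, d, "-"); print n, d[1], d[3] }'
# 3 2024 15    <- n = piece count, d[1..3] = the pieces
```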
# ── sub and gsub — find and replace ──────────────────────
awk '{ sub(/ERROR/, "WARN"); print }' # replace first match
awk '{ gsub(/ERROR/, "WARN"); print }' # replace all matches
awk '{ gsub(/ +/, "_"); print }' # spaces → underscores
awk '{ gsub(/[^0-9]/, ""); print }' # keep digits only
# ── printf — formatted output ─────────────────────────────
awk '{ printf "%-20s %8.2f\n", $1, $2 }' # aligned columns
awk '{ printf "%05d %s\n", NR, $0 }' # zero-padded line numbers
# ── match — regex matching with position ─────────────────
awk '{ if (match($0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/))
print substr($0, RSTART, RLENGTH) }' # extract IP
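Two details worth verifying by hand: gsub() returns the number of replacements it made, and RSTART/RLENGTH are set only after a successful match() (the inputs below are invented):

```shell
echo 'a,b,,c' | awk '{ n = gsub(/,/, ";"); print n, $0 }'
# 3 a;b;;c
echo 'src=10.0.0.1 ok' |
    awk 'match($0, /[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+/) { print substr($0, RSTART, RLENGTH) }'
# 10.0.0.1
```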
4. AWK arithmetic and control flow
# ── Arithmetic ────────────────────────────────────────────
awk '{ sum += $1; count++ } END { print sum/count }'
awk 'BEGIN { print 355/113 }' # pi approximation: 3.14159
awk '{ print int($1 * 1.21) }' # integer truncation
awk 'BEGIN { print sqrt(2), log(10), sin(3.14) }'
# ── Conditionals ──────────────────────────────────────────
awk '$3 > 80 { print "HIGH:", $0 }'
awk '{ if ($2 == "ERROR") print "error:", $3; else print "ok:", $3 }'
awk '{ status = ($3 > 90) ? "CRITICAL" : ($3 > 70 ? "WARN" : "OK"); print status, $1 }'
# ── Loops ─────────────────────────────────────────────────
awk '{ for (i=1; i<=NF; i++) printf "%s ", $i; print "" }'
awk 'BEGIN { for (i=1; i<=10; i++) print i*i }'
awk '{ i=1; while (i <= NF) { print $i; i++ } }'
# ── next — skip to next record ────────────────────────────
awk '/^#/ { next } { print }' # skip comment lines
awk 'NF == 0 { next } { print }' # skip empty lines
# ── exit — stop processing ────────────────────────────────
awk 'NR==5 { exit } { print }' # print first 4 lines only
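The field loop composes nicely with OFS/ORS; for instance, a sketch that prints a record's fields in reverse order (the input words are arbitrary):

```shell
echo 'alpha beta gamma' |
    awk '{ for (i = NF; i >= 1; i--) printf "%s%s", $i, (i > 1 ? OFS : ORS) }'
# gamma beta alpha
```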
Terminal output
vriddh@prod-01:~/scripts$ awk -F',' 'BEGIN{printf "%-15s %5s %s\n","Name","CPU","Status"} NR>1{status=$3>80?"HIGH":"OK"; printf "%-15s %4s%% %s\n",$1,$3,status}' servers.csv
Name              CPU Status
prod-web-01       38% OK
prod-db-01        91% HIGH
prod-cache-01     23% OK
vriddh@prod-01:~/scripts$ awk -F',' 'NR>1{sum+=$3; n++} END{printf "Avg CPU: %.1f%%\n", sum/n}' servers.csv
Avg CPU: 50.7%
✔ AWK fundamentals — Every AWK program is zero or more pattern { action } rules. Use BEGIN to set FS and print headers; use END for summaries. $0 = full line, $1–$NF = fields, $NF = last field. When the program lives in a script file, set FS in BEGIN rather than relying on a -F flag on the command line. Use printf for aligned output — plain print gives too little control for reports.