CSV and delimited files are everywhere in DevOps: inventory exports, billing data, database dumps, configuration lists. Bash handles them well once you know the right tools. awk is the most powerful for column-aware processing, while IFS=',' read -r is the safest for line-by-line parsing with correct field splitting.
1
Reading CSV with while read and IFS
BASH
# CSV file: servers.csv
# name,host,port,role
# db-primary,10.0.1.10,3306,primary
# db-replica,10.0.1.11,3306,replica
# cache-01,10.0.1.20,6379,cache
# ── Read CSV, skip header ─────────────────────────────────
while IFS=',' read -r name host port role; do
[[ "${name}" == "name" ]] && continue # skip header row
[[ "${name}" == "#"* ]] && continue # skip comments
[[ -z "${name}" ]] && continue # skip blank lines
printf " Checking %-15s %-15s %-6s %s\n" \
"${name}" "${host}" "${port}" "${role}"
nc -z -w 3 "${host}" "${port}" 2>/dev/null \
&& echo " ✔ UP" || echo " ✘ DOWN"
done < servers.csv
# ── Skip header with tail ─────────────────────────────────
tail -n +2 servers.csv | while IFS=',' read -r name host port role; do
echo " Processing: ${name}"
done
# ── TSV (tab-delimited) ───────────────────────────────────
while IFS=$'\t' read -r col1 col2 col3; do
echo " ${col1} | ${col2} | ${col3}"
done < data.tsv
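One caveat with the tail -n +2 | while pattern above: the pipe runs the loop in a subshell, so any variables it sets vanish when the loop ends. A minimal sketch of the pitfall and the process-substitution fix, using a hypothetical copy of servers.csv under /tmp:

```shell
#!/usr/bin/env bash
# Recreate the sample file from above (hypothetical path).
cat > /tmp/servers.csv <<'EOF'
name,host,port,role
db-primary,10.0.1.10,3306,primary
db-replica,10.0.1.11,3306,replica
cache-01,10.0.1.20,6379,cache
EOF

# Piped loop: runs in a subshell, so the counter is lost afterwards.
count=0
tail -n +2 /tmp/servers.csv | while IFS=',' read -r name _; do
  count=$((count + 1))
done
echo "piped: ${count}"        # piped: 0

# Process substitution: the loop runs in the current shell.
count=0
while IFS=',' read -r name _; do
  count=$((count + 1))
done < <(tail -n +2 /tmp/servers.csv)
echo "substituted: ${count}"  # substituted: 3
```

Use the < <(...) form whenever the loop body needs to accumulate results (counters, arrays) that must survive the loop.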
2
awk for column-based CSV processing
BASH
# ── awk with field separator ──────────────────────────────
awk -F',' '{print $1, $3}' servers.csv # name and port
awk -F',' 'NR>1 {print $2}' servers.csv # host column, skip header
# ── Filter rows ───────────────────────────────────────────
awk -F',' '$4=="primary"' servers.csv # only primary servers
awk -F',' '$3>3000' servers.csv # port > 3000
# ── Sum a column ──────────────────────────────────────────
awk -F',' 'NR>1 {sum+=$3} END {print "Total:", sum}' data.csv
# ── Report with header ────────────────────────────────────
awk -F',' '
BEGIN {
printf "%-20s %-15s %s\n", "Name", "Host", "Port"
printf "%-20s %-15s %s\n", "----", "----", "----"
}
NR>1 {
printf "%-20s %-15s %s\n", $1, $2, $3
}
END {
printf "\nTotal: %d servers\n", NR-1
}' servers.csv
# ── Group by and count ────────────────────────────────────
awk -F',' 'NR>1 {count[$4]++} END {for (r in count) print r, count[r]}' \
servers.csv
3
Generate CSV output from scripts
BASH
#!/usr/bin/env bash
# disk_report.sh — Generate CSV report of disk usage
OUTPUT="/tmp/disk_report_$(date +%Y%m%d).csv"
# Write header
echo "hostname,filesystem,size,used,available,use_pct,mounted_on" > "${OUTPUT}"
HOSTNAME=$(hostname)
# Write data rows
df -h --output=source,size,used,avail,pcent,target | grep -v tmpfs \
  | tail -n +2 \
  | while read -r fs size used avail pct mount; do
      # default IFS handles df's space-padded columns
      printf "%s,%s,%s,%s,%s,%s,%s\n" \
        "${HOSTNAME}" "${fs}" "${size}" "${used}" \
        "${avail}" "${pct}" "${mount}"
    done >> "${OUTPUT}"
echo "Report saved: ${OUTPUT}"
# Append data from multiple servers
for srv in web-01 web-02 db-01; do
ssh "${srv}" "df -h --output=source,size,used,avail,pcent,target | grep -v tmpfs | tail -n +2" \
| awk -v host="${srv}" '{print host","$1","$2","$3","$4","$5","$6}' \
>> "${OUTPUT}"
done
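The same awk patterns from tip 2 work on the combined report. For example, flagging filesystems above a usage threshold (the file below mirrors the sample output further down; the path and the 90% threshold are illustrative):

```shell
#!/usr/bin/env bash
# Sample combined report (hypothetical path, mirrors the report format above).
cat > /tmp/disk_report_sample.csv <<'EOF'
hostname,filesystem,size,used,available,use_pct,mounted_on
prod-01,/dev/sda1,50G,18G,30G,38%,/
prod-01,/dev/sdb1,200G,172G,18G,91%,/data
EOF

# Strip the % sign, compare numerically, print the offenders.
awk -F',' 'NR>1 {
  pct = $6
  sub(/%/, "", pct)        # "91%" -> "91"
  if (pct + 0 > 90)        # +0 forces a numeric comparison
    printf "  %-10s %-10s %s\n", $1, $7, $6
}' /tmp/disk_report_sample.csv
# prints only the /data row (91% > 90)
```

The pct + 0 trick matters: without it, awk may compare "91" and 90 as strings, which gives the wrong answer once percentages reach three digits.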
vriddh@prod-01:~/scripts$ awk -F',' 'NR>1 {count[$4]++} END {for (r in count) printf "  %-10s %d\n", r, count[r]}' servers.csv
  primary    1
  replica    1
  cache      1
vriddh@prod-01:~/scripts$ ./disk_report.sh
Report saved: /tmp/disk_report_20260501.csv
vriddh@prod-01:~/scripts$ head -3 /tmp/disk_report_20260501.csv
hostname,filesystem,size,used,available,use_pct,mounted_on
prod-01,/dev/sda1,50G,18G,30G,38%,/
prod-01,/dev/sdb1,200G,172G,18G,91%,/data
⚠ Quoted fields with commas — Standard CSV allows fields like "Smith, John",42. The IFS=',' pattern breaks on this. For production CSV with quoted fields, use Python's csv module or mlr (Miller), which handle RFC 4180 quoting properly. For simple CSV without quoted commas, bash + awk works perfectly.
✔ CSV processing rules — Always skip the header row with NR>1 in awk or tail -n +2. Skip blank lines and comments. Use IFS=',' with while read for simple parsing. Use awk -F',' for filtering, aggregation, and reporting. Quote CSV output fields that could contain commas. Use mlr or Python for complex RFC 4180 CSV.
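To make the quoted-field warning concrete, here is a sketch of the failure and of the Python csv fix driven from bash (the file path and its contents are made up for the demo):

```shell
#!/usr/bin/env bash
# A CSV with a quoted field containing a comma (hypothetical file).
cat > /tmp/people.csv <<'EOF'
name,age
"Smith, John",42
EOF

# Naive IFS=',' splitting cuts the quoted field at its embedded comma:
tail -n +2 /tmp/people.csv | while IFS=',' read -r name age; do
  echo "broken: name=${name} age=${age}"   # name is '"Smith', not 'Smith, John'
done

# Python's csv module parses RFC 4180 quoting correctly:
python3 - <<'EOF'
import csv
with open('/tmp/people.csv', newline='') as f:
    for row in csv.DictReader(f):
        print(f"ok: name={row['name']} age={row['age']}")
EOF
```

The Python loop prints name=Smith, John age=42, because the csv module treats the quoted span as a single field and strips the quotes.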