
Performance Tuning Ansible

When playbooks take 30 minutes against 50 hosts, you're leaving a 5× speedup on the table. The five tuning levers, the Mitogen plugin that's almost cheating, fact caching at scale, and the strategy: free pattern that stops one slow host from stalling the rest.

Ansible's defaults are conservative: 5 forks, no SSH multiplexing, no pipelining, gather facts on every run, sync hosts at every task. Against a small lab those defaults are fine. Against 50 production hosts running 200 tasks, you'll watch the playbook chew through 30 minutes of mostly idle SSH overhead.

5 levers that turn 30-min runs into 5-min runs:

1. pipelining — ~50% fewer SSH ops (pipelining = True)
2. ControlPersist — reuse SSH connections (ssh_args = ...)
3. forks = N — parallelism; default 5, 25–50 typical
4. fact_caching — skip the setup module (jsonfile / redis)
5. strategy: free — don't sync hosts; 2–3× speedup

Bonus: Mitogen — third-party plugin, 3–7× faster (strategy = mitogen_linear)

Expected impact (50-host fleet, 200-task playbook):

Baseline: ~28 minutes
+ pipelining + ControlPersist: ~14 min
+ forks=25: ~6 min
+ free + mitogen: ~3 min

Without pipelining, every task does THREE SSH operations: SFTP the module file, SSH to chmod it, SSH again to run it. With pipelining, the module is piped over the existing SSH connection's stdin and runs inline — one SSH op per task instead of three.

INI — ansible.cfg
[ssh_connection]
pipelining = True
⚠ Warning: Requires requiretty to be off in /etc/sudoers on managed nodes. OEL 8 ships with this disabled. RHEL 6 / CentOS 6 nodes need Defaults !requiretty added or pipelining will fail with a cryptic error.
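On those legacy nodes the fix is a single sudoers line. A sketch (the drop-in filename is illustrative; always edit sudoers through visudo so a syntax error can't lock you out):

```text
# /etc/sudoers.d/99-ansible-pipelining
# Let sudo run without a controlling tty so pipelined modules work
Defaults !requiretty
```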

OpenSSH can multiplex multiple sessions through one TCP connection. With ControlPersist, the connection stays open between tasks (configurable timeout). One TCP handshake per host for the whole playbook instead of one per task.

INI — ansible.cfg
[ssh_connection]
ssh_args = -o ControlMaster=auto -o ControlPersist=300s -o PreferredAuthentications=publickey
control_path_dir = ~/.ansible/cp

Combined with pipelining, this collapses the per-task overhead from three SSH operations, each opening its own fresh TCP connection, to a single operation over an already-established connection. For SSH-bound playbooks that's roughly a 5× speedup on its own.

Default is 5. That means at most 5 hosts get a task in parallel; the other 45 wait. Bump this to match your control node's resources:

INI — forks scaling
[defaults]
forks = 25     # default 5 → bump for any fleet >10 hosts
| Fleet size | Reasonable forks | Constraint |
|---|---|---|
| 5–20 hosts | 10–20 | Control node CPU |
| 20–100 hosts | 25–50 | SSH server MaxStartups |
| 100–500 hosts | 50–100 | Bastion bandwidth |
| 500+ hosts | Use AWX/Tower or split inventories | Single-node Ansible bottlenecks |
💡 Tip: Set ANSIBLE_FORKS as an env var instead of editing ansible.cfg if you want a per-run override. ANSIBLE_FORKS=50 ansible-playbook ...
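The effect of forks is easy to estimate: each task runs in ceil(hosts / forks) serial batches, so connection-bound wall-clock time scales with the batch count. A back-of-the-envelope illustration (numbers are hypothetical):

```shell
hosts=50
for forks in 5 25 50; do
  # each task must run ceil(hosts/forks) times before the play moves on
  batches=$(( (hosts + forks - 1) / forks ))
  echo "forks=$forks -> $batches batches per task"
done
# forks=5 -> 10 batches per task
# forks=25 -> 2 batches per task
# forks=50 -> 1 batches per task
```

Past one batch per task, raising forks buys nothing — that's why 500-host fleets hit other bottlenecks first.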

The setup module (fact gathering) runs at the start of every play. It's not free — it shells out to dozens of small commands per host. Cache facts to skip this on subsequent runs:

INI — fact caching
[defaults]
gathering = smart                          # use cache if fresh
fact_caching = jsonfile                    # or 'redis' / 'memcached'
fact_caching_connection = ./.fact_cache
fact_caching_timeout = 7200                # 2 hours

Skipping fact gathering on a 50-host playbook saves ~2 minutes. For large fleets, use fact_caching = redis with a real Redis server so all team members and CI share the same cache.
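For the Redis variant, the connection string is host:port:db. A sketch (the hostname is illustrative; assumes a reachable Redis instance with no auth):

```ini
[defaults]
gathering = smart
fact_caching = redis
fact_caching_connection = cache01.example.com:6379:0
fact_caching_timeout = 7200
```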

Default strategy is linear: every host finishes a task before any host moves to the next. This means a single slow host (network glitch, busy system) stalls the whole fleet.

With strategy: free, each host races through the play independently — host 1 can be on task 50 while host 2 is still on task 30. Wall-clock time drops to roughly one slowest host's full run, instead of paying the slowest host's delay at every single task.

YAML — strategy: free
---
- hosts: webservers
  strategy: free        # don't sync hosts
  tasks:
    - name: Apply patch
      ansible.builtin.dnf:
        name: "*"
        state: latest
    - name: Reboot
      ansible.builtin.reboot:
        reboot_timeout: 600
⚠ Warning: strategy: free doesn't work when later tasks depend on facts from earlier hosts in the same play (gathered at runtime). For most patching/upgrading work it's safe; for cluster orchestration it's not.
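An example of the unsafe pattern (a hypothetical play): under strategy: free nothing guarantees db1 has finished its fact gathering by the time a fast web host renders this template, so the hostvars lookup can race and come up undefined.

```yaml
- hosts: webservers
  strategy: free
  tasks:
    - name: Render config that reads another host's facts   # racy under 'free'
      ansible.builtin.template:
        src: app.conf.j2   # template references hostvars['db1'].ansible_default_ipv4
        dest: /etc/app/app.conf
```

Keep plays like this on the default linear strategy, or pre-gather facts for db1 in an earlier play.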

Mitogen is a third-party plugin that replaces Ansible's connection and module-execution layer with a far more efficient one. It starts the remote Python interpreter once per host and reuses it across tasks, instead of forking a fresh interpreter and tearing it down for every task.

BASH — install Mitogen
pip install mitogen

# Find install path
python3 -c "import mitogen, os; print(os.path.dirname(mitogen.__file__))"
# /home/user/.local/lib/python3.9/site-packages/mitogen
INI — enable Mitogen in ansible.cfg
[defaults]
strategy_plugins = /home/user/.local/lib/python3.9/site-packages/ansible_mitogen/plugins/strategy
strategy = mitogen_linear

# Or for free-style + Mitogen
# strategy = mitogen_free
💡 Tip: Mitogen typically gives 3–7× additional speedup on top of pipelining + ControlPersist. The catch: it's not officially supported by Red Hat and may break on Ansible upgrades. Use it on internal tooling, not on regulated production.
INI — production ansible.cfg for fast, large-fleet playbooks
[defaults]
forks = 25
gathering = smart
fact_caching = jsonfile
fact_caching_connection = ./.fact_cache
fact_caching_timeout = 7200
strategy = mitogen_linear            # or 'linear' if Mitogen unavailable
strategy_plugins = ~/.local/lib/python3.9/site-packages/ansible_mitogen/plugins/strategy

# nicer output
stdout_callback = yaml
callbacks_enabled = profile_tasks, timer

[ssh_connection]
pipelining = True
ssh_args = -o ControlMaster=auto -o ControlPersist=300s -o PreferredAuthentications=publickey
control_path_dir = ~/.ansible/cp
BASH — profile a playbook to find the slow tasks
# the profile_tasks / timer callbacks enabled in ansible.cfg print
# per-task timings at the end of the run
ansible-playbook site.yml

# At the end of the run:
PLAY RECAP ******************************************************
Sunday 03 May 2026  10:14:23 +0000 (0:00:00.012)
=================================================================
Install MySQL packages -------------------------------- 142.31s
mysql_upgrade ------------------------------------------- 89.45s
Configure my.cnf --------------------------------------- 24.73s
...

# The first 3 tasks above account for 70% of total runtime.
# Optimise those, ignore the rest.
✅ Tip: With pipelining + ControlPersist + forks=25 + fact caching, a 30-minute playbook routinely drops to 5–6 minutes. Add Mitogen and you're at 2–3 minutes. The control node CPU becomes the bottleneck before the network does.