With cloud fleets, IPs and host counts change daily. A static hosts.ini goes stale
the moment auto-scaling kicks in. Dynamic inventory queries the truth source
(AWS, Azure, vSphere, Kubernetes) and builds the host list fresh on every run.
Install the AWS collection, then drop a config file telling the plugin what to query:
```bash
# Install the collection
ansible-galaxy collection install amazon.aws
# Install the Python deps
pip install boto3 botocore
# Set credentials (AWS CLI conventions)
export AWS_ACCESS_KEY_ID="..."
export AWS_SECRET_ACCESS_KEY="..."
export AWS_REGION="ap-south-1"
```
```yaml
---
# The filename MUST end with .aws_ec2.yml or .aws_ec2.yaml,
# and the plugin: key tells Ansible which plugin parses this file
plugin: amazon.aws.aws_ec2
regions:
  - ap-south-1
  - us-east-1
# Filter to live instances only
filters:
  instance-state-name: running
  tag:Environment: production
# Build groups from tags and placement
keyed_groups:
  - prefix: tag
    key: tags    # creates tag_Role_db, tag_Role_web, tag_Environment_production, ...
  - prefix: az
    key: placement.availability_zone
# How to address each host (use private IPs since we're in a VPC)
hostnames:
  - private-ip-address
# Compose extra vars from instance attributes
compose:
  ansible_host: private_ip_address
  instance_type: instance_type
  ec2_az: placement.availability_zone
```
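The naming rule behind keyed_groups is easy to internalize with a few lines of Python. This is a simplified sketch, not the plugin's actual code (real sanitization behavior is configurable), showing how a prefix plus tag key plus tag value becomes a group name like tag_Role_db:

```python
import re

def keyed_group(prefix, *parts, separator="_"):
    # Join prefix and value parts with the separator, then replace any
    # character that is not valid in an Ansible group name with "_"
    raw = separator.join([prefix, *[str(p) for p in parts]])
    return re.sub(r"[^A-Za-z0-9_]", "_", raw)

print(keyed_group("tag", "Role", "db"))      # tag_Role_db
print(keyed_group("az", "ap-south-1a"))      # az_ap_south_1a
```

Note how the availability-zone hyphens are sanitized to underscores; that is why dynamically generated group names are always safe to reference in playbooks.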
```bash
# Same -i flag, just point at the YAML config
ansible-playbook site.yml -i inventories/aws_ec2.yml

# Verify what came back
ansible-inventory -i inventories/aws_ec2.yml --graph
# Sample output:
# @all:
# |--@aws_ec2:
# | |--ip-10-0-1-12.ap-south-1.compute.internal
# | |--ip-10-0-1-15.ap-south-1.compute.internal
# |--@tag_Role_db:
# | |--ip-10-0-1-12.ap-south-1.compute.internal
```
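These generated groups target exactly like static ones. A minimal playbook against the tag_Role_db group (the package task is hypothetical, just to show the shape):

```yaml
- name: Patch the database fleet
  hosts: tag_Role_db
  become: true
  tasks:
    - name: Ensure postgresql is present
      ansible.builtin.package:
        name: postgresql
        state: present
```

If auto-scaling adds a third instance tagged Role=db tomorrow, the same playbook picks it up with zero inventory edits.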
Create group_vars/tag_Role_db.yml and Ansible auto-applies that file to every instance tagged Role=db. No manual mapping needed.

The same workflow applies across providers; only the plugin changes:

| Provider | Plugin | Collection |
|---|---|---|
| AWS EC2 | amazon.aws.aws_ec2 | amazon.aws |
| Azure | azure.azcollection.azure_rm | azure.azcollection |
| GCP | google.cloud.gcp_compute | google.cloud |
| OpenStack | openstack.cloud.openstack | openstack.cloud |
| vSphere | community.vmware.vmware_vm_inventory | community.vmware |
| Kubernetes | kubernetes.core.k8s | kubernetes.core |
| DigitalOcean | community.digitalocean.digitalocean | community.digitalocean |
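For example, switching to Azure changes only the plugin line and the filename convention. A minimal sketch (see the azure_rm plugin docs for auth and grouping options):

```yaml
# inventories/prod.azure_rm.yml (filename must end in azure_rm.yml or azure_rm.yaml)
plugin: azure.azcollection.azure_rm
# Group VMs by Azure location, analogous to the az grouping above
keyed_groups:
  - prefix: loc
    key: location
```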
Before inventory plugins existed (Ansible 2.4 and earlier), people wrote shell or Python scripts that printed JSON. Scripts still work — point -i at any executable file and Ansible runs it, expecting JSON output:
```python
#!/usr/bin/env python3
"""
Reads our internal CMDB and prints inventory JSON Ansible can consume.
Make this file executable: chmod +x from_db.py
"""
import json
import sys

import requests

resp = requests.get("https://cmdb.example.com/api/hosts").json()

inv = {"_meta": {"hostvars": {}}}
for h in resp["hosts"]:
    role = h["role"]
    if role not in inv:
        inv[role] = {"hosts": []}
    inv[role]["hosts"].append(h["fqdn"])
    inv["_meta"]["hostvars"][h["fqdn"]] = {
        "ansible_host": h["ip"],
        "datacenter": h["dc"],
    }

# Required interface: --list for full inventory, --host <name> for one host's vars
if "--list" in sys.argv:
    print(json.dumps(inv))
elif "--host" in sys.argv:
    name = sys.argv[sys.argv.index("--host") + 1]
    print(json.dumps(inv["_meta"]["hostvars"].get(name, {})))
```
Plugins remain the better default: they support caching and keyed_groups, pick up group_vars properly, and they don't fork a subprocess on every run. Use scripts only when no plugin exists for your source.
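To see the JSON contract concretely, here is the script's core loop run against a hypothetical two-host CMDB payload (an offline stand-in for the requests.get call):

```python
import json

# Hypothetical CMDB response, standing in for requests.get(...).json()
cmdb = {"hosts": [
    {"fqdn": "db1.example.com",  "ip": "10.0.1.12", "role": "db",  "dc": "bom1"},
    {"fqdn": "web1.example.com", "ip": "10.0.1.15", "role": "web", "dc": "bom1"},
]}

inv = {"_meta": {"hostvars": {}}}
for h in cmdb["hosts"]:
    inv.setdefault(h["role"], {"hosts": []})["hosts"].append(h["fqdn"])
    inv["_meta"]["hostvars"][h["fqdn"]] = {
        "ansible_host": h["ip"],
        "datacenter": h["dc"],
    }

# Groups "db" and "web" each list their hosts;
# _meta.hostvars carries every host's variables in one payload
print(json.dumps(inv, indent=2))
```

Populating _meta.hostvars up front matters: without it, Ansible calls the script once per host with --host, which gets slow fast.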
Real projects often have a static bastion host plus dynamically discovered managed
nodes. Point -i at a directory containing both — Ansible reads them all and merges:
```
inventories/production/
├── hosts.ini             # static — bastions, jumphosts, fixed nodes
├── aws_ec2.yml           # dynamic — production EC2 fleet
├── group_vars/
│   ├── all.yml
│   ├── tag_Role_db.yml   # applies to dynamic EC2 group
│   └── bastion.yml       # applies to static group
└── host_vars/
```

```bash
ansible-playbook site.yml -i inventories/production/
# → loads BOTH static hosts AND EC2 instances, with merged groups
```
Dynamic queries are slow — every ansible-playbook hits the cloud API. Enable
caching to reuse the same query for a configurable time window:
```ini
# ansible.cfg
[defaults]
inventory = ./inventories/production/

[inventory]
cache = True
cache_plugin = jsonfile
cache_timeout = 3600    # seconds — one hour
cache_connection = /tmp/ansible_inventory_cache
```
Pass --flush-cache at the CLI to bypass the cache when you need a fresh look.
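What cache_timeout buys you can be sketched in a few lines; this is illustrative logic only, not the jsonfile plugin's actual implementation:

```python
import json
import os
import tempfile
import time

CACHE = os.path.join(tempfile.gettempdir(), "demo_inventory_cache.json")
TIMEOUT = 3600  # seconds, mirroring cache_timeout above

def get_inventory(fetch):
    """Return the cached inventory if it is fresh, else call fetch() and re-cache."""
    if os.path.exists(CACHE) and time.time() - os.path.getmtime(CACHE) < TIMEOUT:
        with open(CACHE) as f:
            return json.load(f)
    data = fetch()  # the slow cloud API query
    with open(CACHE, "w") as f:
        json.dump(data, f)
    return data

# Demo: start from a clean state, then query twice
if os.path.exists(CACHE):
    os.remove(CACHE)
calls = []
fetch = lambda: calls.append(1) or {"all": {"hosts": ["h1"]}}
first = get_inventory(fetch)
second = get_inventory(fetch)  # served from cache; the API is not hit again
print(len(calls))  # 1
```

Flushing the cache is then just deleting the cached file, which is effectively what a fresh, uncached run gives you.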