Comprehensive Recon Guide
🆕 Enhanced May 2, 2026 - Updated with cloud-native techniques, container/serverless discovery, modern API reconnaissance, and automated attack surface mapping from comprehensive 2026 research.
A practitioner’s reference for web reconnaissance — attack surface discovery, subdomain enumeration, live host probing, content discovery, JS mining, cloud asset hunting, automation, and continuous monitoring. Enhanced for 2026 with modern cloud infrastructure discovery, ML-powered automation, and API reconnaissance techniques.
Table of Contents
- Fundamentals
- Scope & Target Profiling
- Subdomain Enumeration
- DNS Brute Force & Permutation
- Live Host Discovery & HTTP Probing
- Port Scanning
- URL & Endpoint Crawling
- JavaScript Analysis
- Content & Directory Discovery
- Parameter Discovery
- Technology Fingerprinting
- Cloud Asset Discovery
- GitHub & Code Leak Hunting
- ASN & Infrastructure Expansion
- Container & Serverless Discovery
- Modern API Reconnaissance
- ML-Powered Automation
- Wordlist Resources
- Automation Pipelines
- Continuous Monitoring
- Real-World Recon Wins
- Quick Reference
1. Fundamentals
Recon is the bulk of the work in offensive security. The researchers who earn six figures aren’t running more tools than everyone else; they’re running them in smarter pipelines, feeding the output of one into the next, and manually reviewing the long tail that automation misses. Every hour spent deepening the asset inventory pays off when hunting begins: more subdomains mean more parameters, more endpoints, more code paths, and more chances for a bug nobody else has seen.
The three classes of recon:
| Class | Description | Example |
|---|---|---|
| Passive | No packets sent to target — only public data sources | crt.sh, Shodan, Chaos, Wayback, Google dorks |
| Active | Direct interaction with target infrastructure | DNS brute force, HTTP probing, port scans, content fuzzing |
| Semi-active | Targets third-parties that hold target data | GitHub scraping, pastebin scraping, archive.org |
Passive first, then active. Passive sources give you free intel with zero detection risk and zero scope violations. Active enumeration should only begin after passive has been exhausted — you use passive subdomains as seeds for permutation, passive URLs as seeds for parameter mining, and passive tech stack data to choose the right active wordlist.
The recon pipeline (end-to-end):
Seed domains
↓ passive + active enumeration
Subdomains
↓ dnsx resolve + httpx probe
Live hosts
↓ naabu/masscan port scan
Open services
↓ katana + waybackurls + gau crawl
URLs
↓ unfurl / gf / LinkFinder
Parameters, endpoints, secrets
↓ ffuf / nuclei / manual review
Attack surface map
Mindset rules:
- Scope is not a limit — it’s a filter. Always map the entire organization first, then filter to what’s in-scope.
- Everything is resumable. State your recon in files, diff against yesterday, and alert on new assets.
- Automation finds the obvious; manual review finds the bounty. Eyeball every new subdomain at least once.
- Save raw outputs. The dataset you enumerate today is worth re-running against tomorrow’s wordlists.
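The "diff against yesterday" rule can be sketched with standard tools. This is a minimal illustration, assuming one asset per line; the helper name new_assets and the file names in the usage note are hypothetical, not part of any tool mentioned in this guide.

```shell
# new_assets OLD NEW: print lines present in NEW but not in OLD.
# Inputs may be unsorted; comm(1) needs sorted input, so both files
# are sorted into temporary copies first.
new_assets() {
    old_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || { rm -f "$old_sorted"; return 1; }
    LC_ALL=C sort -u "$1" > "$old_sorted"
    LC_ALL=C sort -u "$2" > "$new_sorted"
    # -1 hides lines unique to OLD, -3 hides lines common to both
    LC_ALL=C comm -13 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}
```

Typical use would be something like `new_assets subs-yesterday.txt subs-today.txt > new-subs.txt`, with the result fed to whatever alerting you already have.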
2. Scope & Target Profiling
Before any enumeration, you need to know what you’re looking at and what you’re allowed to touch.
Scope intake checklist
| Item | Why it matters |
|---|---|
| In-scope domains (exact vs wildcard) | Determines which subdomains are eligible |
| Out-of-scope carveouts | Avoid N/A submissions and bans |
| Allowed testing types | Active scanning forbidden on many programs |
| Rate limiting rules | Saves you from getting blocked mid-recon |
| Accepted vulnerability classes | Don’t hunt bugs that get auto-closed |
| Third-party service rules | SendGrid, Intercom, Zendesk often out of scope |
Target profiling sources
- crt.sh — certificate transparency, gives wildcard certs and sibling domains
- BGPView / bgp.he.net — find company ASN and all IP ranges owned
- SecurityTrails / DNSDumpster — historical DNS records
- Whoxy / WhoisXML — reverse WHOIS lookup across TLDs
- LinkedIn / Crunchbase — subsidiaries, acquisitions, product names that become subdomains
- GitHub org page — gives you the company’s org name which feeds dorking
- Trademark filings — sometimes leak internal project codenames
Seed expansion
A single apex domain is rarely the whole story. Before enumeration, expand seeds via:
## Find other domains owned by the same org
amass intel -org "Target Corp"
amass intel -whois -d target.com
amass intel -asn 13335
amass intel -cidr 192.0.2.0/24
## Reverse whois via viewdns.info or whoxy
curl -s "https://api.whoxy.com/?key=$KEY&reverse=whois&email=admin@target.com"
Feed the resulting domain list into the rest of the pipeline as a single flat file (seeds.txt).
3. Subdomain Enumeration
Subdomain enumeration is the bedrock of recon. Every additional subdomain is a new host with its own code paths, its own auth model, its own tech stack, and its own chance of being forgotten by the dev team. Treat this phase like an exhaustive search — pull from as many independent sources as possible, dedupe, and re-resolve.
Passive tools
| Tool | Strengths | Command |
|---|---|---|
| subfinder | Fast, 30+ passive sources, API-key aware | subfinder -d target.com -all -silent |
| amass (passive) | Deepest source coverage, graph storage | amass enum -passive -d target.com |
| assetfinder | Minimal, fast, good for pipelines | assetfinder --subs-only target.com |
| chaos (ProjectDiscovery) | ProjectDiscovery’s curated dataset | chaos -d target.com -silent |
| crt.sh | CT log scraper, finds wildcard certs | curl -s "https://crt.sh/?q=%25.target.com&output=json" |
| github-subdomains | Scrapes subdomains from GitHub code search | github-subdomains -d target.com -t $TOKEN |
| bbot | Swiss army knife, 80+ modules | bbot -t target.com -f subdomain-enum |
Subfinder in depth
subfinder is the default passive tool — it’s fast, supports API key configuration for premium sources, and outputs one subdomain per line for easy piping.
## Basic
subfinder -d target.com -silent -o subs.txt
## All sources (slower but more thorough)
subfinder -d target.com -all -recursive -silent
## From multiple domains
subfinder -dL seeds.txt -all -silent -o subs.txt
## JSON output for enrichment
subfinder -d target.com -oJ -o subs.json
## Pipe directly into httpx
subfinder -d target.com -silent | httpx -silent
Configure API keys in ~/.config/subfinder/provider-config.yaml — Chaos, SecurityTrails, Censys, Shodan, VirusTotal, GitHub, BinaryEdge, and WhoisXML all materially improve coverage.
Amass in depth
amass is slower than subfinder but pulls from more sources and supports active enumeration, alteration generation, and graph-based asset tracking.
## Passive
amass enum -passive -d target.com -o amass.txt
## Active (adds DNS brute, zone walks, name alterations)
amass enum -active -d target.com -brute -o amass.txt
## Include unresolvable (internal leak hints)
amass enum -d target.com -include-unresolvable
## Bulk scan multiple domains
amass enum -df seeds.txt -o amass-multi.txt
## Track changes over time
amass track -d target.com -dir recon-data
## Visualize as graph
amass viz -d3 -dir recon-data
## Intel pivots
amass intel -org "Target Corp"
amass intel -asn 13335
amass intel -addr 192.0.2.10
amass intel -cidr 192.0.2.0/24
amass intel -whois -d target.com
Amass supports external datasource modules via config (Netlas, SecurityTrails, Shodan, Censys) — always populate the config for real coverage.
Certificate transparency
CT logs record nearly every certificate issued by a publicly trusted CA, which means almost every subdomain that ever got a valid public cert shows up there. This is one of the cleanest passive sources.
## crt.sh basic
curl -s "https://crt.sh/?q=%25.target.com&output=json" \
| jq -r '.[].name_value' \
| sed 's/\*\.//g' \
| sort -u > crtsh.txt
## Historical wildcards
curl -s "https://crt.sh/?q=%25.%25.target.com&output=json" | jq -r '.[].name_value'
## Find email addresses in CT
curl -s "https://crt.sh/?q=%25@target.com&output=json" | jq -r '.[].name_value'
## Alternative: Censys, Facebook CT, Google CT API
Merging sources
Always run multiple tools and merge — no single source has full coverage.
{
subfinder -d target.com -all -silent
assetfinder --subs-only target.com
amass enum -passive -d target.com
chaos -d target.com -silent
curl -s "https://crt.sh/?q=%25.target.com&output=json" | jq -r '.[].name_value' | sed 's/\*\.//g'
} | sort -u > all-subs.txt
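Sources disagree on case, trailing dots, and wildcard prefixes, so a plain sort -u can leave duplicates and out-of-scope names in the merged file. A minimal normalisation sketch, assuming one hostname per line; normalize_subs is a hypothetical helper name, not an existing tool.

```shell
# normalize_subs DOMAIN: read raw hostnames on stdin and print a clean,
# deduplicated list restricted to DOMAIN and its subdomains.
normalize_subs() {
    tr 'A-Z' 'a-z' \
    | sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' \
          -e 's/^\*\.//' -e 's/\.$//' \
    | awk -v d="$1" '
        # keep the apex itself
        $0 == d { print; next }
        # keep names that end in ".DOMAIN" (rejects look-alikes such as eviltarget.com)
        length($0) > length(d) &&
        substr($0, length($0) - length(d)) == ("." d) { print }
      ' \
    | LC_ALL=C sort -u
}
```

It can sit at the end of the merge block in place of the bare sort, for example `... | normalize_subs target.com > all-subs.txt`.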
4. DNS Brute Force & Permutation
Passive sources miss internal-only subdomains and anything that never got a public cert. Brute force and permutation fill the gap.
puredns
puredns wraps massdns with accurate wildcard filtering — the gold standard for brute forcing.
## Brute force with a wordlist
puredns bruteforce wordlist.txt target.com \
--resolvers resolvers.txt \
--write results.txt
## Resolve a list of guessed names
puredns resolve candidates.txt --resolvers resolvers.txt
## Resolvers file: public-dns.info/nameservers-all.txt
Permutation (alterations)
Given known subdomains, generate plausible variants and resolve them. Frequently surfaces dev-api, staging-portal, api-v2 variants that nobody lists publicly.
## dnsgen — pattern-based mutations
cat subs.txt | dnsgen - | puredns resolve -r resolvers.txt
## altdns
altdns -i subs.txt -o altdns.txt -w words.txt -r -s results.txt
## gotator — pattern permutator
gotator -sub subs.txt -perm words.txt -depth 1 -numbers 5 | puredns resolve
Good permutation wordlists: best-dns-wordlist.txt from Assetnote, dnsgen built-ins, and a custom list of your target’s product names.
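To make the idea concrete, here is a rough sketch of the three simplest permutation patterns (word as prefix, word as suffix, word as an extra label). It is not how dnsgen, altdns, or gotator work internally; permute_subs and the chosen patterns are assumptions for illustration only.

```shell
# permute_subs DOMAIN WORDS_FILE: read known subdomains of DOMAIN on stdin
# and print candidate names built from the leftmost label and each word.
permute_subs() {
    awk -v d="$1" '
        NR == FNR { if ($0 != "") words[++n] = $0; next }   # first file: words
        {
            suffix = "." d
            # skip anything that is not a subdomain of DOMAIN
            if (length($0) <= length(suffix)) next
            if (substr($0, length($0) - length(suffix) + 1) != suffix) next
            host = substr($0, 1, length($0) - length(suffix))   # "api" or "api.eu"
            dot = index(host, ".")
            label = (dot ? substr(host, 1, dot - 1) : host)
            rest  = (dot ? substr(host, dot) : "")
            for (i = 1; i <= n; i++) {
                w = words[i]
                print w "-" label rest suffix     # dev-api.target.com
                print label "-" w rest suffix     # api-dev.target.com
                print w "." host suffix           # dev.api.target.com
            }
        }
    ' "$2" - | LC_ALL=C sort -u
}
```

The output is only a candidate list; it still has to go through a resolver, for example `permute_subs target.com words.txt < subs.txt | puredns resolve -r resolvers.txt`.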
Resolvers
Always use a curated, validated resolver list. Bad resolvers cause false positives.
## Fetch and validate resolvers
wget https://raw.githubusercontent.com/trickest/resolvers/main/resolvers.txt
dnsvalidator -tL resolvers.txt -threads 200 -o validated.txt
5. Live Host Discovery & HTTP Probing
A subdomain list is useless until you know which hosts are live and what they serve.
dnsx — DNS resolution at scale
## Resolve and drop dead entries
cat subs.txt | dnsx -silent -a -resp > resolved.txt
## Only return live subdomains
dnsx -l subs.txt -silent > live-dns.txt
## Wildcard detection
dnsx -l subs.txt -wd target.com -silent
## Retrieve CNAME chain (finds takeover candidates)
dnsx -l subs.txt -cname -resp -silent
## Grab multiple record types
dnsx -l subs.txt -a -aaaa -cname -mx -ns -txt -silent -json
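The CNAME output becomes useful once it is filtered down to records that point at third-party hosting. The sketch below assumes uncoloured lines of the form "host [CNAME] [target]" or "host [target]" (the exact layout depends on the dnsx version), and the suffix list is a small illustrative sample rather than a complete takeover list; takeover_candidates is a hypothetical helper name.

```shell
# takeover_candidates: read "host [cname]" lines on stdin and print
# "host cname" pairs whose CNAME target sits on a third-party service.
takeover_candidates() {
    tr -d '[]' \
    | awk '
        NF >= 2 {
            host = $1
            cname = tolower($NF)
            sub(/\.$/, "", cname)      # drop trailing dot
            if (cname ~ /\.(github\.io|herokuapp\.com|s3\.amazonaws\.com|azurewebsites\.net|cloudapp\.net)$/)
                print host, cname
        }'
}
```

A hit only means the record points at a service where dangling names are sometimes claimable; each one still needs a manual check that the backing resource is actually gone.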
httpx — HTTP/HTTPS probing
httpx is the bridge between DNS and HTTP-layer recon — it probes, fingerprints, and grabs metadata in a single pass.
## Basic alive check
cat live-dns.txt | httpx -silent > alive.txt
## Rich metadata (title, status, tech, IP, CDN)
httpx -l live-dns.txt -title -sc -td -ip -cdn -server -silent -o probe.txt
## Multiple ports
httpx -l live-dns.txt -p 80,443,8080,8443,8000,3000,5000,9000 -silent
## Screenshot every live host
httpx -l live-dns.txt -screenshot -silent
## Filter by status code
httpx -l live-dns.txt -mc 200,301,302,401,403 -silent
## Match content type
httpx -l live-dns.txt -mct "application/json"
## Tech detection (Wappalyzer dataset)
httpx -l live-dns.txt -td -silent -json | jq '.tech'
## Favicon hash for clustering (mmh3)
httpx -l live-dns.txt -favicon -silent
The favicon hash is particularly useful — Shodan indexes favicon hashes, so if httpx returns a hash you can pivot in Shodan to find every other host on the internet serving the same application.
6. Port Scanning
Web apps on port 80/443 are the obvious targets, but devs love to run admin panels, debug dashboards, and internal APIs on unusual ports.
naabu — fast SYN/CONNECT scanner
## Top 1000 ports (feed bare hostnames or IPs, not the scheme-prefixed URLs in alive.txt)
naabu -l live-dns.txt -top-ports 1000 -silent
## Full range
naabu -host target.com -p - -rate 5000
## Pipe into httpx
naabu -l live-dns.txt -top-ports 1000 -silent | httpx -silent
masscan — internet-scale SYN scanner
## Full port sweep on a CIDR
sudo masscan -p1-65535 192.0.2.0/24 --rate=10000 -oG masscan.out
## Specific high-value ports across large ranges
sudo masscan -p22,80,443,3306,5432,6379,8080,8443,9200,27017 \
198.18.0.0/15 --rate=20000
nmap — deep service/version detection
After masscan/naabu give you the open ports, use nmap for service detection.
## Service version + default scripts on discovered ports
nmap -sV -sC -p 22,80,443,8080 -iL hosts.txt -oA nmap-scan
## Top ports with aggressive OS detection
nmap -A -T4 --top-ports 1000 -iL hosts.txt
## Vulnerability scripts
nmap -sV --script vuln -iL hosts.txt
rustscan
rustscan is a fast TCP connect scanner that pipes directly into nmap for service detection, which makes it good for single-target deep dives.
rustscan -a target.com --ulimit 5000 -- -sV -sC
Ports worth always scanning
22 SSH
80 HTTP
443 HTTPS
2375 Docker API (unauthenticated)
3000 Grafana, Node dev
3306 MySQL
3389 RDP
5000 Flask dev
5432 Postgres
5601 Kibana
6379 Redis
7001 WebLogic
8000 HTTP alt
8080 HTTP alt / Jenkins
8443 HTTPS alt
8888 Jupyter
9000 SonarQube / PHP-FPM
9090 Prometheus
9200 Elasticsearch
11211 Memcached
15672 RabbitMQ management
27017 MongoDB
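A list like this is easier to maintain as a file than as a hand-typed -p argument. A small sketch, assuming the list is saved with the port number as the first field on each line; ports_csv and ports.txt are hypothetical names.

```shell
# ports_csv: read "PORT description" lines on stdin and print a sorted,
# deduplicated, comma-separated port list for a scanner -p argument.
ports_csv() {
    awk '$1 ~ /^[0-9]+$/ && $1 >= 1 && $1 <= 65535 { print $1 }' \
    | sort -n -u \
    | paste -s -d, -
}
```

For example: `naabu -l live-dns.txt -p "$(ports_csv < ports.txt)" -silent`.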
7. URL & Endpoint Crawling
URLs are where the vulnerabilities live. You want three sources running in parallel: a passive archive (Wayback / Common Crawl), an active crawler (katana), and a URL extractor from JS.
katana — modern active crawler
## Standard crawl
katana -u https://target.com -d 5 -silent
## From a list
katana -list alive.txt -d 3 -silent -o urls.txt
## Headless with JS rendering
katana -u https://target.com -headless -system-chrome -silent
## Ignore query-parameter variants, keep scope to the root domain
katana -u https://target.com -iqp -fs rdn -silent
## Parse JavaScript files for endpoints, write JSONL output
katana -u https://target.com -jc -j -silent -o urls.json
waybackurls — historical URLs
## Basic
echo target.com | waybackurls > wayback.txt
## From all subdomains
cat subs.txt | waybackurls | sort -u > wayback.txt
gau — getallurls (Wayback + CommonCrawl + OTX + URLScan)
echo target.com | gau --threads 10 > gau.txt
## Filter by extension
echo target.com | gau --blacklist png,jpg,gif,css,woff | sort -u
hakrawler
echo https://target.com | hakrawler -d 3 -subs
Merging crawl sources
{
cat alive.txt | katana -silent -d 3
cat subs.txt | waybackurls
cat subs.txt | gau --threads 5
} | sort -u > all-urls.txt
## Filter to interesting URLs
cat all-urls.txt | grep -E "\.(json|xml|js|php|aspx|jsp|env|bak|config|sql)$"
cat all-urls.txt | grep -E "api/|admin|internal|debug|swagger|graphql"
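Archive sources return thousands of URLs that differ only in parameter values, which bloats every later step. The sketch below keeps the first URL for each combination of path and parameter names. It is a simplified take on what dedicated URL dedupers do, and dedupe_urls is a hypothetical helper name.

```shell
# dedupe_urls: read URLs on stdin; print the first URL seen for each unique
# combination of scheme+host+path and sorted parameter names.
dedupe_urls() {
    awk '
    {
        url = $0
        sub(/#.*$/, "", url)                       # drop fragment
        base = url; query = ""
        q = index(url, "?")
        if (q) { base = substr(url, 1, q - 1); query = substr(url, q + 1) }
        n = split(query, pair, "&")
        for (i = 1; i <= n; i++) sub(/=.*$/, "", pair[i])   # keep names only
        # insertion sort so a=1&b=2 and b=2&a=1 share a key
        for (i = 2; i <= n; i++) {
            v = pair[i]
            for (j = i - 1; j >= 1 && pair[j] > v; j--) pair[j + 1] = pair[j]
            pair[j + 1] = v
        }
        key = base "?"
        for (i = 1; i <= n; i++) key = key pair[i] "&"
        if (!(key in seen)) { seen[key] = 1; print $0 }
    }'
}
```

Running `dedupe_urls < all-urls.txt > unique-urls.txt` before the gf and fuzzing steps keeps one representative per endpoint shape while preserving the original order.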
gf — pattern classifier
gf applies named regex patterns to URL lists to quickly bucket potential bug candidates.
cat all-urls.txt | gf ssrf > ssrf-candidates.txt
cat all-urls.txt | gf xss > xss-candidates.txt
cat all-urls.txt | gf sqli > sqli-candidates.txt
cat all-urls.txt | gf redirect > redirect-candidates.txt
cat all-urls.txt | gf lfi > lfi-candidates.txt
cat all-urls.txt | gf idor > idor-candidates.txt
8. JavaScript Analysis
Modern apps hide half their API surface inside JavaScript bundles. Every JS file is a map of internal endpoints, buried parameters, legacy routes, hardcoded tokens, and feature flags.
Extract all JS files
## From Wayback
echo target.com | waybackurls | grep -Ei "\.js(\?|$)" | sort -u > js.txt
## From katana
katana -u https://target.com -silent | grep -Ei "\.js(\?|$)" | sort -u >> js.txt
## Verify alive
cat js.txt | httpx -mc 200 -silent > live-js.txt
LinkFinder — endpoint extraction
## Single file
python3 linkfinder.py -i https://target.com/app.js -o cli
## Batch
while read url; do
python3 linkfinder.py -i "$url" -o cli
done < live-js.txt | sort -u > endpoints.txt
xnLinkFinder
xnLinkFinder is a more capable take on the LinkFinder idea: it handles minified bundles, recursive crawls, and depth-based discovery.
xnLinkFinder -i live-js.txt -sf target.com -d 3 -o endpoints.txt
SecretFinder — secret detection in JS
python3 SecretFinder.py -i https://target.com/app.js -o cli
while read url; do
python3 SecretFinder.py -i "$url" -o cli 2>/dev/null
done < live-js.txt | tee secrets.txt
Manual grep patterns
Automation misses clever obfuscation. Fetch the JS and grep for the classics:
curl -s https://target.com/app.js | grep -oE \
"(api[_-]?key|apikey|secret|token|password|passwd|bearer|aws_access|private_key)[\"':= ]+[a-zA-Z0-9/+=_-]{16,}"
## Hardcoded IPs and internal hosts
curl -s https://target.com/app.js | grep -oE "(https?://)?[a-z0-9.-]*\.(internal|corp|local|intranet)"
## Feature flags
curl -s https://target.com/app.js | grep -oE '"[a-z_]*_(enabled|flag|debug)"'
## Route definitions
curl -s https://target.com/app.js | grep -oE '"/api/[a-zA-Z0-9/_-]+"'
JSLuice
JSLuice is a newer tool that parses JS with a proper AST rather than regex, catching dynamic route construction that regex-based tools miss.
cat live-js.txt | while read url; do
curl -s "$url" | jsluice urls
done | sort -u > jsluice-urls.txt
9. Content & Directory Discovery
Directory brute forcing finds the files that aren’t linked anywhere — backups, configs, admin pages, staging artifacts, .git directories, and the /old/ folder devs promised to delete.
ffuf — fuzzing swiss army knife
## Basic directory fuzz
ffuf -u https://target.com/FUZZ -w wordlist.txt -mc 200,301,302,403 -o ffuf.json
## With extensions
ffuf -u https://target.com/FUZZ -w raft-medium-words.txt \
-e .php,.bak,.old,.zip,.tar.gz,.env,.config \
-mc 200,301,302,403
## Virtual host fuzzing
ffuf -u https://target.com -H "Host: FUZZ.target.com" -w subs.txt -fs 1234
## Recursive
ffuf -u https://target.com/FUZZ -w wordlist.txt -recursion -recursion-depth 2
## Parameter fuzzing
ffuf -u https://target.com/api?FUZZ=test -w params.txt -fs 0
## Rate limited
ffuf -u https://target.com/FUZZ -w wordlist.txt -rate 50
feroxbuster
Recursive content discoverer with smart filtering — great for deep dives.
feroxbuster -u https://target.com -w wordlist.txt -x php,html,txt,bak -d 3
## From a list of targets
feroxbuster --stdin -w wordlist.txt < alive.txt
## Filter by response size
feroxbuster -u https://target.com -w raft-large.txt -S 0,1234
gobuster
gobuster dir -u https://target.com -w wordlist.txt -x php,html,txt
gobuster vhost -u https://target.com -w subs.txt
gobuster dns -d target.com -w dns-wordlist.txt
Discovery strategy
- Start small: run raft-small-words.txt before raft-large. Calibrates response baselines.
- Chain extensions: if you get hits at /backup, re-fuzz with backup-specific extensions.
- Recurse manually: only recurse into directories that return 200/301, not 403/404.
- Auto-calibrate: always use -ac in ffuf or filter by response length to dodge soft-404s.
- Match response words: -fw in ffuf filters by word count, often tighter than length.
10. Parameter Discovery
Hidden parameters are one of the highest-ROI recon findings — they unlock debug modes, admin toggles, SSRF sinks, and auth bypass.
Arjun
## Basic
arjun -u https://target.com/api/users
## With wordlist
arjun -u https://target.com/api/users -w params.txt
## From URL list
arjun -i urls.txt
## Methods
arjun -u https://target.com/api -m GET,POST
## Output
arjun -u https://target.com -oJ arjun.json
ParamSpider
python3 paramspider.py -d target.com -o params.txt
x8
x8 is a fast parameter miner written in Rust with a large built-in wordlist.
x8 -u https://target.com/api -w params.txt
High-signal parameter names to always test
debug, test, admin, internal, trace, verbose
callback, jsonp, returnUrl, redirect, next, url, uri, path, file
id, userId, user_id, account, tenant, org, orgId
template, view, include, load, src, source, dest, target, action, cmd
token, auth, api_key, apikey, access_token, sso
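Testing these names by hand across many endpoints gets tedious, so a tiny generator helps. This is a sketch only: with_params is a hypothetical helper, and the default canary value recon1337 is an arbitrary string chosen so reflections are easy to grep for.

```shell
# with_params URL [VALUE]: read parameter names on stdin and print URL once
# per name with "name=VALUE" appended (VALUE defaults to a canary string).
with_params() {
    url=$1
    value=${2:-recon1337}
    # use "&" when the URL already carries a query string
    case $url in
        *\?*) sep='&' ;;
        *)    sep='?' ;;
    esac
    while IFS= read -r name; do
        if [ -n "$name" ]; then
            printf '%s%s%s=%s\n' "$url" "$sep" "$name" "$value"
        fi
    done
}
```

With the names above saved one per line, `with_params https://target.com/api/users < names.txt | httpx -silent -sc -cl` gives a quick view of which parameters change the status code or response length.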
11. Technology Fingerprinting
Knowing the stack narrows your attack surface — CVEs, default paths, framework-specific tricks, and auth bypass tricks all depend on what’s running.
httpx tech detect
httpx -l alive.txt -td -server -title -sc -silent -json | jq '.tech'
whatweb
whatweb -a 3 https://target.com
whatweb -i alive.txt --log-json=whatweb.json
Wappalyzer CLI
wappalyzer https://target.com
Custom fingerprinting via headers/favicon
| Indicator | What it reveals |
|---|---|
| X-Powered-By | Framework (Express, PHP version, ASP.NET) |
| Server | Web server (nginx version, Apache, IIS) |
| Set-Cookie names | PHPSESSID, JSESSIONID, XSRF-TOKEN, laravel_session |
| X-Generator | CMS (Drupal, WordPress, TYPO3) |
| Favicon mmh3 hash | Pivot across Shodan to find every similar deploy |
| /robots.txt | Exposed paths, site generator hints |
| /sitemap.xml | Content structure |
| /security.txt | Bug bounty program contact |
| Error page fingerprints | Stack traces leak versions |
Shodan & Censys pivots
After fingerprinting, pivot via Shodan to find every host on the internet running the same application.
## Shodan by favicon hash
shodan search "http.favicon.hash:-1234567890"
## By HTTP title
shodan search 'http.title:"Jenkins"'
## By SSL cert subject
shodan search 'ssl.cert.subject.cn:target.com'
## Censys
censys search 'services.tls.certificates.leaf_data.subject.common_name: target.com'
12. Cloud Asset Discovery
Cloud storage misconfigurations remain one of the fastest paths to a critical-severity finding — exposed S3 buckets, readable Azure blobs, and world-writable GCS objects are still common.
S3 bucket discovery
## Generate permutations
cat <<EOF > bucket-perms.txt
target
target-dev
target-prod
target-staging
target-backup
target-backups
target-assets
target-media
target-uploads
target-logs
target-data
target-db
target-internal
target-private
target-public
target-test
target-qa
EOF
## Test each
while read b; do
aws s3 ls "s3://$b" --no-sign-request 2>&1 | grep -v NoSuchBucket | grep -v AllAccessDisabled
done < bucket-perms.txt
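Hand-writing the permutation file does not scale past one keyword. The sketch below generates the same kind of list for any number of keywords, joining each keyword to a suffix with a hyphen, a dot, and no separator. The suffix list mirrors the heredoc above; bucket_candidates is a hypothetical helper name.

```shell
# bucket_candidates KEYWORD...: print each KEYWORD alone plus KEYWORD joined
# to every environment/purpose suffix with "-", "." and no separator.
bucket_candidates() {
    for kw in "$@"; do
        printf '%s\n' "$kw"
        for suffix in dev prod staging backup backups assets media uploads \
                      logs data db internal private public test qa; do
            printf '%s-%s\n%s.%s\n%s%s\n' \
                "$kw" "$suffix" "$kw" "$suffix" "$kw" "$suffix"
        done
    done | LC_ALL=C sort -u
}
```

For example, `bucket_candidates target targetcorp > bucket-perms.txt` produces a file that drops straight into the test loop above.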
S3Scanner
s3scanner scan --bucket-file bucket-perms.txt
s3scanner scan --bucket target-backups --dump
cloud_enum
Covers S3, Azure, and GCS in one pass.
python3 cloud_enum.py -k target -k targetcorp -k target-internal
Azure blob storage
## Azure blob URL pattern
https://<storage-account>.blob.core.windows.net/<container>/<blob>
## Enumerate containers
curl -s "https://target.blob.core.windows.net/?comp=list" | xmllint --format -
## List blobs in a container
curl -s "https://target.blob.core.windows.net/container?restype=container&comp=list"
GCP bucket enumeration
## Public read check
curl -s "https://storage.googleapis.com/storage/v1/b/target-bucket" | jq .
## List objects
curl -s "https://storage.googleapis.com/storage/v1/b/target-bucket/o" | jq .
CloudFail (DNS/database-based origin discovery)
CloudFail pulls old DNS records and database leaks to bypass Cloudflare and find the real origin IP of a proxied site.
python3 cloudfail.py -t target.com
Bucket wordlists
- Assetnote wordlists: wordlists.assetnote.io (cloud-s3-bucket-names.txt)
- SecLists: Discovery/Cloud/
13. GitHub & Code Leak Hunting
GitHub is a minefield of hardcoded credentials. The org’s public repos are the obvious place, but the real gold is in employee personal repos and in deleted-but-not-purged commits.
GitHub dorking
org:target password
org:target api_key
org:target aws_access_key_id
org:target BEGIN RSA
org:target smtp
org:target "internal-api"
org:target filename:.env
org:target filename:config
org:target extension:sql
org:target extension:pem
GitGot (Bishop Fox)
Semi-automated, feedback-driven code search — suppresses already-reviewed hits so you focus on new matches.
gitgot -q target.com
gitgot -q "target api_key" -o gitgot-results.json
trufflehog — high-entropy secret scanning
## Scan a repo
trufflehog git https://github.com/target/repo
## Scan an org
trufflehog github --org=target --token=$GITHUB_TOKEN
## Verified secrets only (credentials confirmed live, so very few false positives)
trufflehog github --org=target --only-verified
gitleaks
gitleaks detect --source . --report-format json --report-path gitleaks.json
## gitleaks scans local paths, so clone remote repos first
git clone https://github.com/target/repo && gitleaks detect --source repo
github-subdomains
Scrapes GitHub code for mentions of subdomains — a passive subdomain source most hunters skip.
github-subdomains -d target.com -t $GITHUB_TOKEN -o gh-subs.txt
What to look for
- .env files (check for AWS, Stripe, Twilio, SendGrid keys)
- config.yml / config.json / settings.py with DB connection strings
- Dockerfile with ARG values hardcoding secrets
- CI/CD YAML files (GitHub Actions, CircleCI) with plaintext tokens
- Private keys in commit history
- References to internal hostnames (*.corp.target.com)
- Historical commits: secrets are often “fixed” in a later commit but still in history
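As a quick manual complement to the scanners, the history check can be sketched with grep. The pattern below looks only for strings shaped like AWS access key IDs (the prefix AKIA or ASIA followed by 16 uppercase letters or digits); find_key_ids is a hypothetical helper, and a match is a lead to verify, not proof of a live credential.

```shell
# find_key_ids: read text on stdin and print unique strings shaped like
# AWS access key IDs (AKIA/ASIA + 16 uppercase letters or digits).
find_key_ids() {
    grep -oE '(AKIA|ASIA)[A-Z0-9]{16}' | LC_ALL=C sort -u
}
```

Inside a cloned repository, `git log -p --all | find_key_ids` searches every commit on every branch, which covers the "fixed later but still in history" case.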
14. ASN & Infrastructure Expansion
Most hunters stop at the subdomain list. Elite hunters expand into IP space and find the forgotten servers that nobody maps to DNS anymore.
Find the ASN
## From a known IP
whois -h whois.cymru.com 192.0.2.10
## Via bgpview
curl -s "https://api.bgpview.io/ip/192.0.2.10" | jq .
## Interactive: bgp.he.net
Enumerate all IP ranges for an ASN
## bgpview
curl -s "https://api.bgpview.io/asn/AS13335/prefixes" \
| jq -r '.data.ipv4_prefixes[].prefix' > asn-ranges.txt
## amass intel
amass intel -asn 13335 > asn-hosts.txt
Probe everything in the range
## Port scan the range
sudo masscan -iL asn-ranges.txt -p80,443,8080,8443 --rate=10000 -oG masscan.out
## HTTP probe
awk '/open/{print $4}' masscan.out | httpx -silent -title -sc -ip
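The one-liner above keeps the IP but drops the port, so httpx falls back to its default ports. A sketch that preserves both is shown below. It assumes grepable lines containing "Host: IP ()" and "Ports: PORT/open/tcp//..." fields, which is my reading of the -oG layout rather than something this guide specifies; masscan_hostports is a hypothetical helper name.

```shell
# masscan_hostports: read grepable (-oG) scan output on stdin and print
# one "ip:port" line per open port.
masscan_hostports() {
    awk '
        /Ports:/ {
            ip = ""
            for (i = 1; i < NF; i++) {
                if ($i == "Host:") ip = $(i + 1)
                if ($i == "Ports:") {
                    # remaining fields look like 443/open/tcp//https//, possibly comma-separated
                    for (j = i + 1; j <= NF; j++) {
                        n = split($j, entry, ",")
                        for (k = 1; k <= n; k++) {
                            split(entry[k], part, "/")
                            if (part[2] == "open" && ip != "") print ip ":" part[1]
                        }
                    }
                    break
                }
            }
        }' | LC_ALL=C sort -u
}
```

Usage would then be `masscan_hostports < masscan.out | httpx -silent -title -sc -ip`, so services on 8080 and 8443 are probed on the port where they were actually found.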
Reverse DNS across the range
## PTR lookups across a CIDR
prips 192.0.2.0/24 | dnsx -ptr -resp-only
Why this works
Corporate infra expansion happens faster than DNS hygiene. You’ll regularly find:
- Acquired companies whose old infra still runs on the parent’s ASN
- Staging servers assigned IPs but never given DNS
- Legacy admin panels on forgotten boxes
- Dev environments behind no auth, exposed via direct IP
15. Container & Serverless Discovery
Modern cloud infrastructure heavily relies on containers and serverless functions, creating new attack surfaces that traditional reconnaissance misses. These services often expose APIs, internal naming conventions, and forgotten development environments.
Container registry enumeration
Container registries frequently expose public repositories containing internal applications and their configurations.
## Docker Hub public repositories
curl -s "https://hub.docker.com/v2/repositories/target/?page_size=100" | jq '.results[].name'
## Amazon ECR Public repositories (lists the registry of the credentials in use)
aws ecr-public describe-repositories --region us-east-1 --output table
## Google Container Registry (needs read access to the project)
gcloud container images list --repository=gcr.io/target-project
## Azure Container Registry (needs access to the registry)
az acr repository list --name targetregistry --output table
## Custom registry enumeration
curl -s https://registry.target.com/v2/_catalog | jq '.repositories'
Container image analysis
## Pull and analyze container images
docker pull target/app:latest
docker history target/app:latest
## Extract filesystem without running
docker create --name temp target/app:latest
docker export temp | tar -tv | grep -E "\.env|config|secret"
docker rm temp
## Dive - analyze image layers
dive target/app:latest
## Trivy - vulnerability and secret scanning
trivy image target/app:latest
Serverless function discovery
AWS Lambda functions, Azure Functions, and Google Cloud Functions often have predictable naming patterns and may be publicly accessible.
## AWS Lambda function enumeration (needs credentials for the target account)
aws lambda list-functions --region us-east-1 --output table
## Generate Lambda function name permutations
cat <<EOF > lambda-names.txt
target-api
target-webhook
target-auth
target-processor
target-dev
target-staging
target-prod
EOF
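## Lambda function URLs use a random ID (https://<url-id>.lambda-url.<region>.on.aws/),
## so harvest them from crawl output instead of guessing names
grep -oE "https://[a-z0-9]+\.lambda-url\.[a-z0-9-]+\.on\.aws" all-urls.txt | sort -u
## Name-based hosts can be guessed: Azure Function Apps live at <name>.azurewebsites.net
while read name; do
curl -s -o /dev/null -w "%{http_code} $name\n" "https://$name.azurewebsites.net/"
done < lambda-names.txt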
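## Azure Function Apps (needs a login to the target subscription)
az functionapp list --output table
## Google Cloud Functions (needs access to the target project)
gcloud functions list --regions=us-central1
## API Gateway discovery for serverless backends (needs credentials for the target account)
aws apigateway get-rest-apis --region us-east-1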
Infrastructure-as-Code analysis
## GitHub repository search for IaC files
gh search code --owner=target "filename:terraform" OR "filename:cloudformation"
gh search code --owner=target "resource \"aws_" OR "resource \"google_" OR "resource \"azurerm_"
## Extract resource names from Terraform
grep -r "resource \"" . | grep -E "(aws_|google_|azurerm_)" | cut -d'"' -f4
## CloudFormation template analysis (needs credentials for the target account)
aws cloudformation list-stacks --stack-status-filter CREATE_COMPLETE UPDATE_COMPLETE
## Kubernetes manifest discovery (needs a kubeconfig for the target cluster)
kubectl get all --all-namespaces
kubectl get ingress --all-namespaces
kubectl describe configmaps --all-namespaces | grep -A5 -B5 "target"
16. Modern API Reconnaissance
API discovery has evolved beyond simple REST endpoints. Modern applications use GraphQL, gRPC, WebSockets, and complex API gateways that require specialized reconnaissance techniques.
GraphQL introspection
GraphQL introspection is still left enabled on many production endpoints, and where it is, it provides complete schema visibility.
## Basic introspection query
curl -X POST https://target.com/graphql \
-H "Content-Type: application/json" \
-d '{"query": "{ __schema { types { name description } } }"}'
## Get all queries and mutations
curl -X POST https://target.com/graphql \
-H "Content-Type: application/json" \
-d '{"query": "{ __schema { queryType { fields { name description args { name type { name } } } } } }"}'
## GraphQL Voyager for schema visualization
## Visit: https://ivangoncharov.github.io/graphql-voyager/
## Or use GraphiQL: https://github.com/graphql/graphiql
## Batch introspection with graphql-introspect
npm install -g graphql-introspect
graphql-introspect https://target.com/graphql > schema.json
## Common GraphQL endpoints
/graphql
/graphiql
/api/graphql
/v1/graphql
/api/v1/graphql
OpenAPI/Swagger discovery
## Common Swagger/OpenAPI paths
curl -s https://target.com/swagger.json
curl -s https://target.com/api-docs
curl -s https://target.com/openapi.json
curl -s https://target.com/docs/swagger.json
curl -s https://target.com/api/v1/swagger.json
curl -s https://target.com/swagger-ui/
curl -s https://target.com/redoc/
## Extract endpoints from OpenAPI spec
curl -s https://target.com/openapi.json | jq '.paths | keys[]'
## Swagger Codegen to generate client libraries
swagger-codegen generate -i https://target.com/swagger.json -l html2 -o swagger-docs
## openapi-directory search for known APIs
git clone https://github.com/APIs-guru/openapi-directory
grep -r "target.com" openapi-directory/
gRPC service discovery
## gRPC reflection (if enabled)
grpcurl -plaintext target.com:50051 list
grpcurl -plaintext target.com:50051 list target.Service
grpcurl -plaintext target.com:50051 describe target.Service.Method
## gRPC-Web detection
curl -s https://target.com/api/grpc -H "Content-Type: application/grpc-web-text"
## Extract protobuf definitions
protoc --decode_raw < binary_message.bin
## Common gRPC ports
50051, 9090, 8080, 443 (gRPC-Web)
WebSocket discovery
## WebSocket endpoint discovery (the URL is an argument; stdin is the message to send)
echo "ping" | websocat -n1 ws://target.com:8080/ws
echo "ping" | websocat -n1 wss://target.com/websocket
## Common WebSocket paths
/ws
/websocket
/socket.io/
/sockjs-node/
/live
/updates
/notifications
## Extract WebSocket endpoints from JavaScript
grep -r "WebSocket\|socket\.io" js-files/ | grep -oE "wss?://[^\"']+"
17. ML-Powered Automation
Machine learning and AI are transforming reconnaissance by identifying patterns, generating intelligent wordlists, and reducing false positives. Modern reconnaissance leverages these capabilities for enhanced discovery.
ML-driven subdomain generation
## Pattern-based subdomain generation using neural networks
## This represents the concept - actual implementation varies
python3 ml_subdomain_generator.py --domain target.com --patterns discovered_subdomains.txt
## GPT-based subdomain suggestions (conceptual)
## Extract patterns from known subdomains and generate likely candidates
cat known_subs.txt | python3 gpt_subdomain_suggester.py --model gpt-4
## Anomaly detection for interesting subdomains
python3 subdomain_anomaly_detector.py --input all_subs.txt --threshold 0.8
AI-powered vulnerability detection
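## Nuclei automatic scan: selects templates from the technologies it detects
nuclei -l targets.txt -as -severity low,medium,high,critical
## ML-based false-positive reduction (conceptual: nuclei has no built-in ML filter, so score its JSONL output with your own model; fp_filter.py is a placeholder)
nuclei -l targets.txt -t custom-templates/ -jsonl | python3 fp_filter.py --confidence 0.7
## AI-generated templates from a natural-language prompt (recent nuclei releases, needs a ProjectDiscovery API key)
nuclei -l targets.txt -ai "find exposed debug endpoints"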
Continuous intelligence gathering
## Real-time certificate transparency monitoring
certstream-monitor --domain target.com --webhook https://your-webhook.com/ct-alerts
## Automated asset correlation and risk scoring
python3 asset_correlator.py --domain target.com --risk-threshold high
## ML-based threat intelligence integration
python3 threat_intel_correlator.py --assets assets.json --sources virustotal,shodan,censys
Distributed reconnaissance with cloud workers
## Axiom - distributed reconnaissance platform
axiom-scan targets.txt -m subfinder -o subfinder-results
axiom-scan live-targets.txt -m httpx --screenshot -o httpx-results
## Custom cloud worker deployment
## Deploy reconnaissance workers across multiple regions for speed and stealth
terraform apply -var='regions=["us-east-1","eu-west-1","ap-southeast-1"]'
## Distributed port scanning (masscan-distributed is a placeholder name; axiom's masscan module is one real option)
axiom-scan targets.txt -m masscan -p1-65535 -o masscan-results
Advanced source map analysis
By some estimates, around 40% of modern applications ship source maps to production, exposing unminified code that reveals internal structure and API endpoints.
## Source map discovery
find . -name "*.js.map"
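curl -s https://target.com/static/js/main.js.map -o main.js.map
## Source map extraction and analysis (sourcemapper from denandz/sourcemapper is used here; any unpacker that writes out sourcesContent works)
sourcemapper -url https://target.com/static/js/main.js.map -output original-sources/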
## Extract API endpoints from unminified code
grep -r "api\/" original-sources/ | grep -oE "api/[a-zA-Z0-9/_-]+"
## Progressive Web App manifest analysis
curl -s https://target.com/manifest.json | jq '.'
## Service worker discovery and analysis
curl -s https://target.com/sw.js | grep -oE "fetch\('[^']+'"
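The source map extraction step above works because a version 3 source map is plain JSON: `sources` lists the original file names and the optional `sourcesContent` array holds their unminified text in the same order. A minimal sketch that pulls API-looking paths straight from a downloaded map, without an external unpacker, follows. The function name and the `/api/` pattern are assumptions; adjust the regex to the target's routing scheme.

```python
import json
import re

# Paths that look like API routes, e.g. /api/v2/internal/users.
API_PATH_RE = re.compile(r"/api/[A-Za-z0-9/_-]+")


def endpoints_from_source_map(raw_map: str) -> dict[str, list[str]]:
    """Map each original source file to the API-like paths found in its content."""
    data = json.loads(raw_map)
    sources = data.get("sources", [])
    contents = data.get("sourcesContent") or []
    found = {}
    for name, content in zip(sources, contents):
        if not content:  # sourcesContent entries may be null
            continue
        paths = sorted(set(API_PATH_RE.findall(content)))
        if paths:
            found[name] = paths
    return found
```

If `sourcesContent` is absent, the map only gives you file names, which is still useful for guessing the project layout.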
18. Wordlist Resources
Wordlists are the difference between finding /admin and finding /admin-backup-2019.zip. Use multiple, rotate them, and grow your own.
Core wordlist repos
| Repo | What it contains |
|---|---|
| SecLists (danielmiessler/SecLists) | The canonical source: subdomains, content, params, fuzzing, passwords, payloads |
| Assetnote wordlists (wordlists.assetnote.io) | Data-mined from Common Crawl, far higher signal than generic lists |
| OneListForAll | Merged, deduped megalist |
| fuzzdb | Legacy but still has unique patterns |
| Jhaddix all.txt | Classic subdomain brute list |
Key SecLists files
Discovery/DNS/
  subdomains-top1million-5000.txt
  subdomains-top1million-110000.txt
  dns-Jhaddix.txt
  bitquark-subdomains-top100000.txt
Discovery/Web-Content/
  raft-small-words.txt
  raft-medium-words.txt
  raft-large-words.txt
  common.txt
  big.txt
  directory-list-2.3-medium.txt
  api/api-endpoints.txt
Discovery/Web-Content/CMS/
  wordpress.fuzz.txt
  joomla.fuzz.txt
  drupal.fuzz.txt
Fuzzing/
  LFI/
  XSS/
  SQLi/
Passwords/
  Common-Credentials/
Assetnote wordlist highlights
best-dns-wordlist.txt # 9M real subdomains, mined from CT/CC
httparchive_directories_*.txt # Directories seen in HTTP Archive
httparchive_files_*.txt # Files seen in HTTP Archive
cloud-s3-bucket-names.txt # Real S3 bucket names
parameters_top_1M.txt # Real HTTP parameters
Building your own
After a few engagements, the best wordlist is your own cumulative corpus:
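## Rank every path segment you've ever crawled by frequency, then drop the counts so the file is usable as a wordlist
cat recon/*/urls.txt | unfurl paths | awk -F/ '{for(i=2;i<=NF;i++) if ($i != "") print $i}' \
| sort | uniq -c | sort -rn | awk '{print $2}' > my-dirs.txt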
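The same corpus-building idea, sketched in Python for cases where the URLs live in a database or JSON export rather than flat files. The function name `build_wordlist` is hypothetical; the logic is simply "split every path into segments and rank them by frequency".

```python
from collections import Counter
from urllib.parse import urlsplit


def build_wordlist(urls: list[str]) -> list[str]:
    """Return path segments ordered by how often they occur, most frequent first."""
    counts = Counter()
    for url in urls:
        path = urlsplit(url.strip()).path
        # Empty segments come from leading, trailing, or doubled slashes.
        counts.update(seg for seg in path.split("/") if seg)
    # Sort by descending count, then alphabetically so the output is stable.
    return [seg for seg, _ in sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))]
```

Ordering by frequency means a truncated list (for example the first 5,000 lines) still covers the segments most likely to exist on a related target.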
19. Automation Pipelines
At some point you stop running commands and start running pipelines. Recon frameworks chain every tool above into a single invocation and output a structured asset inventory.
bbot (Black Box Operations Tool)
bbot is the most capable modern recon framework — 80+ modules, event-driven, produces graph output.
## Full subdomain enum
bbot -t target.com -f subdomain-enum
## Web spider + HTTP probe + tech detect
bbot -t target.com -f web-basic
## Everything (expensive)
bbot -t target.com -f subdomain-enum,web-basic,cloud-enum -o bbot-out/
recon-ng
Modular Metasploit-style recon framework with a marketplace of modules.
recon-ng
> marketplace install all
> workspaces create target
> modules load recon/domains-hosts/hackertarget
> options set SOURCE target.com
> run
ReconFTW
All-in-one bash pipeline that wraps subfinder, amass, httpx, nuclei, ffuf, and dozens more.
./reconftw.sh -d target.com -r # recon only
./reconftw.sh -d target.com -a # full scan
GarudRecon / subdomainx / Striker / ReconDog
Smaller curated wrappers around the ProjectDiscovery stack — useful as reference implementations when building your own.
Custom bash pipeline (minimal example)
#!/usr/bin/env bash
set -euo pipefail
TARGET="${1:?usage: $0 <domain>}"
OUT=recon/$TARGET
mkdir -p "$OUT"
## 1. Subdomains (|| true keeps one failing source from aborting the run under set -e)
{
subfinder -d "$TARGET" -all -silent || true
assetfinder --subs-only "$TARGET" || true
chaos -d "$TARGET" -silent || true
curl -s "https://crt.sh/?q=%25.$TARGET&output=json" \
| jq -r '.[].name_value' 2>/dev/null | sed 's/\*\.//g' || true
} | sort -u > "$OUT/subs.txt"
## 2. Resolve
dnsx -l "$OUT/subs.txt" -silent > "$OUT/resolved.txt"
## 3. HTTP probe
httpx -l "$OUT/resolved.txt" -silent -title -sc -td -ip -json \
> "$OUT/httpx.json"
jq -r '.url' "$OUT/httpx.json" | sort -u > "$OUT/alive.txt"
## 4. Port scan (naabu takes hostnames, so feed it resolved.txt rather than the URLs in alive.txt)
naabu -l "$OUT/resolved.txt" -top-ports 1000 -silent > "$OUT/ports.txt"
## 5. Crawl
{
katana -list "$OUT/alive.txt" -silent -d 3 || true
waybackurls < "$OUT/subs.txt" || true
gau --threads 5 < "$OUT/subs.txt" || true
} | sort -u > "$OUT/urls.txt"
## 6. JS files (grep exits 1 on no match, which would abort the script under pipefail)
{ grep -Ei "\.js(\?|$)" "$OUT/urls.txt" || true; } | httpx -mc 200 -silent > "$OUT/js.txt"
## 7. gf classification
for pattern in ssrf xss sqli redirect lfi idor; do
gf "$pattern" < "$OUT/urls.txt" > "$OUT/gf-$pattern.txt" || true
done
## 8. Nuclei scan
nuclei -l "$OUT/alive.txt" -severity low,medium,high,critical -silent \
-o "$OUT/nuclei.txt"
echo "Done. Results in $OUT/"
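Step 1 of the pipeline leans on jq and sed to clean crt.sh output. The sketch below shows the same cleanup in Python, which makes the edge cases explicit: a single `name_value` field can hold several newline-separated names, wildcard prefixes need stripping, and certificates sometimes list names outside the target domain. The function name `subdomains_from_crtsh` is hypothetical, and the input is assumed to be the JSON array crt.sh returns with output=json.

```python
import json


def subdomains_from_crtsh(raw_json: str, domain: str) -> list[str]:
    """Return sorted, de-duplicated hostnames under `domain` from crt.sh JSON output."""
    domain = domain.lower()
    suffix = "." + domain
    names = set()
    for entry in json.loads(raw_json):
        # name_value can hold several newline-separated names per certificate.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # drop the wildcard label
            if name == domain or name.endswith(suffix):
                names.add(name)
    return sorted(names)
```

Filtering on the domain suffix is a design choice the shell version skips; it keeps unrelated SAN entries out of subs.txt.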
Pipeline principles
- Idempotent: rerunning should update, not duplicate
- Resumable: each step writes a file that the next step reads
- Rate-limited: respect program rules by default
- Diff-friendly: always sort -u output so you can diff across runs
- Resource-capped: run expensive tools in parallel with xargs -P or GNU parallel, but cap concurrency
- Logged: capture stdout and stderr per-tool for later debugging
20. Continuous Monitoring
Recon isn’t a one-shot operation. New subdomains, new endpoints, and new open ports appear daily. The hunters who earn the most set up continuous recon and get alerted when something new shows up.
Nightly diff pattern
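#!/usr/bin/env bash
TARGET=$1
TODAY=$(date +%F)
OUT=recon/$TARGET/$TODAY
mkdir -p "$OUT"
## Most recent earlier run (date-named directories sort chronologically)
PREV=$(ls -1 "recon/$TARGET" | grep -E '^[0-9]{4}-[0-9]{2}-[0-9]{2}$' | grep -v "^$TODAY$" | tail -1)
subfinder -d "$TARGET" -all -silent | sort -u > "$OUT/subs.txt"
if [[ -n "$PREV" && -f "recon/$TARGET/$PREV/subs.txt" ]]; then
comm -13 "recon/$TARGET/$PREV/subs.txt" "$OUT/subs.txt" > "$OUT/new.txt"
if [[ -s "$OUT/new.txt" ]]; then
## Slack incoming webhooks expect a JSON body with a "text" field
jq -n --arg text "New subs for $TARGET: $(cat "$OUT/new.txt")" '{text: $text}' \
| curl -s -X POST -H 'Content-Type: application/json' -d @- "$SLACK_WEBHOOK"
fi
fi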
Run via cron or GitHub Actions on a schedule (hourly for active programs, daily for slow-moving targets).
What to monitor
| Signal | Action |
|---|---|
| New subdomain | Immediate triage — often a fresh deploy with bugs |
| New open port on known host | Service enumeration, version check |
| New JS file or bundle hash change | Re-run LinkFinder, diff endpoint list |
| New nuclei finding | Triage the specific template |
| Cert transparency alert (Certstream-based live feed) | Resolve and probe the new hostname |
| GitHub new public repo in org | Scan for secrets |
| DNS CNAME change | Takeover check |
| HTTP response hash change on login/admin pages | Manual review |
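The last row of the table, response hash change detection, can be sketched in a few lines. This is a minimal illustration assuming you already fetch the page bodies elsewhere; `body_hash` and `changed_pages` are hypothetical names, and SHA-256 is an arbitrary choice of digest.

```python
import hashlib


def body_hash(body: bytes) -> str:
    """SHA-256 hex digest of a response body."""
    return hashlib.sha256(body).hexdigest()


def changed_pages(previous: dict[str, str], current_bodies: dict[str, bytes]) -> list[str]:
    """Return URLs that are new or whose body hash differs from the stored one."""
    changed = []
    for url, body in sorted(current_bodies.items()):
        if previous.get(url) != body_hash(body):
            changed.append(url)
    return changed
```

Pages that embed per-request values (CSRF tokens, timestamps) will change on every fetch, so in practice you would likely strip those before hashing.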
Tools for continuous monitoring
- axiom — distribute recon across cloud workers
- interlace — parallelize any CLI tool across a target list
- notify (ProjectDiscovery) — multi-channel output for any pipeline
- certstream — real-time CT log feed
- GitHub webhooks — push notifications on new repos/commits
Notify example
subfinder -d target.com -all -silent \
| anew subs.txt \
| notify -silent -bulk -provider slack
anew only emits newly seen lines, so notify only fires on genuinely new subdomains.
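To make that behaviour concrete, here is a rough Python equivalent of what the pipeline relies on: append unseen lines to a file and return only those. It is a sketch of the semantics described above, not a port of the tool; the function name mirrors the CLI for readability, and trimming whitespace is an assumption of this sketch.

```python
from pathlib import Path


def anew(path: Path, lines: list[str]) -> list[str]:
    """Append lines not already in the file and return only those new lines."""
    existing = set(path.read_text().splitlines()) if path.exists() else set()
    fresh = []
    for line in lines:
        line = line.strip()
        if line and line not in existing:
            existing.add(line)  # also de-duplicates within this batch
            fresh.append(line)
    if fresh:
        with path.open("a") as fh:
            fh.writelines(f"{line}\n" for line in fresh)
    return fresh
```

Because the file only ever grows, a subdomain that disappears and later returns will not alert a second time; whether that is desirable depends on the program.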
21. Real-World Recon Wins
Actual case studies drawn from public writeups that hinged on recon quality, not exploit cleverness.
JavaScript endpoint → $25K
A researcher grepped an obfuscated webpack bundle and found a reference to /api/v2/internal/users. The endpoint was reachable without authentication and returned the full user database. Total time to bug: 45 minutes. The app had passed a third-party pentest six months earlier.
Lesson: always pull the source maps (.js.map) when present — they reverse the minification and hand you the original file structure.
ASN expansion → SQL injection bounty
Hunter started with 50 in-scope subdomains. Pulled the company ASN, enumerated IP ranges via bgpview, and probed with httpx. Found 500+ live hosts including a forgotten admin panel on an unlisted IP. The panel was vulnerable to classic SQL injection. Critical-severity payout.
Lesson: scope says “*.target.com” but the program owner owns entire IP blocks — check the program rules, many allow any asset owned by the company.
S3 bucket → 2M user PII leak
Generated bucket name permutations (company-backup, company-backups, company-backup-prod) and tested each with aws s3 ls --no-sign-request. company-backup-prod returned a directory listing. It contained a full user database dump in SQL format. $50K critical bounty.
Lesson: bucket name permutation is low-effort, high-reward. Always run it as part of initial recon.
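A minimal sketch of the permutation step, assuming a company slug plus small word and environment lists of your own choosing (all three inputs and the function name `bucket_candidates` are hypothetical). Each resulting name would then be tested with the unauthenticated listing command mentioned above.

```python
from itertools import product


def bucket_candidates(company: str, words: list[str], envs: list[str]) -> list[str]:
    """Generate candidate bucket names such as company-backup and company-backup-prod."""
    company = company.lower()
    names = set()
    for word in words:
        names.add(f"{company}-{word}".lower())
        names.add(f"{word}-{company}".lower())
    for word, env in product(words, envs):
        names.add(f"{company}-{word}-{env}".lower())
    # S3 bucket names must be between 3 and 63 characters long.
    return sorted(n for n in names if 3 <= len(n) <= 63)
```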
Wayback Machine → auth bypass
Old Wayback snapshot from 2018 showed a /debug/users?bypass=1 endpoint. The endpoint was removed from the current site but the route handler was still mounted. Hitting it directly returned the admin UI. Critical severity.
Lesson: routes outlive the UI that references them. waybackurls + httpx on every historical URL is cheap and frequently pays.
Favicon hash pivot → exposed Jenkins
Researcher grabbed the favicon hash of the target’s build system, queried Shodan for the same hash, and found an additional Jenkins instance on a random IP outside the scope’s DNS. The instance had an anonymous build execution bug because it had never been upgraded. RCE, $20K.
Lesson: favicon pivoting finds assets that DNS never advertised.
GitHub commit history → AWS keys
Secret scanning in a current repo showed nothing — but git log --all on the repo history showed a commit from 2021 where a dev accidentally committed .env and “deleted” it the next day. The keys still worked. Full AWS account takeover. Max-severity report.
Lesson: current-state scanning misses historical secrets. Always scan the full git history.
Subdomain permutation → internal admin
admin.target.com was in scope. Permutation with dnsgen generated admin-legacy.target.com which resolved. It was the pre-migration admin panel, still running, still authenticating against the old LDAP, with a test account nobody had deleted. Full admin. $30K.
Lesson: dev-, -old, -legacy, -v1, -staging, -internal permutations consistently find forgotten infra.
2026 Modern Infrastructure Wins
The following discoveries highlight how modern reconnaissance techniques uncover new attack surfaces in cloud-native environments.
GraphQL introspection → admin privilege escalation
Researcher discovered a GraphQL endpoint through API documentation fuzzing at /api/graphql. Introspection queries revealed a promoteToAdmin mutation that was not mentioned in the public documentation. The mutation lacked proper authorization checks. $40K critical bounty.
Lesson: GraphQL introspection often reveals administrative mutations hidden from public documentation. Always test discovered mutations with low-privilege accounts.
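A minimal sketch of the introspection step: build a query that asks only for the mutation type's field names, then pull those names out of the response. The HTTP request itself is left out, so this works with whatever client you use; the function names are hypothetical, and a server with introspection disabled simply yields an empty list.

```python
import json

# Minimal introspection query: ask only for the mutation type's field names.
INTROSPECTION_QUERY = "{ __schema { mutationType { fields { name } } } }"


def introspection_payload() -> str:
    """JSON body to POST to a GraphQL endpoint."""
    return json.dumps({"query": INTROSPECTION_QUERY})


def mutation_names(response_body: str) -> list[str]:
    """Extract mutation names from an introspection response, or [] if unavailable."""
    data = json.loads(response_body).get("data") or {}
    mutation_type = (data.get("__schema") or {}).get("mutationType") or {}
    return sorted(field["name"] for field in mutation_type.get("fields") or [])
```

Diffing the returned names against the public documentation is what surfaces candidates like the one in this writeup.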
Container registry → source code exposure
Docker Hub enumeration revealed public repositories under the company’s organization containing development images. One image included the entire application source code with hardcoded database credentials and API keys. Critical data exposure bounty.
Lesson: Container registries are often overlooked but frequently contain sensitive development artifacts. Always enumerate public repositories and analyze image layers.
Source map leak → $35K API discovery
Application used webpack source maps in production. Downloading the .js.map file revealed unminified code containing 200+ internal API endpoints not discoverable through traditional crawling. Several endpoints had IDOR vulnerabilities. Total payout: $35K across multiple findings.
Lesson: Source maps are goldmines for API discovery. They reveal the complete application structure developers never intended to expose.
Serverless function enumeration → RCE
Lambda function name generation based on discovered patterns (company-api-{env}-{service}) led to discovering an unauthenticated function processing user uploads. The function was vulnerable to command injection through filename parsing. $60K RCE bounty.
Lesson: Serverless functions often follow predictable naming patterns. Generate permutations based on discovered functions and organization naming conventions.
Certificate transparency monitoring → zero-day infrastructure
CT log monitoring detected a new subdomain beta-api-v3.target.com hours after certificate issuance. The endpoint was running a beta version with debug mode enabled, exposing stack traces and internal paths. Multiple vulnerabilities found before public launch. $25K total.
Lesson: Real-time CT monitoring provides early access to new infrastructure. Beta and staging environments often have weaker security controls.
ML-powered subdomain generation → forgotten acquisition
Machine learning model trained on the company’s subdomain patterns generated legacy-oldcompany.target.com based on acquisition history. The subdomain resolved to a forgotten server from a 2019 acquisition with default credentials still active. Full server compromise.
Lesson: ML-based generation can discover human-missed patterns, especially around acquisitions and legacy infrastructure.
22. Quick Reference
Install the core stack (Go tools)
## ProjectDiscovery toolkit
go install github.com/projectdiscovery/subfinder/v2/cmd/subfinder@latest
go install github.com/projectdiscovery/httpx/cmd/httpx@latest
go install github.com/projectdiscovery/dnsx/cmd/dnsx@latest
go install github.com/projectdiscovery/naabu/v2/cmd/naabu@latest
go install github.com/projectdiscovery/katana/cmd/katana@latest
go install github.com/projectdiscovery/nuclei/v3/cmd/nuclei@latest
go install github.com/projectdiscovery/chaos-client/cmd/chaos@latest
go install github.com/projectdiscovery/notify/cmd/notify@latest
## Other Go tools
go install github.com/tomnomnom/assetfinder@latest
go install github.com/tomnomnom/waybackurls@latest
go install github.com/tomnomnom/anew@latest
go install github.com/tomnomnom/gf@latest
go install github.com/tomnomnom/unfurl@latest
go install github.com/hakluke/hakrawler@latest
go install github.com/lc/gau/v2/cmd/gau@latest
go install github.com/ffuf/ffuf/v2@latest
go install github.com/OJ/gobuster/v3@latest
## Rust / Python
cargo install feroxbuster
pip install arjun xnLinkFinder
One-liner pipelines
## Subs → alive → screenshots
subfinder -d target.com -all -silent \
| dnsx -silent \
| httpx -silent -screenshot
## Subs → crawl → JS → endpoints
subfinder -d target.com -all -silent \
| httpx -silent \
| katana -silent -d 3 \
| grep -E "\.js(\?|$)" \
| xargs -I{} python3 linkfinder.py -i {} -o cli
## Subs → params → SSRF candidates
subfinder -d target.com -all -silent \
| waybackurls \
| gf ssrf > ssrf-candidates.txt
## Every URL ever crawled on every sub
subfinder -d target.com -silent \
| gau --threads 10 --blacklist png,jpg,gif,css,woff \
| sort -u > urls.txt
## Modern 2026 one-liners
## Container registry enumeration
curl -s "https://hub.docker.com/v2/repositories/target/?page_size=100" \
| jq -r '.results[].name' | head -20
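## GraphQL introspection discovery (alive.txt holds live base URLs; httpx reads targets, not request bodies, from stdin)
httpx -l alive.txt -silent -path /graphql -x POST \
-H "Content-Type: application/json" \
-body '{"query": "{ __schema { types { name } } }"}' \
-mc 200 -ms "__schema"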
## Source map hunting
grep -E "\.js$" urls.txt \
| sed 's/\.js$/.js.map/' \
| httpx -silent -mc 200
## Serverless function name generation
## Lambda function URLs use random IDs, so this probes name-addressable Azure Function apps instead
for base in target-api target-webhook target-auth; do
for env in dev staging prod; do echo "$base-$env.azurewebsites.net"; done
done | httpx -silent
Tool family tally
| Family | Canonical tool(s) |
|---|---|
| Subdomain passive | subfinder, amass, assetfinder, chaos, crt.sh |
| Subdomain brute | puredns, shuffledns |
| Permutation | dnsgen, altdns, gotator |
| Resolution | dnsx, massdns |
| HTTP probe | httpx |
| Port scan | naabu, masscan, nmap, rustscan |
| Crawling | katana, hakrawler, gospider |
| Archive | waybackurls, gau |
| JS analysis | LinkFinder, xnLinkFinder, SecretFinder, jsluice |
| Content discovery | ffuf, feroxbuster, gobuster, dirsearch |
| Parameters | arjun, paramspider, x8 |
| Fingerprinting | whatweb, wappalyzer, httpx -td |
| Cloud | cloud_enum, s3scanner, CloudFail |
| Container/Serverless | crane, skopeo, dive, trivy |
| API Discovery | grpcurl, graphql-introspect, swagger-codegen |
| GitHub | trufflehog, gitleaks, gitgot, github-subdomains |
| Vuln scan | nuclei |
| Orchestration | bbot, reconftw, recon-ng |
| Notification | notify, anew |
Recon checklist (per engagement)
[ ] Scope captured in a file
[ ] Seed domains expanded via reverse whois / ASN
[ ] Passive subdomain enum run (subfinder, amass, chaos, crt.sh)
[ ] Active subdomain enum run (puredns brute + permutation)
[ ] All subdomains resolved via dnsx
[ ] All live hosts probed via httpx with -td
[ ] Screenshots captured for visual triage
[ ] Port scan run (naabu top-1000 minimum)
[ ] Crawl complete (katana + waybackurls + gau)
[ ] JS files extracted and mined with LinkFinder
[ ] Source maps (.js.map) discovered and analyzed
[ ] Secrets scan run on JS and source maps
[ ] Parameter discovery run (arjun) on high-value endpoints
[ ] gf patterns applied to URL corpus
[ ] Cloud buckets permuted and tested
[ ] Container registries enumerated (Docker Hub, ECR, GCR, ACR)
[ ] GraphQL introspection attempted on discovered endpoints
[ ] API documentation discovered (Swagger/OpenAPI, GraphQL)
[ ] Serverless function enumeration (Lambda, Azure Functions, Cloud Functions)
[ ] GitHub org scanned with trufflehog
[ ] Infrastructure-as-Code repositories analyzed
[ ] Nuclei baseline scan run and findings triaged for false positives
[ ] Certificate transparency monitoring configured
[ ] Nightly diff pipeline set up
[ ] All raw outputs archived for future re-use
Closing Notes
Recon in 2026 compounds exponentially through automation and machine learning. Every engagement adds to your corpus — subdomains, parameters, API endpoints, container images, serverless functions, and cloud configurations. The elite hunters leverage ML for pattern recognition, continuous monitoring for real-time discovery, and cloud-scale automation for comprehensive coverage.
Modern attack surfaces span traditional web applications, cloud-native infrastructure, container registries, serverless functions, GraphQL APIs, and progressive web applications. Reconnaissance techniques that worked in 2020 miss much of today’s cloud-native infrastructure.
Start building your automated pipeline now. Configure certificate transparency monitoring. Train ML models on your discoveries. Set up distributed reconnaissance across cloud regions. Treat every new asset — whether it’s a subdomain, container image, or serverless function — as a fresh attack surface with its own unique vulnerabilities.
The bug isn’t in the tool you ran. It’s in the cloud service you didn’t enumerate, the source map you didn’t download, or the GraphQL endpoint you didn’t introspect.