Secure Coding Deep Dive — Multi-Language Reference
CIPHER Training Module | Sources: OWASP Cheat Sheet Series, rust-secure-code/safety-dance, oss-fuzz | Last updated: 2026-03-14
Table of Contents
- SQL Injection Prevention
- Cross-Site Scripting (XSS) Prevention
- OS Command Injection Defense
- Deserialization Safeguards
- Server-Side Request Forgery (SSRF) Prevention
- File Upload Security
- XML External Entity (XXE) Prevention
- Server-Side Template Injection (SSTI)
- Mass Assignment Prevention
- Unvalidated Redirects
- Error Handling & Secure Logging
- Input Validation Fundamentals
- Memory Safety — C/C++ and Rust
- Fuzzing Infrastructure
1. SQL Injection Prevention
Core Principle
The database must always distinguish between code and data. Parameterized queries enforce this separation at the driver level — user input can never alter query structure.
Parameterized Query Examples by Language
Java (JDBC)
String custname = request.getParameter("customerName");
String query = "SELECT account_balance FROM user_data WHERE user_name = ?";
PreparedStatement pstmt = connection.prepareStatement(query);
pstmt.setString(1, custname);
ResultSet results = pstmt.executeQuery();
Java (Hibernate HQL)
Query safeHQLQuery = session.createQuery(
"from Inventory where productID = :productid");
safeHQLQuery.setParameter("productid", userSuppliedParameter);
Java (Stored Procedures)
String custname = request.getParameter("customerName");
CallableStatement cs = connection.prepareCall("{call sp_getAccountBalance(?)}");
cs.setString(1, custname);
ResultSet results = cs.executeQuery();
Python (DB-API 2.0 — sqlite3/psycopg2/mysql-connector)
cursor.execute(
"SELECT account_balance FROM user_data WHERE user_name = %s",
(custname,)
)
For sqlite3:
cursor.execute(
"SELECT account_balance FROM user_data WHERE user_name = ?",
(custname,)
)
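To see why binding matters, the following minimal sketch runs the sqlite3 form above against an in-memory database. The table contents and the `find_balance` helper are illustrative assumptions, not part of any referenced codebase.

```python
import sqlite3


def find_balance(conn: sqlite3.Connection, user_name: str) -> list:
    """Look up a balance with the value bound as a parameter, never concatenated."""
    cur = conn.execute(
        "SELECT account_balance FROM user_data WHERE user_name = ?",
        (user_name,),
    )
    return cur.fetchall()


# Throwaway in-memory database with two illustrative rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE user_data (user_name TEXT, account_balance INTEGER)")
conn.executemany(
    "INSERT INTO user_data VALUES (?, ?)",
    [("alice", 100), ("bob", 250)],
)

normal = find_balance(conn, "alice")
# A classic injection payload is compared as a literal user name and matches nothing.
attack = find_balance(conn, "' OR '1'='1")
```

Because the payload never becomes part of the SQL text, the second call returns no rows instead of every row.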
Python (SQLAlchemy Core, text() with bound parameters)
from sqlalchemy import text
stmt = text("SELECT * FROM users WHERE name = :name")
result = conn.execute(stmt, {"name": user_input})
C# / .NET
string sql = "SELECT * FROM Customers WHERE CustomerId = @CustomerId";
SqlCommand command = new SqlCommand(sql);
command.Parameters.Add(new SqlParameter("@CustomerId", SqlDbType.Int));
command.Parameters["@CustomerId"].Value = 1;
C# (OleDb)
String query = "SELECT account_balance FROM user_data WHERE user_name = ?";
OleDbCommand command = new OleDbCommand(query, connection);
command.Parameters.Add(new OleDbParameter("customerName", CustomerName.Text));
OleDbDataReader reader = command.ExecuteReader();
VB .NET (Stored Procedures)
Dim command As SqlCommand = new SqlCommand("sp_getAccountBalance", connection)
command.CommandType = CommandType.StoredProcedure
command.Parameters.Add(new SqlParameter("@CustomerName", CustomerName.Text))
Dim reader As SqlDataReader = command.ExecuteReader()
Ruby (ActiveRecord)
Project.where("name = :name", name: user_input)
Ruby (Built-in)
insert_new_user = db.prepare "INSERT INTO users (name, age, gender) VALUES (?, ?, ?)"
insert_new_user.execute 'aizatto', '20', 'male'
PHP (PDO)
$stmt = $dbh->prepare("INSERT INTO REGISTRY (name, value) VALUES (:name, :value)");
$stmt->bindParam(':name', $name);
$stmt->bindParam(':value', $value);
Perl (DBI)
my $sql = "INSERT INTO foo (bar, baz) VALUES (?, ?)";
my $sth = $dbh->prepare($sql);
$sth->execute($bar, $baz);
Rust (SQLx)
// Compile-time checked query
let row = sqlx::query!("SELECT id, name FROM users WHERE name = $1", user_input)
.fetch_one(&pool)
.await?;
// Runtime binding
sqlx::query("INSERT INTO users (name) VALUES ($1)")
.bind(user_input)
.execute(&pool)
.await?;
Go (database/sql)
rows, err := db.Query("SELECT * FROM users WHERE name = $1", userInput)
// or for MySQL:
rows, err := db.Query("SELECT * FROM users WHERE name = ?", userInput)
TypeScript/JavaScript (node-postgres)
const result = await pool.query(
'SELECT * FROM users WHERE name = $1',
[userInput]
);
TypeScript (Prisma)
const user = await prisma.user.findMany({
where: { name: userInput }, // automatically parameterized
});
Safe Table/Column Name Handling
Parameterized queries cannot bind identifiers (table names, column names). Use allowlist mapping:
String tableName;
switch (PARAM) {
case "Value1": tableName = "fooTable"; break;
case "Value2": tableName = "barTable"; break;
default: throw new InputValidationException("unexpected value");
}
ALLOWED_SORT_COLUMNS = {"name", "created_at", "email"}
if sort_column not in ALLOWED_SORT_COLUMNS:
    raise ValueError(f"Invalid sort column: {sort_column}")
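The allowlist check and parameter binding are usually combined: the identifier comes from a fixed set and the value is still bound. A minimal sketch, assuming a hypothetical `users` table and a helper named `build_user_query`:

```python
ALLOWED_SORT_COLUMNS = {"name", "created_at", "email"}


def build_user_query(sort_column: str) -> str:
    """Return SQL text whose ORDER BY column is drawn from a fixed allowlist."""
    if sort_column not in ALLOWED_SORT_COLUMNS:
        raise ValueError(f"Invalid sort column: {sort_column}")
    # Interpolation is acceptable here only because the value is one of our constants.
    # The email filter stays a bound parameter ("?"), supplied at execute time.
    return f"SELECT name, email FROM users WHERE email LIKE ? ORDER BY {sort_column}"
```

The returned string would then be passed to `cursor.execute(sql, (pattern,))`, so user data still never reaches the SQL text.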
Database Hardening
- Grant only required access (read/write/execute) to specific tables
- Never assign DBA or admin access to application accounts
- Use separate database users per application
- Revoke CREATE/DROP privileges from application accounts
- Do not run the DBMS as root/SYSTEM
- Use views to limit field-level access
2. Cross-Site Scripting (XSS) Prevention
Output Encoding by Context
Every context where untrusted data renders requires a different encoding scheme. Applying the wrong encoding for a context creates a vulnerability, even if the data was encoded correctly for some other context.
| Context | Encoding | Example Transform |
|---|---|---|
| HTML Body | HTML Entity | <script> -> &lt;script&gt; |
| HTML Attribute | Hex Entity (&#xHH;) | " -> &#x22; |
| JavaScript (quoted string) | Unicode (\uXXXX) | ' -> \u0027 |
| CSS Property Value | CSS Hex (\XX or \XXXXXX) | ( -> \28 |
| URL Parameter | Percent (%HH) | (space) -> %20 |
HTML Entity Encoding Characters
| Character | Encoding |
|---|---|
| & | &amp; |
| < | &lt; |
| > | &gt; |
| " | &quot; |
| ' | &#x27; |
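As a quick check of the transforms above, this sketch applies Python's standard-library encoders to the same kind of input. It covers only the HTML and URL contexts; the sample strings are illustrative.

```python
import html
from urllib.parse import quote

untrusted = "<script>alert('x')</script>"

# HTML body / quoted attribute context: &, <, >, " and ' become entities.
html_body = html.escape(untrusted)

# URL parameter context: safe="" percent-encodes reserved characters too.
url_param = quote("a b&c=d", safe="")
```

Note that `html.escape` is only correct for HTML contexts; it does not make a value safe inside a script block, a style block, or a URL.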
Output Encoding Libraries by Language
Python
import html
safe = html.escape(user_input) # HTML context
from markupsafe import Markup, escape
safe = escape(user_input) # Jinja2/Flask auto-escapes with this
from urllib.parse import quote
safe_url = quote(user_input) # URL context
Java
// OWASP Java Encoder (recommended)
import org.owasp.encoder.Encode;
String safeHtml = Encode.forHtml(userInput);
String safeAttr = Encode.forHtmlAttribute(userInput);
String safeJs = Encode.forJavaScript(userInput);
String safeCss = Encode.forCssString(userInput);
String safeUrl = Encode.forUriComponent(userInput);
JavaScript / TypeScript
// DOM — safe sink (auto-encodes)
element.textContent = untrustedData;
// HTML sanitization when HTML is required
import DOMPurify from 'dompurify';
let clean = DOMPurify.sanitize(dirty);
// URL encoding
encodeURIComponent(userInput);
Go
import "html/template" // Auto-escapes in templates
import "html"
safe := html.EscapeString(userInput)
import "net/url"
safe := url.QueryEscape(userInput)
C# / .NET
using System.Web;
string safeHtml = HttpUtility.HtmlEncode(userInput);
string safeUrl = HttpUtility.UrlEncode(userInput);
// ASP.NET Core Razor — auto-escapes by default
// Use @Html.Raw() ONLY for pre-sanitized content
Rust
// askama template engine — auto-escapes by default
// ammonia crate for HTML sanitization
use ammonia::clean;
let safe = clean(user_input);
Framework Auto-Escaping
Most modern frameworks auto-escape in templates. Know the escape hatches that bypass this protection:
| Framework | Auto-Escapes | Bypass (DANGEROUS) |
|---|---|---|
| React/JSX | Yes (string interpolation) | dangerouslySetInnerHTML |
| Angular | Yes | bypassSecurityTrustAs* |
| Vue | Yes ({{ }}) | v-html directive |
| Django | Yes | {{ var|safe }}, {% autoescape off %} |
| Jinja2 | Yes (when enabled) | {{ var|safe }}, Markup() |
| Rails ERB | Yes (<%= %>) | raw(), html_safe |
| Go html/template | Yes | template.HTML() type cast |
| ASP.NET Razor | Yes | @Html.Raw() |
| Lit | Yes | unsafeHTML |
DOM-Based XSS Prevention
Never use with untrusted data:
- innerHTML, outerHTML
- document.write(), document.writeln()
- eval(), Function(), setTimeout(string), setInterval(string)
Safe alternatives:
// Instead of innerHTML:
element.textContent = untrustedData;
// Instead of document.write:
const el = document.createElement('div');
el.textContent = untrustedData;
document.body.appendChild(el);
// Instead of eval for JSON:
const data = JSON.parse(jsonString);
// Safe attribute setting:
element.setAttribute('data-value', untrustedData);
Content Security Policy (CSP)
CSP is defense-in-depth, not a primary XSS defense. A restrictive baseline header:
Content-Security-Policy: default-src 'self'; script-src 'self'; style-src 'self'; img-src 'self' data:; object-src 'none'; base-uri 'self'; frame-ancestors 'none';
Avoid 'unsafe-inline' and 'unsafe-eval' — they negate most CSP XSS protection.
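One way to keep the policy reviewable is to build the header from a directive map and refuse the unsafe keywords at build time. This is a sketch under that assumption; `CSP_DIRECTIVES` and `build_csp` are illustrative names, not a framework API.

```python
FORBIDDEN_SOURCES = {"'unsafe-inline'", "'unsafe-eval'"}

CSP_DIRECTIVES = {
    "default-src": ["'self'"],
    "script-src": ["'self'"],
    "style-src": ["'self'"],
    "img-src": ["'self'", "data:"],
    "object-src": ["'none'"],
    "base-uri": ["'self'"],
    "frame-ancestors": ["'none'"],
}


def build_csp(directives: dict) -> str:
    """Join directives into a header value, rejecting sources that weaken CSP."""
    for name, sources in directives.items():
        bad = FORBIDDEN_SOURCES.intersection(sources)
        if bad:
            raise ValueError(f"{name} contains forbidden source(s): {sorted(bad)}")
    return "; ".join(f"{name} {' '.join(sources)}" for name, sources in directives.items())


header_value = build_csp(CSP_DIRECTIVES)
```

The result is sent as the value of the `Content-Security-Policy` response header.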
Safe HTML Attributes (allowlist)
align, alt, bgcolor, border, cellpadding, cellspacing, class, color, cols, colspan, coords, dir, face, height, hspace, lang, multiple, nohref, noresize, noshade, nowrap, ref, rel, rev, rows, rowspan, scrolling, shape, span, summary, tabindex, title, usemap, valign, value, vlink, vspace, width.
Never safe: onclick, onerror, onmouseover, onload, onfocus — any on* event handler.
3. OS Command Injection Defense
Defense Priority
- Avoid OS commands entirely — use language APIs
- Parameterize — separate command from arguments
- Allowlist validation — restrict commands and argument values
- Escape — last resort
Language API Alternatives
| Instead of Shell Command | Use Language API |
|---|---|
| system("mkdir /dir") | os.makedirs() (Python), Files.createDirectory() (Java) |
| system("rm file") | os.remove() (Python), Files.delete() (Java) |
| system("curl url") | requests.get() (Python), HttpClient (Java) |
| system("ping host") | InetAddress.isReachable() (Java) |
| system("ls /dir") | os.listdir() (Python), Files.list() (Java) |
Safe Command Execution by Language
Java
// ProcessBuilder separates command from arguments — no shell interpretation
ProcessBuilder pb = new ProcessBuilder("TrustedCmd", "TrustedArg1", "TrustedArg2");
pb.directory(new File("TrustedDir"));
Process p = pb.start();
// DANGEROUS: Runtime.exec with string concatenation
// Runtime.exec() does NOT invoke shell, but still avoid string building
Python
import subprocess
# Safe: list form, no shell=True
result = subprocess.run(
["trusted_cmd", "arg1", "arg2"],
capture_output=True,
text=True,
check=True
)
# DANGEROUS: shell=True with user input
# subprocess.run(f"cmd {user_input}", shell=True) # NEVER DO THIS
Go
import "os/exec"
cmd := exec.Command("trusted_cmd", "arg1", "arg2")
output, err := cmd.Output()
// exec.Command does NOT invoke shell by default
Rust
use std::process::Command;
let output = Command::new("trusted_cmd")
.arg("arg1")
.arg("arg2")
.output()?;
// No shell interpretation — arguments are passed directly
PHP
// escapeshellarg wraps input in single quotes
$url = $_GET['url'];
$command = 'wget --directory-prefix=../temp ' . escapeshellarg($url);
system($command);
// Prefer escapeshellarg over escapeshellcmd
Input Validation for Command Arguments
import re
ALLOWED_PATTERN = re.compile(r'^[a-zA-Z0-9._-]{1,64}$')
def validate_argument(arg: str) -> str:
    if not ALLOWED_PATTERN.match(arg):
        raise ValueError(f"Invalid argument: {arg}")
    return arg
Dangerous metacharacters to block: & | ; $ > < ` \ ! ' " ( )
Use -- delimiter to signal end of options:
# Prevents argument injection via leading dashes
cmd -- "$user_input"
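The validation pattern above still accepts values that begin with a dash, which is why the `--` delimiter matters. A minimal sketch that combines both, assuming a hypothetical `trusted_cmd` that takes one filename operand:

```python
import re

ALLOWED_ARG = re.compile(r"[a-zA-Z0-9._-]{1,64}")


def build_argv(filename: str) -> list:
    """Return an argv list for a hypothetical `trusted_cmd <file>` invocation."""
    # fullmatch() avoids the trailing-newline leniency of "$" with match().
    if not ALLOWED_ARG.fullmatch(filename):
        raise ValueError("Invalid argument")
    # "--" ends option parsing, so a value such as "-rf" is treated as an operand.
    return ["trusted_cmd", "--", filename]
```

The list would be passed to `subprocess.run(argv, check=True)` without `shell=True`, so no shell ever parses the value.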
4. Deserialization Safeguards
Universal Rule
Never deserialize untrusted data in native/binary formats. Prefer pure data formats (JSON, Protocol Buffers, MessagePack with type restrictions).
Language-Specific Risks and Defenses
Java
Detection signatures: Hex AC ED 00 05, Base64 rO0, Content-Type application/x-java-serialized-object
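These signatures can be checked mechanically, for example in a gateway or a log scanner. The sketch below is one possible check written in Python; the function name and its parameters are assumptions for illustration.

```python
import base64

JAVA_STREAM_MAGIC = bytes.fromhex("aced0005")  # stream magic + version


def looks_like_java_serialized(body: bytes, content_type: str = "") -> bool:
    """Heuristic match on the three signatures listed above."""
    if content_type.lower().startswith("application/x-java-serialized-object"):
        return True
    if body.startswith(JAVA_STREAM_MAGIC):
        return True
    # Base64 of AC ED 00 ... always begins with "rO0".
    return body.lstrip().startswith(b"rO0")


raw_stream = JAVA_STREAM_MAGIC + b"sr\x00\x0ecom.example.X"
encoded_stream = base64.b64encode(raw_stream)
```

A match is a reason to block or investigate, not proof of an exploit; a miss does not prove the payload is safe.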
Class allowlisting via ObjectInputStream override:
public class SafeObjectInputStream extends ObjectInputStream {
private static final Set<String> ALLOWED = Set.of(
"com.myapp.dto.UserData",
"com.myapp.dto.OrderData"
);
@Override
protected Class<?> resolveClass(ObjectStreamClass desc)
throws IOException, ClassNotFoundException {
if (!ALLOWED.contains(desc.getName())) {
throw new InvalidClassException("Unauthorized class", desc.getName());
}
return super.resolveClass(desc);
}
}
Prevent deserialization in domain objects:
private final void readObject(ObjectInputStream in) throws IOException {
throw new IOException("Cannot be deserialized");
}
Library safety matrix:
| Library | Safe When | Unsafe When |
|---|---|---|
| jackson-databind | Polymorphism disabled | @JsonTypeInfo with user input |
| XStream >= 1.4.17 | Allowlist configured | Default config |
| fastjson2 | autotype disabled | autotype enabled |
| Kryo >= 5.0.0 | Class registration enabled | Registration disabled |
| fastjson < 1.2.68 | Never safe | Unrestricted deser |
| XMLDecoder | Never safe | Always |
JVM agent hardening (when source changes impossible):
-javaagent:name-of-agent.jar
Agents: rO0 (Contrast Security), NotSoSerial, SerialKiller
Python
NEVER use with untrusted data:
# ALL OF THESE ARE DANGEROUS WITH UNTRUSTED INPUT:
import pickle
pickle.loads(untrusted_data) # Arbitrary code execution
import yaml
yaml.load(untrusted_data, Loader=yaml.Loader) # Arbitrary code execution (same for yaml.unsafe_load)
import jsonpickle
jsonpickle.decode(untrusted_data) # Arbitrary code execution
Safe alternatives:
import json
data = json.loads(untrusted_data) # Only produces dicts, lists, strings, numbers
import yaml
data = yaml.safe_load(untrusted_data) # Only basic YAML types
# For structured data with validation:
from pydantic import BaseModel
class UserData(BaseModel):
    name: str
    age: int
validated = UserData.model_validate_json(untrusted_data)
Detection: Pickled data ends with . (unencoded), Base64 starts with gASV
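The pickle signatures can be turned into a small heuristic. This sketch assumes binary pickles (protocol 2 or later), which start with the PROTO opcode 0x80; the `gASV` prefix corresponds to a protocol-4 pickle that begins with a frame header. `looks_like_pickle` is an illustrative helper name.

```python
import base64
import pickle


def looks_like_pickle(data: bytes) -> bool:
    """Heuristic: raw binary pickle, or Base64 text with the protocol-4 prefix."""
    # Binary pickles start with PROTO (0x80) and end with the STOP opcode ".".
    if data[:1] == b"\x80" and data.endswith(b"."):
        return True
    return data.lstrip().startswith(b"gASV")


# pickle.dumps() is used only to produce a sample; nothing is ever unpickled here.
sample = pickle.dumps({"user": "alice", "roles": ["admin"]}, protocol=4)
sample_b64 = base64.b64encode(sample)
```

Text-based protocol 0 pickles and other protocol versions encode differently, so treat this as a tripwire rather than a complete detector.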
PHP
// DANGEROUS
$data = unserialize($_GET['data']);
// SAFE
$data = json_decode($_GET['data'], true);
C# / .NET
Detection: Base64 starting with AAEAAAD/////, keywords TypeObject, $type:
// NEVER USE:
// BinaryFormatter — fundamentally unsafe per Microsoft guidance
// SAFE: Disable TypeNameHandling in JSON.NET
var settings = new JsonSerializerSettings {
TypeNameHandling = TypeNameHandling.None // DEFAULT — keep it this way
};
var obj = JsonConvert.DeserializeObject<SafeDTO>(json, settings);
// SAFE: DataContractSerializer with known types
var serializer = new DataContractSerializer(typeof(MyDto));
RCE gadget classes to block:
- System.Windows.Data.ObjectDataProvider
- System.Configuration.Install.AssemblyInstaller
- System.Management.Automation.PSObject
- System.Windows.Forms.BindingSource
JavaScript / TypeScript
// SAFE: JSON.parse only produces plain objects
const data = JSON.parse(untrustedString);
// DANGEROUS: eval, Function constructor
// eval(untrustedString); // NEVER
// new Function(untrustedString); // NEVER
// Validate structure after parsing:
import { z } from 'zod';
const UserSchema = z.object({
name: z.string(),
age: z.number().int().positive(),
});
const validated = UserSchema.parse(JSON.parse(untrustedString));
Go
// encoding/json is safe — only produces map[string]interface{}, slices, primitives
var data map[string]interface{}
err := json.Unmarshal(untrustedBytes, &data)
// encoding/gob — use with caution, register types explicitly
gob.Register(SafeType{})
Rust
// serde_json is safe — produces typed structs or serde_json::Value
use serde::Deserialize;
#[derive(Deserialize)]
struct UserData {
name: String,
age: u32,
}
let user: UserData = serde_json::from_str(untrusted)?;
// Type system prevents arbitrary instantiation
Cryptographic Integrity Checks
When native serialization cannot be avoided, sign data at creation and verify before deserialization:
import hmac
import hashlib
import json
SECRET_KEY = b'server-side-secret'
def sign(data: dict) -> str:
    payload = json.dumps(data, sort_keys=True)
    sig = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    return f"{payload}|{sig}"
def verify_and_load(signed: str) -> dict:
    payload, sig = signed.rsplit('|', 1)
    expected = hmac.new(SECRET_KEY, payload.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        raise ValueError("Signature verification failed")
    return json.loads(payload)
5. Server-Side Request Forgery (SSRF) Prevention
Defense Layers
Application Layer — URL/IP Validation
Allowlist approach (preferred):
from urllib.parse import urlparse
import ipaddress
import socket
ALLOWED_HOSTS = {"api.trusted-partner.com", "cdn.example.com"}
ALLOWED_SCHEMES = {"http", "https"}
def validate_url(url: str) -> str:
    parsed = urlparse(url)
    # 1. Scheme restriction
    if parsed.scheme not in ALLOWED_SCHEMES:
        raise ValueError(f"Disallowed scheme: {parsed.scheme}")
    # 2. Host allowlist
    if parsed.hostname not in ALLOWED_HOSTS:
        raise ValueError(f"Host not in allowlist: {parsed.hostname}")
    # 3. Resolve and verify that every returned IP is public
    #    (checking only the first record lets a multi-record answer slip through)
    for info in socket.getaddrinfo(parsed.hostname, parsed.port or 443):
        ip = info[4][0]
        addr = ipaddress.ip_address(ip)
        if addr.is_private or addr.is_loopback or addr.is_link_local:
            raise ValueError(f"Resolved to non-public IP: {ip}")
    return url
Denylist (last resort — minimum blocked ranges):
BLOCKED_RANGES = [
ipaddress.ip_network("127.0.0.0/8"), # Localhost
ipaddress.ip_network("10.0.0.0/8"), # RFC1918
ipaddress.ip_network("172.16.0.0/12"), # RFC1918
ipaddress.ip_network("192.168.0.0/16"), # RFC1918
ipaddress.ip_network("169.254.0.0/16"), # Link-local / cloud metadata
ipaddress.ip_network("224.0.0.0/4"), # Multicast
ipaddress.ip_network("::1/128"), # IPv6 localhost
ipaddress.ip_network("fc00::/7"), # IPv6 ULA
ipaddress.ip_network("fe80::/10"), # IPv6 link-local
ipaddress.ip_network("ff00::/8"), # IPv6 multicast
]
def is_blocked(ip_str: str) -> bool:
    addr = ipaddress.ip_address(ip_str)
    return any(addr in network for network in BLOCKED_RANGES)
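A denylist of this shape can be sidestepped with IPv4-mapped IPv6 literals such as `::ffff:127.0.0.1`, because an IPv6 address object never compares as a member of an IPv4 network. The sketch below unwraps such addresses first; the shortened `BLOCKED` list and the helper name are assumptions for illustration.

```python
import ipaddress

# Shortened list for the example; a real denylist would carry every range above.
BLOCKED = [
    ipaddress.ip_network(n)
    for n in ("127.0.0.0/8", "10.0.0.0/8", "169.254.0.0/16", "::1/128")
]


def is_blocked_normalized(ip_str: str) -> bool:
    """Denylist check that unwraps IPv4-mapped IPv6 addresses before matching."""
    addr = ipaddress.ip_address(ip_str)
    # ::ffff:a.b.c.d carries an IPv4 address; compare that against the IPv4 ranges.
    if addr.version == 6 and addr.ipv4_mapped is not None:
        addr = addr.ipv4_mapped
    return any(addr in net for net in BLOCKED)
```

This closes one encoding gap only; it is another reason the allowlist approach remains preferable.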
Cloud Metadata Protection
| Cloud | Metadata Endpoint | Mitigation |
|---|---|---|
| AWS | 169.254.169.254, metadata.amazonaws.com | Migrate to IMDSv2 (requires token), disable IMDSv1 |
| GCP | metadata.google.internal, 169.254.169.254 | Block in firewall + application |
| Azure | 169.254.169.254 | Block in firewall + application |
AWS IMDSv2 enforcement (Terraform):
resource "aws_instance" "example" {
metadata_options {
http_endpoint = "enabled"
http_tokens = "required" # Enforces IMDSv2
http_put_response_hop_limit = 1
}
}
Network Layer
- Firewall rules: limit vulnerable applications to communicate only with explicitly approved internal services
- Network segregation: isolate applications to prevent lateral movement
Additional Controls
- Disable HTTP redirects in the HTTP client — redirects can bypass validation
- Restrict protocols to HTTP/HTTPS only: block file://, gopher://, dict://, ftp://
- DNS rebinding protection: resolve hostname, validate IP, then connect using the resolved IP (not hostname)
- Response validation — verify response content-type matches expected format
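The DNS rebinding control can be sketched as a "resolve once, validate, pin" step. In this sketch the `resolve` callable is a hypothetical injection point (in production it might wrap `socket.getaddrinfo`), and `pin_public_ip` is an illustrative name.

```python
import ipaddress
from typing import Callable, List
from urllib.parse import urlparse


def pin_public_ip(url: str, resolve: Callable[[str], List[str]]) -> str:
    """Resolve the URL's host once and return an IP that is safe to connect to."""
    host = urlparse(url).hostname
    if host is None:
        raise ValueError("URL has no host")
    ips = resolve(host)
    if not ips:
        raise ValueError("Host did not resolve")
    for ip in ips:
        # Reject the whole answer if any record is private, loopback, link-local, etc.
        if not ipaddress.ip_address(ip).is_global:
            raise ValueError(f"Resolved to non-public IP: {ip}")
    # The caller connects to this IP directly and does not resolve the name again.
    return ips[0]
```

The caller then opens the connection to the returned IP, sends the original hostname in the Host header (and TLS SNI), and keeps redirects disabled so the check cannot be bypassed by a second hop.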
6. File Upload Security
Validation Checklist
import os
import uuid
import magic # python-magic
ALLOWED_EXTENSIONS = {'.pdf', '.docx', '.png', '.jpg', '.jpeg'}
ALLOWED_MIME_TYPES = {'application/pdf', 'application/vnd.openxmlformats-officedocument.wordprocessingml.document',
'image/png', 'image/jpeg'}
MAX_FILE_SIZE = 10 * 1024 * 1024 # 10 MB
UPLOAD_DIR = '/var/uploads' # Outside webroot
def validate_upload(filename: str, content: bytes) -> str:
    # 1. File size check
    if len(content) > MAX_FILE_SIZE:
        raise ValueError("File too large")
    # 2. Extension allowlist (after decoding)
    ext = os.path.splitext(filename)[1].lower()
    if ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Disallowed extension: {ext}")
    # 3. MIME type via magic bytes (not Content-Type header)
    detected_mime = magic.from_buffer(content, mime=True)
    if detected_mime not in ALLOWED_MIME_TYPES:
        raise ValueError(f"Disallowed MIME type: {detected_mime}")
    # 4. Generate safe filename (never use user-provided name)
    safe_name = f"{uuid.uuid4().hex}{ext}"
    # 5. Store outside webroot
    dest = os.path.join(UPLOAD_DIR, safe_name)
    # 6. Prevent path traversal: compare whole path components, not a string
    #    prefix (a startswith() check would accept /var/uploads_evil/...)
    upload_root = os.path.realpath(UPLOAD_DIR)
    real_dest = os.path.realpath(dest)
    if os.path.commonpath([upload_root, real_dest]) != upload_root:
        raise ValueError("Path traversal detected")
    return dest
Bypass Prevention
Guard against:
- Double extensions: malware.php.jpg (validate the final extension AND magic bytes)
- Null bytes: malware.php%00.jpg (decode before validation)
- Case manipulation: .PhP (normalize to lowercase)
- Content-Type spoofing: Always verify via magic bytes, never trust the header
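The filename-related bypasses above can be covered by one normalization step that runs before the extension allowlist. This is a sketch: `EXECUTABLE_EXTENSIONS` is an illustrative list, and rejecting any executable-looking inner extension is an added assumption beyond the checklist.

```python
import os
from urllib.parse import unquote

ALLOWED_EXTENSIONS = {".pdf", ".docx", ".png", ".jpg", ".jpeg"}
EXECUTABLE_EXTENSIONS = {".php", ".phtml", ".jsp", ".asp", ".aspx", ".exe", ".sh"}


def check_filename(raw_name: str) -> str:
    """Return the normalized final extension, or raise on a bypass attempt."""
    name = unquote(raw_name)  # decode %00 and similar before any validation
    if "\x00" in name:
        raise ValueError("Null byte in filename")
    parts = os.path.basename(name).lower().split(".")  # case-normalize
    final_ext = "." + parts[-1] if len(parts) > 1 else ""
    if final_ext not in ALLOWED_EXTENSIONS:
        raise ValueError(f"Disallowed extension: {final_ext}")
    # Double extension: an executable type hidden before the final extension.
    if any("." + p in EXECUTABLE_EXTENSIONS for p in parts[1:-1]):
        raise ValueError("Executable extension hidden in filename")
    return final_ext
```

Magic-byte verification of the content is still required; this step only handles what the name itself can hide.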
Image-Specific Hardening
from PIL import Image
import io
def sanitize_image(content: bytes) -> bytes:
    """Re-encode image to strip embedded payloads (EXIF, steganography, polyglots)."""
    img = Image.open(io.BytesIO(content))
    output = io.BytesIO()
    img.save(output, format=img.format)
    return output.getvalue()
Storage Security
| Priority | Strategy |
|---|---|
| 1 | Store on a separate host (dedicated file server) |
| 2 | Store outside the webroot, admin access only |
| 3 | Inside webroot with write-only permissions + IP restrictions |
- Set minimal filesystem permissions (no execute)
- Serve uploaded files through a controller that sets Content-Disposition: attachment
- Never execute uploaded files
- Scan with antivirus before accepting
- Apply Content Disarm & Reconstruct (CDR) for PDF/DOCX
Additional Controls
- Require authentication for upload endpoints
- Implement CSRF tokens on upload forms
- Rate-limit uploads per user/IP
- Validate ZIP contents: check decompressed size before extraction to prevent zip bombs
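The ZIP check in the last item can be sketched with the standard `zipfile` module by summing the declared sizes before extracting anything. The limits and the `check_zip` name are assumptions; the member-name check is an extra precaution against traversal entries.

```python
import io
import zipfile

MAX_TOTAL_UNCOMPRESSED = 50 * 1024 * 1024  # assumed limit: 50 MB
MAX_MEMBERS = 1000                         # assumed limit


def check_zip(data: bytes, max_total: int = MAX_TOTAL_UNCOMPRESSED) -> int:
    """Return the declared uncompressed size, or raise if the archive looks hostile."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        infos = zf.infolist()
        if len(infos) > MAX_MEMBERS:
            raise ValueError("Too many archive members")
        total = 0
        for info in infos:
            name = info.filename
            if name.startswith("/") or ".." in name.split("/"):
                raise ValueError(f"Unsafe member path: {name}")
            total += info.file_size  # declared size from the archive headers
            if total > max_total:
                raise ValueError("Archive expands beyond the allowed size")
    return total
```

Declared sizes come from the archive itself and can be forged, so it is prudent to also cap the bytes actually written while extracting.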
7. XML External Entity (XXE) Prevention
Universal Rule
Disable DTDs entirely. If DTDs cannot be disabled, disable external entities AND external DTD loading.
Parser Configuration by Language
Java — DocumentBuilderFactory
DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
// Best: disable DTDs entirely
dbf.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
dbf.setXIncludeAware(false);
dbf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
// If DTDs required (rare): disable external entities
dbf.setFeature("http://xml.org/sax/features/external-general-entities", false);
dbf.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
dbf.setFeature("http://apache.org/xml/features/nonvalidating/load-external-dtd", false);
Java — XMLInputFactory (StAX)
XMLInputFactory xif = XMLInputFactory.newInstance();
xif.setProperty(XMLInputFactory.SUPPORT_DTD, false);
xif.setProperty(XMLConstants.ACCESS_EXTERNAL_DTD, "");
xif.setProperty("javax.xml.stream.isSupportingExternalEntities", false);
Java — SAXReader (DOM4J)
SAXReader saxReader = new SAXReader();
saxReader.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
saxReader.setFeature("http://xml.org/sax/features/external-general-entities", false);
saxReader.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
Java — TransformerFactory
TransformerFactory tf = TransformerFactory.newInstance();
tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
tf.setAttribute(XMLConstants.ACCESS_EXTERNAL_STYLESHEET, "");
Java — JAXB Unmarshaller
XMLInputFactory xif = XMLInputFactory.newFactory();
xif.setProperty(XMLInputFactory.SUPPORT_DTD, false);
xif.setProperty(XMLInputFactory.IS_SUPPORTING_EXTERNAL_ENTITIES, false);
XMLStreamReader xsr = xif.createXMLStreamReader(new StreamSource(file));
Unmarshaller um = jc.createUnmarshaller();
um.unmarshal(xsr);
Python
# RECOMMENDED: use defusedxml (safe by default)
from defusedxml import ElementTree as ET
tree = ET.parse('file.xml')
# defusedxml protects against:
# - External entities
# - DTD retrieval
# - Billion Laughs (entity expansion bomb)
# - Quadratic Blowup
# NEVER use xml.etree.ElementTree with untrusted input without defusedxml
# Standard library is vulnerable to Billion Laughs
C# / .NET
// .NET >= 4.5.2: safe by default
XmlDocument xmlDoc = new XmlDocument();
xmlDoc.XmlResolver = null; // Explicit is better
xmlDoc.LoadXml(xml);
// .NET 4.0-4.5.2:
XmlTextReader reader = new XmlTextReader(stream);
reader.DtdProcessing = DtdProcessing.Prohibit;
// Pre-.NET 4.0:
XmlTextReader reader = new XmlTextReader(stream);
reader.ProhibitDtd = true;
PHP
// PHP >= 8.0: safe by default
// PHP < 8.0:
libxml_set_external_entity_loader(null);
Go
// encoding/xml is safe by default — does not process external entities
// No special configuration needed
import "encoding/xml"
err := xml.Unmarshal(data, &result)
C/C++ — libxml2
// libxml2 >= 2.9: XXE disabled by default
// libxml2 < 2.9: avoid these flags:
// XML_PARSE_NOENT (expands entities)
// XML_PARSE_DTDLOAD (loads external DTD)
xmlDocPtr doc = xmlReadMemory(buffer, size, "noname.xml", NULL, 0);
// Do NOT pass XML_PARSE_NOENT
C/C++ — Xerces-C
XercesDOMParser *parser = new XercesDOMParser();
parser->setCreateEntityReferenceNodes(true);
parser->setDisableDefaultEntityResolution(true);
XXE Safety Matrix
| Language/Parser | Default Safe | Action Required |
|---|---|---|
| Java (all parsers) | No | Explicit feature disabling |
| Python xml.etree | Partially (no ext entities, but Billion Laughs) | Use defusedxml |
| .NET >= 4.5.2 | Yes | Set XmlResolver = null explicitly |
| PHP >= 8.0 | Yes | None |
| Go encoding/xml | Yes | None |
| libxml2 >= 2.9 | Yes | Don't pass NOENT flag |
| Rust (quick-xml, xml-rs) | Yes | None |
8. Server-Side Template Injection (SSTI)
Affected Template Engines
| Engine | Language | Vulnerable When |
|---|---|---|
| Jinja2 | Python | User input in template string (not data) |
| Mako | Python | User input in template string |
| Twig | PHP | User input in template string |
| Freemarker | Java | User input in template string |
| Velocity | Java | User input in template string |
| Thymeleaf | Java | User input in template expression |
| Pug/Jade | Node.js | User input in template string |
| ERB | Ruby | User input in template string |
| Handlebars | JavaScript | Custom helpers with user input |
Vulnerable Pattern
# DANGEROUS: user input becomes part of template
from jinja2 import Template
template = Template(f"Hello {user_input}") # SSTI if user_input contains {{ }}
output = template.render()
# SAFE: user input is data, not template structure
from jinja2 import Template
template = Template("Hello {{ name }}")
output = template.render(name=user_input)
Prevention
- Never construct templates from user input — user data goes in template context, not template source
- Use sandboxed template environments:
from jinja2.sandbox import SandboxedEnvironment
env = SandboxedEnvironment()
template = env.from_string("Hello {{ name }}")
- Use logic-less templates (Mustache, Handlebars without custom helpers) — no code execution capability
- Restrict template functionality — disable dangerous built-ins, filters, and global functions
Detection
Test inputs: {{7*7}}, ${7*7}, <%= 7*7 %>, #{7*7}, *{7*7}
If the response contains 49 where the payload was submitted, the engine evaluated the expression and the application is very likely vulnerable to SSTI.
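The probe-and-check step can be automated against any rendering entry point. In this sketch `render` stands for whatever function or HTTP call returns the rendered output; the two stand-in renderers exist only to exercise the check and do not use a real template engine.

```python
import re
from typing import Callable, List

PROBES = ["{{7*7}}", "${7*7}", "<%= 7*7 %>", "#{7*7}", "*{7*7}"]


def evaluated_probes(render: Callable[[str], str]) -> List[str]:
    """Return the probes whose expression was evaluated rather than echoed."""
    hits = []
    for probe in PROBES:
        out = render(probe)
        # Evaluated: the product appears and the raw probe text is gone.
        if "49" in out and probe not in out:
            hits.append(probe)
    return hits


def safe_render(user_input: str) -> str:
    """Stand-in for a renderer that treats input purely as data."""
    return f"Hello {user_input}"


def vulnerable_render(user_input: str) -> str:
    """Stand-in that mimics an engine evaluating {{a*b}} found in user input."""
    return "Hello " + re.sub(
        r"\{\{(\d+)\*(\d+)\}\}",
        lambda m: str(int(m.group(1)) * int(m.group(2))),
        user_input,
    )
```

Pages that already contain the digits 49 for unrelated reasons can still produce false positives, so a hit should be confirmed manually.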
9. Mass Assignment Prevention
The Vulnerability
Frameworks that auto-bind request parameters to model properties allow attackers to set fields they shouldn't (e.g., isAdmin=true, price=0).
Framework-Specific Defenses
Python — Django
# Use explicit field lists in ModelForm
class UserForm(forms.ModelForm):
    class Meta:
        model = User
        fields = ['name', 'email']  # Allowlist: only these can be set
        # exclude = ['is_admin']  # Denylist: weaker approach
Python — Pydantic (FastAPI)
from pydantic import BaseModel
class UserCreate(BaseModel):
    name: str
    email: str
    # is_admin is intentionally absent, so it cannot be set via the API
class UserInDB(BaseModel):
    name: str
    email: str
    is_admin: bool = False
Ruby on Rails (Strong Parameters)
def user_params
params.require(:user).permit(:name, :email) # Only these fields allowed
end
Java — Spring MVC
@InitBinder
public void initBinder(WebDataBinder binder) {
binder.setAllowedFields("name", "email"); // Allowlist
// OR
binder.setDisallowedFields("isAdmin", "role"); // Denylist (weaker)
}
Node.js / Express + Mongoose
// Allowlist with lodash/underscore
const _ = require('lodash');
const safeBody = _.pick(req.body, ['name', 'email']);
const user = new User(safeBody);
// Mongoose schema-level protection
const userSchema = new mongoose.Schema({
name: String,
email: String,
isAdmin: { type: Boolean, default: false } // Still settable via new User(req.body); always filter the body first
});
PHP — Laravel / Eloquent
class User extends Model {
protected $fillable = ['name', 'email']; // Allowlist
// OR
protected $guarded = ['is_admin', 'role']; // Denylist
}
Go — Use DTOs
// Request DTO — only bindable fields
type CreateUserRequest struct {
Name string `json:"name" validate:"required"`
Email string `json:"email" validate:"required,email"`
}
// Internal model — has privileged fields
type User struct {
ID int
Name string
Email string
IsAdmin bool
}
Architectural Pattern: DTOs
The strongest defense is using separate Data Transfer Objects that only expose editable fields. Map from DTO to domain model explicitly in application code.
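A minimal sketch of the DTO pattern using only standard-library dataclasses. The class and function names are illustrative, and rejecting unexpected fields (rather than silently dropping them) is one possible design choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CreateUserRequest:
    """Request DTO: only the fields a client may set."""
    name: str
    email: str


@dataclass
class User:
    """Domain model: includes a privileged field the client never controls."""
    name: str
    email: str
    is_admin: bool = False


def parse_create_user(payload: dict) -> CreateUserRequest:
    """Build the DTO from a request body, rejecting fields outside the allowlist."""
    unexpected = set(payload) - {"name", "email"}
    if unexpected:
        raise ValueError(f"Unexpected fields: {sorted(unexpected)}")
    return CreateUserRequest(name=str(payload["name"]), email=str(payload["email"]))


def to_domain(req: CreateUserRequest) -> User:
    """Explicit mapping: privileged fields are set by server code only."""
    return User(name=req.name, email=req.email)
```

Because the mapping is written out field by field, adding a new privileged column to the model never exposes it to clients by accident.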
10. Unvalidated Redirects
Prevention Strategies (Priority Order)
- Eliminate redirects — remove the functionality if possible
- Never accept user-supplied destination URLs
- Server-side mapping — user provides an ID/token, server maps to URL:
REDIRECT_MAP = {
"dashboard": "/app/dashboard",
"profile": "/app/profile",
"settings": "/app/settings",
}
def safe_redirect(key: str) -> str:
    url = REDIRECT_MAP.get(key)
    if url is None:
        raise ValueError("Invalid redirect target")
    return url
- Allowlist validation — if user URLs are required:
from urllib.parse import urlparse
ALLOWED_HOSTS = {"www.mysite.com", "app.mysite.com"}
def validate_redirect(url: str) -> str:
    parsed = urlparse(url)
    if parsed.hostname not in ALLOWED_HOSTS:
        raise ValueError("Redirect to untrusted host")
    if parsed.scheme not in ("https",):
        raise ValueError("Insecure redirect scheme")
    return url
- Interstitial page — warn users they are leaving the site, display destination clearly
11. Error Handling & Secure Logging
Error Handling Rules
Display to users:
- Generic message: "An error occurred, please retry"
- Appropriate HTTP status code (4xx for client errors, 5xx for server errors)
- No implementation details
Log server-side:
- Full exception with stack trace
- Request context, parameters, timestamps
- User identity (without credentials)
- Diagnostic details for investigation
Java (Spring)
@RestControllerAdvice
public class GlobalExceptionHandler {
private static final Logger log = LoggerFactory.getLogger(GlobalExceptionHandler.class);
@ExceptionHandler(Exception.class)
public ResponseEntity<ProblemDetail> handleAll(Exception ex, HttpServletRequest req) {
log.error("Unhandled exception on {} {}", req.getMethod(), req.getRequestURI(), ex);
ProblemDetail problem = ProblemDetail.forStatus(HttpStatus.INTERNAL_SERVER_ERROR);
problem.setTitle("Internal Server Error");
problem.setDetail("An unexpected error occurred");
// DO NOT: problem.setDetail(ex.getMessage());
return ResponseEntity.status(500).body(problem);
}
}
Python (FastAPI)
import logging
from fastapi import FastAPI, Request
from fastapi.responses import JSONResponse
logger = logging.getLogger(__name__)
app = FastAPI()
@app.exception_handler(Exception)
async def global_exception_handler(request: Request, exc: Exception):
    logger.error(
        "Unhandled exception on %s %s",
        request.method, request.url.path,
        exc_info=exc
    )
    return JSONResponse(
        status_code=500,
        content={"detail": "An unexpected error occurred"}
    )
Secure Logging Practices
What to Log
- Authentication successes and failures
- Authorization failures
- Input validation failures
- Session management failures
- Application errors and system events
- Encryption key usage/rotation
- Data import/export operations
- File uploads
- Suspicious business logic (out-of-order actions)
What NOT to Log
- Passwords or credentials
- Session tokens or API keys
- Database connection strings
- Encryption keys or secrets
- Payment card data (PCI DSS)
- Health data (HIPAA)
- Government IDs (SSN, passport numbers)
- Source code
- Data classified higher than the logging system's security level
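One way to enforce this list is to redact known-sensitive keys before an event reaches the logger. This is a sketch: `SENSITIVE_KEYS` is an illustrative, incomplete set, and key-name matching will not catch secrets embedded in free-text values.

```python
SENSITIVE_KEYS = {"password", "token", "api_key", "authorization", "ssn", "card_number"}


def redact(event: dict) -> dict:
    """Return a copy of the event with sensitive values masked, recursing into dicts."""
    clean = {}
    for key, value in event.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, dict):
            clean[key] = redact(value)
        else:
            clean[key] = value
    return clean


event = {
    "user": "alice",
    "password": "hunter2",
    "headers": {"Authorization": "Bearer abc", "Accept": "text/html"},
}
safe_event = redact(event)
```

The original event is left unmodified, so the application can keep using it while only the redacted copy is logged.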
Log Injection Prevention
import re
def sanitize_log_input(value: str) -> str:
    """Replace ASCII control characters (including CR, LF, tab) that enable log injection."""
    return re.sub(r'[\x00-\x1f\x7f]', ' ', value)
# Usage:
logger.info("User login attempt: user=%s", sanitize_log_input(username))
For structured logging (preferred — eliminates injection by design):
import structlog
log = structlog.get_logger()
log.info("login_attempt", username=username, ip=request.client.host)
# Output: {"event": "login_attempt", "username": "...", "ip": "..."}
Log Format — Required Fields
| Category | Fields |
|---|---|
| When | ISO 8601 timestamp, event time, correlation ID |
| Where | Application name+version, hostname, service, URL, method |
| Who | User identity, source IP, user type |
| What | Event type, severity, security flag, status (success/fail), description |
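The table above can be mapped onto a standard-library formatter. This is a sketch under assumptions: APP_NAME, APP_VERSION, and the extra attribute names (`correlation_id`, `url`, `method`, `user`, `source_ip`, `event_type`, `security`, `status`) are hypothetical and would be supplied through the `extra=` argument of the logging call.

```python
import json
import logging
import socket
from datetime import datetime, timezone

APP_NAME = "example-service"  # hypothetical
APP_VERSION = "1.0.0"         # hypothetical


class SecurityJsonFormatter(logging.Formatter):
    """Emit one JSON object per record with When/Where/Who/What fields."""

    def format(self, record: logging.LogRecord) -> str:
        created = datetime.fromtimestamp(record.created, tz=timezone.utc)
        entry = {
            # When
            "timestamp": created.isoformat(),
            "correlation_id": getattr(record, "correlation_id", None),
            # Where
            "app": f"{APP_NAME}/{APP_VERSION}",
            "hostname": socket.gethostname(),
            "url": getattr(record, "url", None),
            "method": getattr(record, "method", None),
            # Who
            "user": getattr(record, "user", None),
            "source_ip": getattr(record, "source_ip", None),
            # What
            "event_type": getattr(record, "event_type", None),
            "severity": record.levelname,
            "security": getattr(record, "security", False),
            "status": getattr(record, "status", None),
            "description": record.getMessage(),
        }
        # json.dumps escapes newlines, so each record stays on one line.
        return json.dumps(entry)
```

A call such as `logger.warning("login failed", extra={"user": name, "status": "fail", "security": True})` would populate the optional fields; anything not supplied is emitted as null.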
Log Protection
- Integrity: Tamper detection, write to append-only/WORM storage
- Access control: Separate DB account, restricted file permissions, audit log access
- Transport: TLS for log shipping, verify origin
- Storage: Separate partition from OS/application, never web-accessible
- Time sync: NTP across all servers for correlation
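Tamper detection from the Integrity item can be approximated with a keyed hash chain, where each entry's tag depends on every entry before it. The sketch below is one possible scheme rather than a prescribed one; the function names and the all-zero starting value are assumptions, and the key would have to live outside the system that writes the logs.

```python
import hashlib
import hmac

_GENESIS = b"\x00" * 32  # assumed fixed starting value for the chain


def _next_tag(prev_tag: bytes, line: str, key: bytes) -> bytes:
    """HMAC over the previous tag plus the current line."""
    return hmac.new(key, prev_tag + line.encode("utf-8"), hashlib.sha256).digest()


def seal(lines: list[str], key: bytes) -> list[tuple[str, str]]:
    """Return (line, hex_tag) pairs forming a hash chain."""
    tag = _GENESIS
    sealed = []
    for line in lines:
        tag = _next_tag(tag, line, key)
        sealed.append((line, tag.hex()))
    return sealed


def verify(sealed: list[tuple[str, str]], key: bytes) -> int:
    """Return the index of the first entry that fails, or -1 if intact."""
    tag = _GENESIS
    for index, (line, stored) in enumerate(sealed):
        tag = _next_tag(tag, line, key)
        if not hmac.compare_digest(tag.hex(), stored):
            return index
    return -1
```

Editing, reordering, or removing an entry breaks every tag from that point on. Truncating the tail is not detected by the chain alone, which is one reason append-only storage is still listed alongside tamper detection.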
12. Input Validation Fundamentals
Principles
- Server-side validation is mandatory — client-side is UX, not security
- Allowlist over denylist — define what IS valid, reject everything else
- Syntactic + Semantic — correct format AND correct business meaning
- Input validation is a secondary defense — never a substitute for parameterized queries, output encoding, etc.
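The allowlist and syntactic-plus-semantic principles can be combined in a single validator. The example below is illustrative only: the booking scenario, the ALLOWED_COUNTRIES set, and the function name `booking_errors` are assumptions made for the sketch.

```python
import re
from datetime import date

ALLOWED_COUNTRIES = frozenset({"US", "CA", "GB"})  # hypothetical allowlist
# [0-9] rather than \d, which also matches non-ASCII digits in Python.
ISO_DATE = re.compile(r"[0-9]{4}-[0-9]{2}-[0-9]{2}")


def booking_errors(country: str, start: str, end: str, today: date) -> list[str]:
    """Return a list of validation errors; an empty list means valid."""
    errors = []

    # Allowlist: anything not explicitly permitted is rejected.
    if country not in ALLOWED_COUNTRIES:
        errors.append("country not allowed")

    # Syntactic: exact format first, then a real calendar date.
    parsed = []
    for label, raw in (("start", start), ("end", end)):
        if not ISO_DATE.fullmatch(raw):
            errors.append(f"{label} must be YYYY-MM-DD")
            continue
        try:
            parsed.append(date.fromisoformat(raw))
        except ValueError:
            errors.append(f"{label} is not a real calendar date")

    # Semantic: the dates must make sense for the business rule.
    if len(parsed) == 2:
        start_d, end_d = parsed
        if start_d < today:
            errors.append("start is in the past")
        if end_d <= start_d:
            errors.append("end must be after start")
    return errors
```

Passing `today` as a parameter keeps the function deterministic and easy to test; a caller would normally supply `date.today()`.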
Techniques
Type Coercion
# Force type conversion — rejects non-numeric input
try:
user_id = int(request.args['id'])
except (ValueError, TypeError):
raise ValidationError("ID must be an integer")
int userId = Integer.parseInt(request.getParameter("id"));
// Throws NumberFormatException for non-integer input
Regex Validation
import re
# Anchor both ends, avoid .* wildcards
ZIP_PATTERN = re.compile(r'^\d{5}(-\d{4})?$')
EMAIL_LOCAL = re.compile(r'^[a-zA-Z0-9._%+-]{1,64}$')
def validate_zip(value: str) -> str:
if not ZIP_PATTERN.match(value):
raise ValueError("Invalid zip code format")
return value
ReDoS prevention: Avoid nested quantifiers like (a+)+, (a|a)*, (.*a){10}. Test regex with worst-case input.
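A nested quantifier can often be rewritten into a flat pattern that accepts the same strings. The sketch below uses a word-list pattern as an assumed example; MAX_INPUT_LEN and `is_word_list` are hypothetical names, and the nested pattern is kept only for comparison.

```python
import re

MAX_INPUT_LEN = 256  # hypothetical cap, applied before any regex runs

# Nested quantifier: "abcd" can be split into words in many ways, so a
# failing match on long input backtracks exponentially. Do not use.
NESTED = re.compile(r"(\w+\s?)+")

# Flat rewrite accepting the same strings: words separated by exactly one
# whitespace character, with optional trailing whitespace.
FLAT = re.compile(r"\w+(?:\s\w+)*\s?")


def is_word_list(value: str) -> bool:
    """Length-check first, then match with the flat pattern."""
    if len(value) > MAX_INPUT_LEN:
        return False
    return FLAT.fullmatch(value) is not None
```

A worst-case probe for the nested form is a long run of word characters followed by one invalid character, such as `"a" * 30 + "!"`. The flat pattern rejects that input after a single linear pass, and the length cap bounds the cost of any pattern that slips through review.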
Canonicalization
import unicodedata
def canonicalize(value: str) -> str:
"""Normalize Unicode to NFC form before validation."""
return unicodedata.normalize('NFC', value)
Always canonicalize BEFORE validation — different Unicode representations of the same character can bypass filters.
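The effect is easiest to see with two spellings of the same character. This short sketch extends the `canonicalize` helper above with an optional `form` parameter (an addition for illustration) and also shows that NFC does not fold compatibility characters such as fullwidth angle brackets, whereas NFKC does.

```python
import unicodedata


def canonicalize(value: str, form: str = "NFC") -> str:
    """Normalize Unicode to the given form before validation."""
    return unicodedata.normalize(form, value)


composed = "\u00e9"      # "é" as a single code point
decomposed = "e\u0301"   # "e" followed by a combining acute accent

# The raw strings differ, so a denylist or uniqueness check on raw input
# would treat them as different values.
raw_equal = composed == decomposed
normalized_equal = canonicalize(composed) == canonicalize(decomposed)

# Fullwidth "<" and ">" survive NFC but fold to ASCII under NFKC.
fullwidth = "\uff1cscript\uff1e"
nfc_result = canonicalize(fullwidth, "NFC")
nfkc_result = canonicalize(fullwidth, "NFKC")
```

Whether to apply NFKC is an application decision, since it also changes legitimate text (ligatures, superscripts). Whichever form is chosen, validation and storage should both see the normalized value.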
Email Validation
import re
def validate_email(email: str) -> bool:
"""Basic structural validation — NOT a substitute for confirmation email."""
if len(email) > 254:
return False
parts = email.split('@')
if len(parts) != 2:
return False
local, domain = parts
if len(local) > 64 or len(local) == 0:  # RFC 5321 limits the local part to 64 octets
return False
if not re.match(r'^[a-zA-Z0-9.-]+$', domain):
return False
if '..' in domain or domain.startswith('.') or domain.endswith('.'):
return False
return True
# Semantic validation: send confirmation email with:
# - Cryptographically random token (>= 32 chars)
# - Time-limited (8 hours max)
# - Single-use
13. Memory Safety — C/C++ and Rust
C/C++ Vulnerability Classes
Buffer Overflow
// VULNERABLE
char buf[64];
strcpy(buf, user_input); // No bounds check
// SAFE
char buf[64];
strncpy(buf, user_input, sizeof(buf) - 1);
buf[sizeof(buf) - 1] = '\0';
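// BETTER (MSVC): _TRUNCATE is a Microsoft extension; C11 Annex K defines strncpy_s but is optional and rarely implemented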
char buf[64];
errno_t err = strncpy_s(buf, sizeof(buf), user_input, _TRUNCATE);
// BEST: use std::string in C++
std::string safe = user_input; // Automatically managed
Use-After-Free
// VULNERABLE
char *ptr = malloc(64);
free(ptr);
printf("%s", ptr); // UAF — undefined behavior
// SAFE: nullify after free
free(ptr);
ptr = NULL;
// BEST: use RAII in C++ (unique_ptr, shared_ptr)
auto ptr = std::make_unique<char[]>(64);
// Automatically freed when ptr goes out of scope
Integer Overflow
// VULNERABLE
size_t total = count * sizeof(struct item); // Can overflow
void *buf = malloc(total);
// SAFE: check before arithmetic
if (count > SIZE_MAX / sizeof(struct item)) {
return ERROR_OVERFLOW;
}
size_t total = count * sizeof(struct item);
// C++ safe alternative
#include <cstdint>   // SIZE_MAX
#include <stdexcept>
size_t safe_multiply(size_t a, size_t b) {
if (a != 0 && b > SIZE_MAX / a) {
throw std::overflow_error("Integer overflow");
}
return a * b;
}
Format String
// VULNERABLE
printf(user_input); // Attacker controls format string — read/write arbitrary memory
// SAFE
printf("%s", user_input); // Always use format specifier
// SAFER: use fputs() for plain string output (no format parsing, no added newline)
fputs(user_input, stdout);
C/C++ Compiler Hardening Flags
# GCC/Clang hardening flags
CFLAGS += -Wall -Wextra -Werror # Treat warnings as errors
CFLAGS += -Wformat=2 -Wformat-security # Format string checks
CFLAGS += -Wstack-protector # Stack protector warnings
CFLAGS += -fstack-protector-strong # Stack canaries
CFLAGS += -fstack-clash-protection # Stack clash protection
CFLAGS += -fcf-protection                  # Control-flow protection (Intel CET, x86 only)
CFLAGS += -D_FORTIFY_SOURCE=2              # Runtime buffer overflow checks (requires -O1 or higher)
CFLAGS += -D_GLIBCXX_ASSERTIONS # C++ stdlib assertions
CFLAGS += -fPIE # Position independent executable
LDFLAGS += -pie # PIE linking
LDFLAGS += -Wl,-z,relro,-z,now # Full RELRO — GOT protection
LDFLAGS += -Wl,-z,noexecstack # Non-executable stack
LDFLAGS += -Wl,-z,separate-code # Separate code segment
C/C++ Static Analysis Tools
| Tool | Type | Key Capability |
|---|---|---|
| clang-tidy | Static | Modernization + security checks |
| cppcheck | Static | Buffer overflows, leaks, UB |
| Coverity | Static (commercial) | Deep path-sensitive analysis |
| ASan (AddressSanitizer) | Dynamic | UAF, buffer overflow, leaks |
| MSan (MemorySanitizer) | Dynamic | Uninitialized memory reads |
| TSan (ThreadSanitizer) | Dynamic | Data races |
| UBSan | Dynamic | Undefined behavior |
| Valgrind | Dynamic | Memory errors, leaks |
Sanitizer usage:
# Compile with sanitizers for testing (NOT production)
clang -fsanitize=address,undefined -fno-omit-frame-pointer -g -o test test.c
# Run — sanitizer reports errors at runtime
./test
Rust Memory Safety
Rust prevents most memory safety bugs at compile time through its ownership and type system, and catches the rest (such as out-of-bounds indexing) with runtime checks. Safe Rust eliminates:
- Buffer overflows (bounds-checked access)
- Use-after-free (ownership + borrowing rules)
- Double-free (single owner)
- Data races (Send/Sync traits + borrow checker)
- Null pointer dereference (Option<T> instead of null)
- Dangling pointers (lifetime system)
Safe Patterns
// Bounds-checked access — panics on out-of-bounds, never corrupts memory
let v = vec![1, 2, 3];
let val = v[index]; // Panics if index >= 3
// Safe alternative — returns Option
let val = v.get(index); // Returns None if out-of-bounds
// Ownership prevents use-after-free
let s = String::from("hello");
let s2 = s; // s is moved — using s after this line is a compile error
// println!("{}", s); // COMPILE ERROR: value borrowed after move
// Borrowing prevents data races
let mut data = vec![1, 2, 3];
let r1 = &data; // Immutable borrow
let r2 = &data; // Multiple immutable borrows OK
// let r3 = &mut data; // COMPILE ERROR: can't mutably borrow while immutably borrowed
// Thread safety
use std::sync::{Arc, Mutex};
let shared = Arc::new(Mutex::new(vec![1, 2, 3]));
let clone = Arc::clone(&shared);
std::thread::spawn(move || {
let mut data = clone.lock().unwrap();
data.push(4);
});
Minimizing unsafe
From the safety-dance project findings:
- Audit every unsafe block; many are unnecessary
- Benchmark before assuming unsafe is faster: safe code is often equally fast or faster (miniz_oxide achieved faster-than-C performance with zero unsafe blocks)
- Use cargo geiger to identify unsafe usage across dependencies
- Use cargo clippy to detect unsafe antipatterns
- When unsafe is required (FFI, raw hardware access), isolate it behind safe abstractions:
// Isolate unsafe behind a safe public API
pub fn safe_wrapper(data: &[u8]) -> Result<Output, Error> {
// Validate preconditions BEFORE entering unsafe
if data.len() < MINIMUM_SIZE {
return Err(Error::TooSmall);
}
// SAFETY: data.len() >= MINIMUM_SIZE verified above, data.as_ptr() is valid
// for data.len() bytes, and ffi_function is assumed to read at most that many
let result = unsafe { ffi_function(data.as_ptr(), data.len()) };
// Validate postconditions AFTER unsafe
if result.is_null() {
return Err(Error::FfiFailure);
}
Ok(Output::from_raw(result))
}
- Document every unsafe block with a // SAFETY: comment explaining the invariant
- Known crate audit results: the safety-dance project found unsound unsafe in spin::RwLock (completely rewritten), libflate (4 unsound blocks), and the image crate
Safe Crate Recommendations
| Need | Crate | Notes |
|---|---|---|
| HTTP client | reqwest | Safe, async, TLS |
| Serialization | serde + serde_json | No unsafe in user-facing API |
| Crypto | ring, rustls | Audited, minimal unsafe |
| HTML sanitization | ammonia | Safe by default |
| Regex | regex | No unsafe, guaranteed linear time |
| Argument parsing | clap | Safe |
| Password hashing | argon2 | Pure-Rust implementation (RustCrypto) |
14. Fuzzing Infrastructure
OSS-Fuzz (Google)
Continuous fuzzing for open-source projects. Key components:
- Fuzzing engines: libFuzzer, AFL++, Honggfuzz
- Sanitizers: ASan, MSan, UBSan, TSan — detect bugs at runtime during fuzzing
- Languages supported: C/C++, Rust, Go, Python, Java/JVM, Swift, JavaScript
Writing Fuzz Targets
C/C++ (libFuzzer)
#include <stdint.h>
#include <stddef.h>
// Entry point — receives random bytes from fuzzer
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size) {
// Call the function under test with fuzz input
parse_input(data, size);
return 0;
}
Compile: clang -fsanitize=fuzzer,address -o fuzz_target fuzz_target.c lib.c
Rust (cargo-fuzz / libFuzzer)
// fuzz/fuzz_targets/fuzz_parser.rs
#![no_main]
use libfuzzer_sys::fuzz_target;
fuzz_target!(|data: &[u8]| {
if let Ok(input) = std::str::from_utf8(data) {
let _ = my_crate::parse(input);
}
});
Run: cargo fuzz run fuzz_parser
Go (native fuzzing, Go 1.18+)
func FuzzParse(f *testing.F) {
f.Add("seed_input")
f.Fuzz(func(t *testing.T, input string) {
result, err := Parse(input)
if err != nil {
return // Expected — parser rejected invalid input
}
// Verify invariants on valid parses
if result.Len() < 0 {
t.Error("negative length")
}
})
}
Run: go test -fuzz=FuzzParse
Python (Atheris)
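import sys
import atheris
# Import the code under test inside instrument_imports() so Atheris
# can collect coverage from it; my_module is a placeholder name
with atheris.instrument_imports():
    import my_module
def test_one_input(data: bytes):
    # FuzzedDataProvider turns raw fuzzer bytes into typed values
    fdp = atheris.FuzzedDataProvider(data)
    input_str = fdp.ConsumeUnicodeNoSurrogates(100)
    my_module.parse(input_str)
atheris.Setup(sys.argv, test_one_input)
atheris.Fuzz()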
Fuzzing Best Practices
- Start with parsing code — highest bug yield
- Provide seed corpus — real-world examples improve coverage
- Enable all sanitizers during fuzzing (ASan + UBSan minimum)
- Fuzz continuously — many bugs require billions of iterations
- Minimize crashing inputs: cargo fuzz tmin or libFuzzer's -minimize_crash=1
- Fuzz untrusted input boundaries: network parsers, file format readers, deserializers
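To make the roles of the seed corpus, the expected-rejection path, and crash collection concrete, here is a toy mutation loop in pure Python. It is not coverage-guided and is no substitute for libFuzzer or Atheris; `mutate`, `fuzz`, and the deliberately buggy `parse_record` target are all hypothetical names invented for this sketch.

```python
import random


def mutate(data: bytes, rng: random.Random) -> bytes:
    """Apply one random bit flip, byte insertion, or byte deletion."""
    buf = bytearray(data)
    choice = rng.randrange(3)
    if choice == 0 and buf:
        buf[rng.randrange(len(buf))] ^= 1 << rng.randrange(8)
    elif choice == 1:
        buf.insert(rng.randrange(len(buf) + 1), rng.randrange(256))
    elif buf:
        del buf[rng.randrange(len(buf))]
    return bytes(buf)


def fuzz(target, seeds, iterations=1000, rng_seed=0):
    """Mutate seeds, call target, and collect unexpected exceptions.

    seeds must be non-empty. ValueError is treated as the target's
    documented way of rejecting bad input; anything else is a finding.
    """
    rng = random.Random(rng_seed)  # fixed seed keeps runs reproducible
    corpus = list(seeds)
    crashes = []
    for _ in range(iterations):
        candidate = mutate(rng.choice(corpus), rng)
        try:
            target(candidate)
        except ValueError:
            continue
        except Exception as exc:
            crashes.append((candidate, repr(exc)))
    return crashes


def parse_record(data: bytes) -> int:
    """Deliberately buggy target: trusts the length byte without checking."""
    if not data:
        raise ValueError("empty record")
    declared_len = data[0]
    payload = data[1:]
    return payload[declared_len - 1]  # IndexError when payload is too short


findings = fuzz(parse_record, seeds=[b"\x03abc"], iterations=200)
```

Starting from one valid seed, a deleted byte is enough to trigger the unchecked index, which illustrates why real-world seeds and parser targets are recommended first. Each saved crashing input would then be minimized and added to a regression corpus.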
Quick Reference Matrix
Vulnerability -> Primary Defense by Language
| Vulnerability | Python | Java | Go | Rust | JavaScript/TS | C# | C/C++ | PHP |
|---|---|---|---|---|---|---|---|---|
| SQLi | DB-API params | PreparedStatement | database/sql params | sqlx bind | pg parameterized | SqlParameter | N/A (use higher-level) | PDO bindParam |
| XSS | html.escape / Jinja2 auto | OWASP Encoder | html/template | askama / ammonia | DOMPurify / textContent | Razor auto | N/A (use higher-level) | htmlspecialchars |
| Command Injection | subprocess (list, no shell) | ProcessBuilder | exec.Command | std::process::Command | child_process.execFile | Process.Start | execvp (no shell) | escapeshellarg |
| Deserialization | json / yaml.safe_load | Class allowlist / Jackson | encoding/json | serde_json | JSON.parse + Zod | TypeNameHandling.None | N/A | json_decode |
| SSRF | URL allowlist + IP check | URL allowlist + IP check | URL allowlist + IP check | URL allowlist + IP check | URL allowlist + IP check | URL allowlist + IP check | URL allowlist + IP check | URL allowlist + IP check |
| XXE | defusedxml | Disable DTD features | Safe by default | Safe by default | Safe (no XML usually) | XmlResolver = null | No NOENT flag | libxml_set_external_entity_loader(null) |
| File Upload | magic bytes + ext allowlist | Apache Tika + ext allowlist | magic bytes + ext allowlist | magic bytes + ext allowlist | multer + magic + ext allowlist | magic bytes + ext allowlist | magic bytes + ext allowlist | finfo + ext allowlist |
| SSTI | SandboxedEnvironment | Sandboxed Freemarker | html/template (logic-less) | N/A (compiled templates) | Logic-less templates | Razor (safe by default) | N/A | Twig sandbox |
| Mass Assignment | Django fields / Pydantic DTO | @InitBinder allowlist | DTO structs | DTO structs | _.pick / Zod | Binding attributes | N/A | $fillable |
| Memory Safety | N/A (managed) | N/A (managed) | N/A (GC + bounds) | Ownership system | N/A (managed) | N/A (managed) | Hardening flags + sanitizers | N/A (managed) |
References
- OWASP Cheat Sheet Series: https://cheatsheetseries.owasp.org/
- OWASP SQL Injection Prevention Cheat Sheet
- OWASP Cross-Site Scripting Prevention Cheat Sheet
- OWASP OS Command Injection Defense Cheat Sheet
- OWASP Deserialization Cheat Sheet
- OWASP SSRF Prevention Cheat Sheet
- OWASP File Upload Cheat Sheet
- OWASP XXE Prevention Cheat Sheet
- OWASP Injection Prevention Cheat Sheet
- OWASP Query Parameterization Cheat Sheet
- OWASP Mass Assignment Cheat Sheet
- OWASP Unvalidated Redirects and Forwards Cheat Sheet
- OWASP Error Handling Cheat Sheet
- OWASP Logging Cheat Sheet
- OWASP Input Validation Cheat Sheet
- OWASP DOM-based XSS Prevention Cheat Sheet
- rust-secure-code/safety-dance: https://github.com/rust-secure-code/safety-dance
- Google OSS-Fuzz: https://github.com/google/oss-fuzz
- pwndbg: https://github.com/pwndbg/pwndbg