Python security patterns

Python’s standard library and common packages have well-known unsafe defaults. Most are documented; some are still encountered in production code years after the safer alternative became available.

YAML parsing

PyYAML’s yaml.load() with no explicit Loader executes arbitrary Python when it encounters !!python/object and related tags. A crafted YAML document can run any code the Python process can run:

import yaml

# unsafe: executes Python via !!python/object tags
data = yaml.load(user_input)

# safe: restricted parser, no code execution
data = yaml.safe_load(user_input)

Since PyYAML 6.0 yaml.load() without an explicit Loader emits a warning. Passing Loader=yaml.FullLoader is not sufficient for untrusted input; SafeLoader (used by yaml.safe_load()) restricts parsing to scalar types and collections.

pickle

Python’s pickle module can execute arbitrary code during deserialization. The __reduce__ method on a class controls how instances are serialized and deserialized; an attacker who controls pickle bytes controls what runs:

import pickle


# unsafe: any code the attacker encodes runs here
def load_from_cache(data: bytes):
    return pickle.loads(data)

Insecure deserialisation covers the attack patterns built on this execution primitive.

JSON carries no executable semantics and is the appropriate format for data that crosses trust boundaries:

import json


def load_from_cache(data: str) -> dict:
    return json.loads(data)

When typed Python objects are needed across a boundary, deserializing from JSON into a Pydantic model gives both structure and validation without execution risk.

subprocess and shell invocation

subprocess.run() with a list does not invoke a shell. Shell metacharacters in user-supplied arguments are treated as literal strings:

import subprocess

# safe: list form, no shell
subprocess.run(["convert", input_path, output_path], check=True, capture_output=True)

# unsafe: shell=True passes the string to /bin/sh
subprocess.run(f"convert {input_path} {output_path}", shell=True)

os.system() and os.popen() always invoke a shell. Both carry the same injection risk as subprocess.run(..., shell=True). When a shell is genuinely required (for pipes or shell built-ins), shlex.quote() wraps each user-controlled value:

import shlex, subprocess

cmd = f"grep {shlex.quote(pattern)} {shlex.quote(logfile)}"
result = subprocess.run(cmd, shell=True, capture_output=True, text=True)

Prefer the list form where the command structure allows it.

Jinja2 template injection

Environment(autoescape=True) is the safe default for HTML templates; it escapes output at render time. The dangerous pattern is passing user-controlled content as a template string rather than as a variable:

from jinja2 import Environment

env = Environment(autoescape=True)

# safe: user_value is a variable, not part of the template
template = env.from_string("Hello, {{ name }}!")
output = template.render(name=user_value)

# unsafe: user controls the template itself; Jinja2 expressions execute before escaping
template = env.from_string(user_supplied_template)
output = template.render()

autoescape=True does not protect against template injection because the payload runs before escaping is applied. The fix is to ensure no user-controlled data reaches from_string() or render_template_string(). The server-side template injection attack page covers the exploitation side.

TLS verification

requests.get(url, verify=False) disables TLS certificate verification entirely. The connection is encrypted, but the application accepts any certificate, including those from a man-in-the-middle. This pattern appears in development to skip certificate errors and tends to stay in production:

import requests

# unsafe: accepts any certificate
response = requests.get(url, verify=False)

# safe: default behaviour verifies the certificate chain
response = requests.get(url)

If the problem is a self-signed or internal CA certificate, the correct fix is to pass the CA bundle path as verify="/path/to/ca-bundle.pem", not to disable verification.

eval and safe alternatives

eval() executes an arbitrary Python expression. ast.literal_eval() evaluates Python literals only (strings, numbers, lists, dicts, tuples, booleans, None) and raises on anything else:

import ast

# unsafe: runs any Python expression
result = eval(user_input)

# safe: literals only, no function calls or imports
result = ast.literal_eval(user_input)

ast.literal_eval() is appropriate for deserializing simple Python-format data (configuration values, structured constants). It is not a general-purpose expression evaluator; for that, a purpose-built library with explicit resource limits is more appropriate.