Security Guide

Complete guide to security features and best practices in sqlo.

Security Features

sqlo implements multiple layers of security to prevent SQL injection attacks:

  1. Automatic parameter binding for all values

  2. Mandatory identifier validation for table and column names

  3. UPDATE/DELETE safety checks to prevent accidental mass operations

  4. Type-safe Raw() expressions with clear documentation

  5. Empty list handling for safe WHERE IN operations


Identifier Validation

Automatic Validation

All table and column names are automatically validated to prevent SQL injection:

from sqlo import Q

# ✅ Valid identifiers are allowed
Q.select("id", "name").from_("users").build()
Q.select("*").from_("mydb.users").build()  # schema.table format

# ❌ Invalid identifiers are rejected
Q.select("*").from_("users; DROP TABLE users;").build()
# Raises: ValueError: Invalid identifier 'users; DROP TABLE users;'

What’s Allowed

Valid identifiers must:

  • Start with a letter or underscore

  • Contain only letters, numbers, underscores, and dots

  • Not contain spaces, special characters, or SQL keywords in vulnerable positions

Examples:

  • users

  • user_accounts

  • mydb.users (schema.table)

  • users.id (table.column)

  • users; DROP TABLE (semicolon)

  • users--comment (SQL comment)

  • users /*comment*/ (SQL comment)

  • users OR 1=1 (SQL injection attempt)

Table Aliases

Table aliases with spaces are supported:

# ✅ Allowed - validates "orders" part
Q.select("*").from_("users u").join("orders o", "u.id = o.user_id").build()

Parameter Binding

Automatic Protection

All values are automatically parameterized to prevent SQL injection:

# ✅ SAFE - value is parameterized
malicious_input = "admin' OR '1'='1"
query = Q.select("*").from_("users").where("email", malicious_input)
sql, params = query.build()
# SQL: SELECT * FROM `users` WHERE `email` = %s
# Params: ("admin' OR '1'='1",)  # Treated as literal string

The malicious SQL is treated as a string value, not as executable SQL code.


Raw SQL Expressions

When to Use Raw()

Use Raw() for SQL expressions that cannot be represented as identifiers or values:

from sqlo import Raw

# ✅ SQL functions
Q.select(Raw("COUNT(*)")).from_("users")
Q.select(Raw("CONCAT(first_name, ' ', last_name)")).from_("users")

# ✅ SQL expressions in WHERE
Q.select("*").from_("users").where(Raw("created_at > NOW() - INTERVAL 7 DAY"))

# ✅ Raw with parameters (still safe!)
Q.select("*").from_("users").where(Raw("age > %s", [18]))

Safety Warning

[!CAUTION] Never use user input directly in Raw() - this bypasses all security protections!

# ❌ DANGEROUS - SQL injection vulnerability!
user_input = request.args.get("condition")
Q.select("*").from_("users").where(Raw(user_input))  # NEVER DO THIS!

# ✅ SAFE - validate/whitelist first, or use parameterized queries
allowed_fields = {"name", "email", "age"}
if field in allowed_fields:
    Q.select("*").from_("users").where(field, user_value)

Type Checking

Raw() requires a string argument:

from sqlo import Raw

# ✅ Valid
Raw("COUNT(*)")

# ❌ Invalid - raises TypeError
Raw(123)  # TypeError: Raw SQL must be a string

UPDATE/DELETE Safety

Mandatory WHERE Clause

To prevent accidental mass operations, UPDATE and DELETE require either:

  1. A WHERE clause, or

  2. Explicit .allow_all_rows() call

# ❌ Raises ValueError - no WHERE clause
Q.update("users").set({"active": False}).build()
# ValueError: UPDATE without WHERE clause would affect all rows

# ✅ With WHERE clause
Q.update("users").set({"active": False}).where("id", 123).build()

# ✅ Explicit confirmation for mass operation
Q.update("users").set({"active": False}).allow_all_rows().build()

The .allow_all_rows() Method

Use this method when you intentionally want to affect all rows:

from sqlo import Q

# Mass update
Q.update("settings").set({"migrated": True}).allow_all_rows().build()

# Mass delete
Q.delete_from("temp_logs").allow_all_rows().build()

# With ORDER BY and LIMIT (still requires allow_all_rows without WHERE)
Q.delete_from("logs").order_by("-created_at").limit(1000).allow_all_rows().build()

Empty List Handling

WHERE IN with Empty Lists

Empty lists in WHERE IN clauses are automatically handled:

# Empty IN list generates FALSE
sql, params = Q.select("*").from_("users").where_in("id", []).build()
# SQL: SELECT * FROM `users` WHERE FALSE
# Params: ()

# Empty NOT IN list generates TRUE
sql, params = Q.select("*").from_("users").where_not_in("id", []).build()
# SQL: SELECT * FROM `users` WHERE TRUE
# Params: ()

This ensures syntactically correct SQL even with dynamic lists.


NULL Value Handling

Correct NULL Checks

Use dedicated methods for NULL checks:

from sqlo import Q, Condition

# ✅ Correct ways to check for NULL
Q.select("*").from_("users").where_null("email")  # Simplest
Q.select("*").from_("users").where(Condition.null("email"))  # Type-safe
Q.select("*").from_("users").where(Condition("email", operator="IS NULL"))

# ❌ WRONG - this checks for the STRING 'NULL', not SQL NULL
Q.select("*").from_("users").where("email", "NULL")
Q.select("*").from_("users").where(Condition("email", "NULL", "IS"))

NULL in INSERT/UPDATE

NULL values in data dictionaries are properly handled:

# ✅ NULL values work correctly
Q.insert_into("users").values({"name": "Alice", "email": None})
# email will be set to NULL

Q.update("users").set({"email": None}).where("id", 123)
# email will be set to NULL

Best Practices

1. Always Validate Dynamic Identifiers

When allowing users to specify table or column names:

from sqlo import Q

ALLOWED_TABLES = {"users", "orders", "products"}
ALLOWED_COLUMNS = {"id", "name", "email", "created_at"}

def dynamic_query(table: str, columns: list[str]):
    # Validate inputs
    if table not in ALLOWED_TABLES:
        raise ValueError(f"Table '{table}' not allowed")
    
    if not all(col in ALLOWED_COLUMNS for col in columns):
        raise ValueError("Invalid column name")
    
    # Safe to use
    return Q.select(*columns).from_(table)

2. Use Parameterized Queries for Values

Never concatenate user input into SQL strings:

# ❌ DANGEROUS
user_email = request.form.get("email")
Q.select("*").from_("users").where(Raw(f"email = '{user_email}'"))

# ✅ SAFE
Q.select("*").from_("users").where("email", user_email)

3. Use Dedicated Methods

Prefer dedicated methods over generic ones:

# Instead of Raw() or complex Conditions
Q.select("*").from_("users").where_null("deleted_at")  # Clear intent
Q.select("*").from_("users").where_in("status", ["active", "pending"])
Q.update("users").set({"active": False}).where("id", 123)

4. Verify Before Mass Operations

Always double-check before calling .allow_all_rows():

def archive_old_data(cutoff_date: str):
    """Archive data older than cutoff_date."""
    # Add confirmation in your application layer
    query = (
        Q.update("orders")
        .set({"archived": True})
        .where("created_at <", cutoff_date)
    )
    # Only use allow_all_rows() if truly necessary
    return query

Security Checklist

When building queries with sqlo:

  • ✅ Table and column names are validated automatically

  • ✅ User values are parameterized automatically

  • Raw() is only used with trusted, hardcoded SQL

  • ✅ UPDATE and DELETE have WHERE clauses or explicit .allow_all_rows()

  • ✅ NULL checks use .where_null() or Condition.null()

  • ✅ Dynamic identifiers (table/column names) are whitelisted

  • ✅ Empty lists in WHERE IN are handled correctly

  • ✅ Type checking is enabled (Raw() rejects non-strings)


Common Vulnerabilities and Prevention

1. Table Name Injection

Vulnerability:

# ❌ Old approach - vulnerable
user_table = request.args.get("table")
Q.select("*").from_(user_table)  # Could be "users; DROP TABLE users;--"

Protection:

# ✅ Now automatically blocked
Q.select("*").from_("users; DROP TABLE users;--")
# Raises: ValueError: Invalid identifier

2. Column Name Injection

Protection:

# Automatically validates all column names
Q.select("id; DROP TABLE users;--").from_("users")
# Raises: ValueError: Invalid identifier

3. WHERE Clause Injection

Protection:

# Values are automatically parameterized
malicious = "1' OR '1'='1"
Q.select("*").from_("users").where("name", malicious)
# Safe - malicious string is treated as a literal value

Additional Resources