Introduction

Reverse engineering is the process of analyzing software to understand how it works, typically when source code is unavailable and only compiled artifacts can be studied. Python is a common target because its bytecode and packaging formats (like .pyc files and PyInstaller bundles) are relatively accessible. This guide introduces practical, ethical techniques for analyzing Python applications, useful for security research, learning, and CTFs.

Important — Ethics & Legal Boundaries

Reverse engineering is powerful but can be illegal or unethical if misused. Always:

Work only on software you own or have explicit permission to analyze.

Use examples for education, security research, or authorized assessments.

Avoid analyzing proprietary or personal-data-containing software without consent.

Why Python?

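Python source code is typically human-readable when available; compiled bytecode (.pyc) can often be decompiled back to near-original source, depending on the Python version and on decompiler support for it.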

Tools exist to extract bytecode from bundled executables (PyInstaller, etc.).

The dis module and code object attributes (co_consts, co_names) make bytecode inspection straightforward.

Core Tools

dis (Python standard library): inspect bytecode.

uncompyle6 / decompyle3: decompile .pyc files to high-level Python. Both target older bytecode (roughly up to Python 3.8), so check the target's Python version first.

pyinstxtractor.py: extract .pyc files from PyInstaller bundles.

strings, hexdump, binwalk: static analysis of raw binaries.

Dynamic tools: pdb, frida, gdb, and ptrace-based tracers such as strace for deeper runtime inspection and hooking.

Isolated environments / VMs / sandboxes: safely run and observe targets.
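When the strings utility is not at hand, its basic behavior is easy to approximate. The following is a minimal sketch, not a replacement for the real tool: the function name extract_strings and the sample bytes are made up for illustration, and it only looks for runs of printable ASCII.

```python
import re


def extract_strings(data, min_len=4):
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


# Hypothetical sample: binary noise around two identifiers and one short run.
sample = b"\x00\x01secret_app\x00\xffhi\x00__main__\x7f"
print(extract_strings(sample))  # ['secret_app', '__main__']
```

The short run "hi" is dropped because it is below the minimum length, which is the same trade-off strings makes to cut down on noise.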

Step-by-step: A Simple Example

Consider this small Python program (our analysis target):

# secret_app.py
def _hidden(x):
    return x * 42

def main():
    s = "secret"
    print(_hidden(len(s)))

if __name__ == "__main__":
    main()

Static analysis — if source is available:

Read the code to identify functions, constants, and logic.

Working with .pyc files:

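Python compiles .py source to bytecode and caches it as a .pyc file. Generate one explicitly with:

python -m py_compile secret_app.py

On Python 3, the result is written to __pycache__/ under a version-tagged name such as secret_app.cpython-38.pyc.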
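The same step can be driven from Python, which also reports where the compiled file was written. This is a small sketch under a few assumptions: it compiles a throwaway copy of the example in a temporary directory, and the helper name compile_and_locate is invented here.

```python
import importlib.util
import os
import py_compile
import tempfile

SOURCE = "def _hidden(x):\n    return x * 42\n"


def compile_and_locate(source_text):
    """Compile source_text in a temp dir and report where the .pyc landed."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "secret_app.py")
        with open(src, "w") as fh:
            fh.write(source_text)
        # py_compile.compile returns the path of the file it wrote.
        pyc_path = py_compile.compile(src, doraise=True)
        # cache_from_source predicts the same path without compiling.
        predicted = importlib.util.cache_from_source(src)
        return (
            os.path.basename(pyc_path),
            pyc_path == predicted,
            os.path.exists(pyc_path),
        )


name, matches_prediction, existed = compile_and_locate(SOURCE)
print(name)  # a version-tagged name; the tag depends on the interpreter
```

importlib.util.cache_from_source is handy on its own when you need to find the cached bytecode for a source file you did not compile yourself.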

Inspect bytecode:

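import dis
import secret_app  # safe to import: main() is guarded by __name__ == "__main__"

# Print one line per instruction: offset, opcode name, and argument.
dis.dis(secret_app._hidden)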

This reveals opcodes and control flow.
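When only the .pyc is available, the code object can be loaded directly instead of importing the source. The sketch below assumes the file was produced by the same interpreter that runs the script and that the header is 16 bytes long (the layout used since Python 3.7); load_code_from_pyc is a name chosen for this example.

```python
import dis
import marshal
import os
import py_compile
import tempfile
import types

SOURCE = '''\
def _hidden(x):
    return x * 42

def main():
    s = "secret"
    print(_hidden(len(s)))
'''


def load_code_from_pyc(pyc_path):
    """Return the module-level code object stored in a .pyc file."""
    with open(pyc_path, "rb") as fh:
        fh.read(16)              # assumption: 16-byte header (Python 3.7+)
        return marshal.load(fh)  # only reads the running interpreter's format


# Build a .pyc from the example program so the sketch is self-contained.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "secret_app.py")
    with open(src, "w") as fh:
        fh.write(SOURCE)
    module_code = load_code_from_pyc(py_compile.compile(src, doraise=True))

# Function bodies are nested code objects inside the module's constants.
nested = {c.co_name: c for c in module_code.co_consts
          if isinstance(c, types.CodeType)}
dis.dis(nested["_hidden"])
```

The marshal documentation warns against unmarshalling data from untrusted sources, so run this kind of loader inside the sandbox or VM when the .pyc comes from an unknown target.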

Decompilation:

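Use uncompyle6 or decompyle3, pointing at the compiled file (under __pycache__/ on Python 3):

uncompyle6 __pycache__/secret_app.cpython-38.pyc

You’ll often get readable Python code, especially for non-obfuscated programs built with a Python version the decompiler supports.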
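Decompilers are sensitive to the bytecode version, so it helps to read the .pyc header before choosing a tool. The following sketch assumes the PEP 552 header layout (4 bytes of magic, then a 4-byte flags field); describe_pyc_header is a hypothetical helper, and mapping the magic number to a specific Python release is left to the tables in CPython's source.

```python
import importlib.util
import struct


def describe_pyc_header(header):
    """Summarize the first 8 bytes of a .pyc header (PEP 552 layout assumed)."""
    magic = header[:4]
    (flags,) = struct.unpack("<I", header[4:8])
    return {
        # The first two bytes are a little-endian version number; the last
        # two bytes of the magic are b"\r\n".
        "magic_number": int.from_bytes(magic[:2], "little"),
        "well_formed": magic[2:4] == b"\r\n",
        "matches_running_interpreter": magic == importlib.util.MAGIC_NUMBER,
        # Bit 0 of the flags marks a hash-based (not timestamp-based) pyc.
        "hash_based": bool(flags & 0x1),
    }


# A header as the running interpreter would write it for a timestamp-based pyc.
own_header = importlib.util.MAGIC_NUMBER + b"\x00" * 12
print(describe_pyc_header(own_header))
```

If matches_running_interpreter is false, loading the file with marshal in the current interpreter is unlikely to work, and a decompiler or interpreter matching the target version is the better route.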

PyInstaller-packed Applications

If the app was packaged with PyInstaller, use pyinstxtractor.py to extract embedded .pyc files:

python pyinstxtractor.py target.exe
# Extracted files land in target.exe_extracted/; look there for .pyc files
uncompyle6 target.exe_extracted/XYZ.pyc > recovered.py
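Before running an extractor, it can be useful to confirm that a binary looks like a PyInstaller bundle at all. The sketch below searches for the archive "cookie" magic that extractors key on. The byte sequence is quoted from memory and should be treated as an assumption to verify against the PyInstaller source; find_pyinstaller_cookie and the synthetic test files are invented for this example.

```python
import os
import tempfile

# Assumption: magic bytes that mark the PyInstaller archive cookie.
PYINSTALLER_MAGIC = b"MEI\x0c\x0b\x0a\x0b\x0e"


def find_pyinstaller_cookie(path):
    """Return the offset of the last cookie magic in the file, or -1."""
    with open(path, "rb") as fh:
        data = fh.read()  # fine for a sketch; large files deserve chunked reads
    return data.rfind(PYINSTALLER_MAGIC)


# Synthetic files stand in for real binaries so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    fake_bundle = os.path.join(tmp, "fake_bundle.bin")
    with open(fake_bundle, "wb") as fh:
        fh.write(b"\x00" * 32 + PYINSTALLER_MAGIC + b"\x00" * 8)
    plain_file = os.path.join(tmp, "plain.bin")
    with open(plain_file, "wb") as fh:
        fh.write(b"no archive here")
    offset = find_pyinstaller_cookie(fake_bundle)
    missing = find_pyinstaller_cookie(plain_file)

print(offset, missing)  # 32 -1
```

A hit only suggests a PyInstaller archive; the extractor still has to parse the cookie and table of contents, and modified or encrypted bundles may not match.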

Dynamic Analysis Tips

Run the program with controlled inputs and observe behavior.

Use pdb for interactive breakpoints: insert import pdb; pdb.set_trace() (or breakpoint() on Python 3.7+), or run the whole script under python -m pdb.

Insert logging or prints if you can modify runtime code (or patch the bytecode).

Use a sandbox or VM to safely monitor filesystem, network, and process activity.

Use runtime hooking (Frida) to intercept calls and inspect in-memory state.
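For pure-Python targets, a lightweight alternative to a full hooking framework is the interpreter's own trace hook. The sketch below uses sys.settrace to record each Python-level call and its arguments; trace_calls is a name made up here, and the two functions are copies of the example target with main returning its value so the result can be checked.

```python
import sys


def trace_calls(func, *args, **kwargs):
    """Run func and record (function name, arguments) for each Python call."""
    calls = []

    def tracer(frame, event, arg):
        if event == "call":
            # At call time, f_locals holds only the bound arguments.
            calls.append((frame.f_code.co_name, dict(frame.f_locals)))
        return None  # no per-line tracing inside the frame

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        result = func(*args, **kwargs)
    finally:
        sys.settrace(previous)  # always restore the earlier hook
    return result, calls


def _hidden(x):
    return x * 42


def main():
    s = "secret"
    return _hidden(len(s))


result, calls = trace_calls(main)
print(result)  # 252
print(calls)   # [('main', {}), ('_hidden', {'x': 6})]
```

Built-in functions such as len run in C and do not produce call events here, which is one reason tools like Frida or gdb are still needed for native code.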

Bytecode-level Hints

Inspect code object internals:

f = secret_app._hidden
print(f.__code__.co_consts)
print(f.__code__.co_names)

dis shows the raw opcodes, which helps when decompilation fails or the code is obfuscated.

Look for string constants and imported module names in co_consts and co_names.
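Function bodies are stored as nested code objects, so a useful scan walks them recursively rather than inspecting one function at a time. The sketch below compiles the example program from a string to stay self-contained; walk_code and collect_strings_and_names are hypothetical helper names.

```python
import types

SOURCE = '''\
def _hidden(x):
    return x * 42

def main():
    s = "secret"
    print(_hidden(len(s)))

if __name__ == "__main__":
    main()
'''


def walk_code(code):
    """Yield a code object and every code object nested in its constants."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from walk_code(const)


def collect_strings_and_names(code):
    """Gather string constants and referenced names across all nested code."""
    strings, names = set(), set()
    for obj in walk_code(code):
        strings.update(c for c in obj.co_consts if isinstance(c, str))
        names.update(obj.co_names)
    return strings, names


module_code = compile(SOURCE, "secret_app.py", "exec")
strings, names = collect_strings_and_names(module_code)
print(sorted(strings))
print(sorted(names))
```

The exact contents vary a little between Python versions, but the literal "secret" and names such as print and len should show up in the output.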

Common Obfuscation Techniques & How to Approach Them

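Identifier renaming (mangling function and variable names): behavior is unchanged, so rename symbols back as you work out what each one does.

String encryption (strings stored encrypted and decoded at runtime): locate the decoding routine, then call or hook it to recover the plaintext.

Runtime code generation (exec/eval): intercept the argument passed to exec or eval to capture the generated code.

Native extensions (Cython, compiled C modules): these require different tools (Ghidra, IDA).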
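For runtime code generation, one approach is to wrap exec temporarily so that whatever the target generates is logged before it runs. The following is a sketch under stated assumptions: capture_exec is an invented helper, the base64 "obfuscated" stub is a toy stand-in for a real target, and code that keeps its own reference to the original exec would bypass the wrapper.

```python
import base64
import builtins
import contextlib
import sys


@contextlib.contextmanager
def capture_exec(log):
    """Temporarily replace builtins.exec with a wrapper that logs its input."""
    original_exec = builtins.exec

    def logging_exec(source, globs=None, locs=None):
        log.append(source)
        if globs is None:
            # Mimic exec's default of running in the caller's namespace.
            caller = sys._getframe(1)
            globs, locs = caller.f_globals, caller.f_locals
        return original_exec(source, globs, locs)

    builtins.exec = logging_exec
    try:
        yield log
    finally:
        builtins.exec = original_exec  # always restore the real exec


# Toy target: hides its logic in a base64 blob and runs it with exec.
PAYLOAD = base64.b64encode(b"result = 6 * 42")


def obfuscated_stub():
    namespace = {}
    exec(base64.b64decode(PAYLOAD).decode(), namespace)
    return namespace["result"]


captured = []
with capture_exec(captured):
    value = obfuscated_stub()

print(value)     # 252
print(captured)  # ['result = 6 * 42']
```

The same idea applies to eval and compile. Because the captured code still executes, this belongs inside the sandbox or VM described earlier.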

Quick Checklist

Is source available? (Search repos, pip caches, etc.)

If binary: extract .pyc (pyinstxtractor or manual).

Try uncompyle6 / decompyle3.

Inspect with dis and code object attributes.

Perform dynamic analysis with pdb, Frida, or sandboxing.

For obfuscation: locate runtime decoding/deobfuscation code and target it.

Advanced Topics (Next Steps)

Native extension analysis (Ghidra/IDA for .so/.pyd files).

Handling Cython or PyOxidizer builds.

Building automation scripts for mass .pyc extraction and scanning.

Using Frida to hook Python C-API functions at runtime.

Resources & Learning Paths

Python docs (dis, inspect).

uncompyle6 and decompyle3 repositories and docs.

PyInstaller internals and pyinstxtractor.

CTF challenges and practice repos for hands-on experience.

Create your own small Python apps, package them, and practice extracting and decompiling.

Conclusion

Python reverse engineering offers an accessible entry point for learning program analysis and security research. Start with small, permissioned targets, learn the tooling (dis, decompilers, extractors), and gradually tackle obfuscated or compiled targets with dynamic analysis techniques.