Open-source forensic imaging & file-carving suite that recovers 40+ formats (JPEG, PNG, PDF, DOCX, MP4, ZIP, WAV, etc.) from raw disks, USB sticks, SD cards or disk images. Creates bit-for-bit copies, SHA-256/MD5/SHA-1 triple hashes, logs chain-of-custody, exports deleted entries via The Sleuth Kit and performs multi-threaded signature-based carving.


🕵️‍♂️ FBI-Level Data-Recovery Toolkit (Community Edition)

License: MIT Python 3.8+ Platform Tests Code style: black


📑 Table of Contents

  1. Quick Peek
  2. What & Why
  3. Real-World Use-Cases
  4. Architecture & Data-Flow Diagrams
  5. Installation
  6. Usage Examples
  7. Code Walk-Through
  8. Pros & Cons
  9. Road-Map
  10. Disclaimer & Legal

Quick Peek

# 1. image a suspect USB stick
sudo python recover.py /dev/sdb --image --carve -o case042/

# 2. review report
open case042/report.html

Recovers 40+ file types, hashes everything, logs chain-of-custody, and produces a ready-to-submit CSV.


What & Why

An open-source, Python-based subset of the forensic imaging & carving stack of the kind used by FBI/INTERPOL labs (file-system parsing shells out to The Sleuth Kit's fls and icat).
It does bit-for-bit imaging, deleted-file reconstruction (TSK), raw file-carving (headers/footers), and SHA-256 hashing, without proprietary black boxes.

| Layer | Public tool we mimic |
|---|---|
| Imaging | dd / ewfacquire |
| File-system parsing | The Sleuth Kit (fls, icat) |
| Carving | PhotoRec signatures |
| Reporting | CSV + SHA-256 |
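
To make the imaging and hashing layers concrete, here is a minimal sketch of a streaming copy that hashes while it writes. It is not taken from imager.py: the function name `image_and_hash` and the 4 MiB chunk size are assumptions made for illustration, and the sketch has no progress bar or bad-sector handling.

```python
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB per read; an arbitrary choice for this sketch


def image_and_hash(source: str, dest: str, chunk_size: int = CHUNK_SIZE) -> str:
    """Copy `source` to `dest` byte-for-byte and return the SHA-256 hex digest.

    Hypothetical helper: the real imager.py may be structured differently.
    """
    digest = hashlib.sha256()
    with open(source, "rb") as src, open(dest, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:  # an empty read means end of device or file
                break
            dst.write(chunk)      # write exactly the bytes that were read
            digest.update(chunk)  # hash the same bytes in the same pass
    return digest.hexdigest()
```

Hashing in the same pass as the copy means the source is read only once, which matters on slow or failing media. Reading a block device such as /dev/sdb this way normally requires root.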

Real-World Use-Cases

  1. Corporate IR: recover ransomware-deleted finance spreadsheets.
  2. Law-enforcement: preview USB before sending to expensive lab.
  3. University lab: teach the forensic pipeline without €5k licenses.
  4. Home user: rescue SD-card wedding photos.
  5. CI/CD security: scan build artifacts for leaked credentials.

Architecture & Data-Flow Diagrams

High-Level Architecture

graph TD
    A([Block Device<br/>/dev/sdb]) -->|1. dd image| B[Forensic Image<br/>SHA-256 hashed]
    B --> C{SleuthKit<br/>fls + icat}
    B --> D[File-Carver<br/>magic bytes]
    C --> E[Deleted Files<br/>report.csv]
    D --> F[Carved Files<br/>/carved/]
    E & F --> G[HTML Report]

DFD-Level 0 (Context)

graph LR
    Investigator -->|device path| S[System]
    S -->|CSV + files| Investigator
    S -->|log| Evidence_Locker

DFD-Level 1 (Decomposed)

graph TD
    subgraph "Imaging Module"
        IM1[Read raw bytes] --> IM2[Write img] --> IM3[Hash img]
    end
    subgraph "File-System Module"
        FS1[fls] --> FS2[inode list] --> FS3[icat export]
    end
    subgraph "Carving Module"
        CA1[Scan chunks] --> CA2[Match sig] --> CA3[Write file]
    end
    IM3 --> FS1
    IM3 --> CA1
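
The File-System Module chain (fls, inode list, icat export) could be sketched as below. This is an illustration rather than the code in tsk_wrapper.py: `DeletedEntry`, `parse_fls_output`, `list_deleted` and `export_inode` are hypothetical names, and the regular expression assumes the usual fls line layout in which a `*` after the type field marks a deleted entry. The flags used are `-r` (recurse), `-d` (deleted entries only) and `-p` (full paths).

```python
import re
import subprocess
from typing import List, NamedTuple


class DeletedEntry(NamedTuple):
    inode: str  # metadata address as printed by fls, e.g. "1234-128-1"
    path: str


# Assumed fls line layout: "r/r * 1234-128-1:<TAB>docs/budget.xlsx"
# An optional leading "+ " depth marker and a "(realloc)" suffix are tolerated.
_FLS_DELETED = re.compile(
    r"^(?:\++\s+)?\S+\s+\*\s+([0-9-]+)(?:\(realloc\))?:\s+(.*)$"
)


def parse_fls_output(text: str) -> List[DeletedEntry]:
    """Return the deleted entries found in fls output, skipping other lines."""
    entries = []
    for line in text.splitlines():
        match = _FLS_DELETED.match(line)
        if match:
            entries.append(DeletedEntry(match.group(1), match.group(2)))
    return entries


def list_deleted(image: str) -> List[DeletedEntry]:
    """Run fls on an image and parse its deleted entries (needs The Sleuth Kit)."""
    result = subprocess.run(
        ["fls", "-r", "-d", "-p", image],
        capture_output=True, text=True, check=True,
    )
    return parse_fls_output(result.stdout)


def export_inode(image: str, inode: str, dest: str) -> None:
    """Write the content of one inode to `dest` using icat."""
    with open(dest, "wb") as out:
        subprocess.run(["icat", image, inode], stdout=out, check=True)
```

Keeping the parser separate from the subprocess call makes it testable on captured output without an image or root access.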

Flow Diagram (CLI journey)

flowchart LR
    Start --> ParseArgs{device?}
    ParseArgs -->|yes| Image[Create image] --> Hash[SHA-256]
    Hash --> TSK[TSK deleted] --> Carve[Raw carve] --> Report[Generate report] --> End

Installation

| OS | One-liner |
|---|---|
| Ubuntu / Debian | `sudo apt install libtsk-dev foremost && pip install -r requirements.txt` |
| macOS | `brew install sleuthkit foremost && pip install -r requirements.txt` |
| Windows | Use WSL2 → Ubuntu instructions above (native build possible but painful) |

Detailed steps
git clone https://github.com/alok-kumar8765/Data_Recovery_Using_Python.git
cd Data_Recovery_Using_Python
python -m venv venv && source venv/bin/activate
pip install -U pip wheel
pip install -r requirements.txt
sudo make install-tools   # optional: copies udev rules, man page

Usage Examples

| Goal | Command |
|---|---|
| Quick deleted-file scan | `sudo python recover.py /dev/sdb` |
| Full imaging + carving | `sudo python recover.py /dev/sdb --image --carve -o case042` |
| Re-scan existing image | `python recover.py disk.img --carve` |
| Windows (WSL) | `python recover.py /mnt/e/disk.img --carve` |
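
The commands above imply a command-line interface roughly like the following argparse sketch. Only the positional source, `--image`, `--carve` and `-o` appear in this README; the long option `--output`, its default value, and the `planned_steps` helper are assumptions added for illustration. The step order follows the CLI flow diagram.

```python
import argparse
from typing import List


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the invocations shown in the table above."""
    parser = argparse.ArgumentParser(
        prog="recover.py",
        description="Image a device, list deleted files, and carve raw data.",
    )
    parser.add_argument("source", help="block device or disk image, e.g. /dev/sdb")
    parser.add_argument("--image", action="store_true",
                        help="create a bit-for-bit image before scanning")
    parser.add_argument("--carve", action="store_true",
                        help="run signature-based carving")
    # "--output" and the default directory name are assumptions of this sketch.
    parser.add_argument("-o", "--output", default="recovery_output",
                        help="directory for the image, carved files and reports")
    return parser


def planned_steps(args: argparse.Namespace) -> List[str]:
    """Return the pipeline stages in the order shown in the flow diagram."""
    steps = []
    if args.image:
        steps.append("image")
    steps.append("tsk")  # the deleted-file scan runs in every example above
    if args.carve:
        steps.append("carve")
    steps.append("report")
    return steps
```

For example, `planned_steps(build_parser().parse_args(["disk.img", "--carve"]))` yields the TSK scan, the carve, and the report, with no imaging step.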

Output tree:

case042/
├── forensic.img
├── forensic.img.sha256
├── report.html
├── sleuthkit/
│   └── sleuthkit.csv
└── carved/
    ├── JPG_0000001234.jpg
    └── PDF_0000005678.pdf
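
A reviewer can re-check the image against its recorded hash at any later point. The sketch below assumes that forensic.img.sha256 uses the common `sha256sum` layout, with the hex digest as the first whitespace-separated field; the README does not specify the sidecar format, so treat that as an assumption. `sha256_of` and `verify_sidecar` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def sha256_of(path: str, chunk_size: int = 1024 * 1024) -> str:
    """Hash a file in fixed-size chunks so large images fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_sidecar(image: str, sidecar: str) -> bool:
    """Compare the image's SHA-256 with the digest stored in the sidecar file.

    Assumes the digest is the first whitespace-separated token in the sidecar.
    """
    fields = Path(sidecar).read_text().split()
    if not fields:
        return False  # empty sidecar: nothing to compare against
    return sha256_of(image) == fields[0].lower()
```

A call such as `verify_sidecar("case042/forensic.img", "case042/forensic.img.sha256")` returns False if the image has changed since it was hashed.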

Code Walk-Through

| File | Purpose |
|---|---|
| recover.py | CLI entry-point, orchestrates imaging → TSK → carving |
| imager.py | Stream copy with progress bar & SHA-256 |
| tsk_wrapper.py | Sub-process wrapper for fls, icat |
| carver.py | Multi-threaded signature scanner |
| signatures.py | 40+ file headers/footers |
| reporter.py | CSV + HTML report generator |
| utils.py | Hashing, human-readable bytes, etc. |
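
As an illustration of what header/footer carving means, here is a single-threaded sketch over an in-memory buffer. It is not the code in carver.py or signatures.py: the two-entry `SIGNATURES` table, the 20 MiB size cap, and the `carve` and `carved_name` helpers are assumptions, and reading the number in names like JPG_0000001234.jpg as a zero-padded byte offset is a guess based on the output tree.

```python
from typing import Dict, Iterator, Tuple

# (header, footer) byte patterns; a real signature table would hold many more.
SIGNATURES: Dict[str, Tuple[bytes, bytes]] = {
    "JPG": (b"\xff\xd8\xff", b"\xff\xd9"),
    "PNG": (b"\x89PNG\r\n\x1a\n", b"IEND\xaeB`\x82"),
}

MAX_FILE_SIZE = 20 * 1024 * 1024  # arbitrary cap on how far to look for a footer


def carve(data: bytes, max_size: int = MAX_FILE_SIZE) -> Iterator[Tuple[str, int, bytes]]:
    """Yield (kind, offset, content) for each header/footer pair found in `data`."""
    for kind, (header, footer) in SIGNATURES.items():
        start = data.find(header)
        while start != -1:
            # Look for the footer only within max_size bytes of the header.
            end = data.find(footer, start + len(header), start + max_size)
            if end != -1:
                end += len(footer)
                yield kind, start, data[start:end]
                start = data.find(header, end)  # resume after the carved file
            else:
                start = data.find(header, start + 1)  # no footer: try next header


def carved_name(kind: str, offset: int) -> str:
    """Build a name such as JPG_0000001234.jpg (offset padding is an assumption)."""
    return f"{kind}_{offset:010d}.{kind.lower()}"
```

A scanner for real images would read the source in overlapping chunks so that signatures spanning a chunk boundary are not missed, and fragmented files cannot be recovered with this approach at all.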

Pros & Cons

| Pros | Cons |
|---|---|
| 100 % open-source | No GUI (CLI only) |
| Cross-platform | Cannot break strong encryption |
| Extensible signatures | No RAID-5/6 rebuild |
| Chain-of-custody logs | Mobile crypto requires extra tools |
| Free for commercial use | SSD TRIM = unrecoverable |

Road-Map

  • GTK GUI for non-tech users
  • Distributed GPU brute-force plug-in (Hashcat bridge)
  • RAID-5 parity-math module
  • Android ADB bridge for live extraction
  • DFIR playbook templates (STIX export)

Disclaimer & Legal

This software is provided for lawful use on devices you own or have explicit written permission to examine.
Unauthorised access may violate the Computer Fraud and Abuse Act (US), the Computer Misuse Act (UK), or similar laws elsewhere.
The authors accept no liability for misuse or data loss.


🀝 Contributing

PRs welcome! Please run black + flake8 and add a test case under tests/.


📜 License

MIT © 2025 Alok Kumar – see LICENSE file.
