MULTISTEGANALYSIS USING CNN AND TRANSFORMER

Detect hidden data across media

StegVision is a Final Year Project system for image, audio, and video steganalysis. It combines transformer image inference, CNN audio analysis, spatial bit-plane statistics, JPEG-frequency evidence, residual texture support, and temporal video aggregation in one forensic API.

AUDIO REALWORLD: v3
IMAGE TRANSFORMER: TFM
VIDEO CALIBRATED: v4b
LIVE API FLOW
POST /predict file=evidence.jpg
OK media_type=image
OK transformer + spatial/frequency evidence
OK P(clean), P(stego), reliability
REPORT decision engine, evidence scores, latency
POST /predict file=clip.mp4
OK sample 32-128 frames (adaptive)
OK aggregate mean, P90, max, temporal artifacts
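
The flow above can be exercised from a script as well as from the website. The following is a minimal sketch using only the Python standard library. The `file` field name and the `/predict` route come from the flow shown above; the base URL `http://localhost:5000`, the helper names `build_multipart` and `predict`, and the assumption that the response body is JSON are illustrative and should be adjusted to the actual deployment.

```python
import json
import urllib.request
import uuid
from pathlib import Path


def build_multipart(field, filename, payload):
    """Encode one file as multipart/form-data; returns (body, content_type)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def predict(path, base_url="http://localhost:5000"):
    """Upload a file to POST /predict and return the parsed report.

    base_url is an assumption; the JSON response shape is not specified here.
    """
    file_path = Path(path)
    body, content_type = build_multipart("file", file_path.name, file_path.read_bytes())
    request = urllib.request.Request(
        f"{base_url}/predict",
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Calling `predict("evidence.jpg")` against a running instance would return the report as a dictionary; the exact keys depend on the deployed API.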
SUPPORTED MEDIA

One API, Three Pipelines

The frontend sends every upload to the same Flask endpoint. The API chooses the right CNN, transformer, or temporal forensic path from the file extension.
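
Extension-based routing of this kind can be sketched in a few lines. The extension sets below mirror the formats listed for each pipeline on this page, with `.jpeg` added as an assumption; the names `PIPELINES` and `media_type_for` are hypothetical and are not taken from the project's code.

```python
from pathlib import Path

# Extensions per pipeline, taken from the supported-media lists on this page.
# ".jpeg" is an assumed alias for ".jpg".
PIPELINES = {
    "image": {".jpg", ".jpeg", ".png", ".webp", ".bmp"},
    "audio": {".wav", ".mp3", ".flac", ".ogg"},
    "video": {".mp4", ".avi", ".mov", ".mkv"},
}


def media_type_for(filename):
    """Return 'image', 'audio', or 'video' based on the file extension."""
    ext = Path(filename).suffix.lower()
    for media_type, extensions in PIPELINES.items():
        if ext in extensions:
            return media_type
    raise ValueError(f"unsupported file extension: {ext or '(none)'}")
```

Routing on the extension alone is simple but trusts the uploaded filename; a stricter service might also inspect the file's magic bytes before dispatching.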

Images

A transformer image backend scores the file, then spatial LSB, JPEG-frequency, and residual texture engines add independent forensic evidence.

JPG, PNG, WEBP, BMP
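
To give a sense of what spatial LSB evidence can look like, here is a minimal sketch of the classical chi-square pairs-of-values test on 8-bit samples. It is a textbook technique chosen for illustration and is not necessarily the statistic the StegVision spatial engine computes; the function name `pov_chi_square` is hypothetical.

```python
from collections import Counter


def pov_chi_square(pixels):
    """Chi-square statistic over value pairs (2k, 2k+1) of 8-bit samples.

    LSB replacement tends to equalise the counts inside each pair, so a
    small statistic relative to the number of pairs is consistent with
    embedding. Returns (statistic, number_of_non_empty_pairs).
    """
    hist = Counter(pixels)
    chi2 = 0.0
    pairs_used = 0
    for k in range(128):
        even, odd = hist[2 * k], hist[2 * k + 1]
        expected = (even + odd) / 2
        if expected > 0:
            chi2 += (even - expected) ** 2 / expected
            pairs_used += 1
    return chi2, pairs_used
```

A statistic like this is only one piece of evidence: natural images can also have balanced pairs, which is one reason to combine it with independent signals rather than rely on it alone.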

Audio

The audio branch uses a CNN over mel, PCM-LSB, and residual feature tensors, then runs SPA/RS forensics, codec profiling, and calibrated CNN fusion before final scoring.

WAV, MP3, FLAC, OGG

Video

OpenCV samples up to 128 frames adaptively. Each frame is scored by the image evidence ensemble, then mean, P90, maximum, support, and temporal artifacts are fused.

MP4, AVI, MOV, MKV
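
The sampling and aggregation steps of the video path can be sketched as follows. The 32 to 128 frame range comes from this page; the adaptive rule itself (one frame per `stride` source frames, clamped to that range) is an assumption, as are the function names. Support and temporal-artifact scoring are left out because their definitions are not given here.

```python
import statistics


def sample_indices(total_frames, min_frames=32, max_frames=128, stride=30):
    """Pick evenly spaced frame indices; the clamp rule is an assumed heuristic."""
    if total_frames <= 0:
        return []
    target = max(min_frames, min(max_frames, total_frames // stride))
    target = min(target, total_frames)  # never ask for more frames than exist
    step = total_frames / target
    return [int(i * step) for i in range(target)]


def percentile(values, q):
    """Percentile with linear interpolation between the closest ranks."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def aggregate(frame_scores):
    """Summarise per-frame P(stego) scores as mean, P90, and maximum."""
    return {
        "mean": statistics.fmean(frame_scores),
        "p90": percentile(frame_scores, 90),
        "max": max(frame_scores),
    }
```

Reporting P90 and the maximum alongside the mean helps when only a few frames carry a payload, since a plain average would dilute those frames.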
PROJECT TEAM

Cybersecurity FYP 2025

Developed by Aroob Mukhtar, Muhammad Madni, and Umar Daraz under the supervision of Dr. Farhan Hassan.

STACK

CNN-Transformer Evidence Stack

The deployment is built around the actual inference path used by the website: transformer image analysis, CNN audio analysis, classical forensic evidence, and a JSON report that explains the decision.

Backend

Flask, ONNX Runtime, PyTorch audio inference, OpenCV frame extraction, and CORS for the static website.

Models

Stegformer transformer ONNX for image/video evidence, AudioStegNet CNN for audio, and deterministic forensic support engines.

Frontend

A static HTML, CSS, and JavaScript frontend reports confidence, evidence scores, reliability, frame charts, and an LSB visualisation, and offers the full report as downloadable JSON.