Elenchus MCP Server - Adversarial verification system for code review
Updated Jan 20, 2026 - TypeScript
AI safety evaluation framework testing LLM epistemic robustness under adversarial self-history manipulation
Benchmark LLM jailbreak resilience across providers with standardized tests, adversarial mode, rich analytics, and a clean Web UI.
Adversarial MCP server benchmark suite for testing tool-calling security, drift detection, and proxy defenses
Investigation into ChatGPT-5 reviewer misalignment: the PDF cited screenshots as evidence, but the assistant denied being able to see them. Includes JSONL and human-readable logs, screenshots, checksums, and video. Highlights structural risks in AI reviewer reliability.
Forensic-style adversarial audit of Google Gemini 2.5 Pro revealing hidden cross-session memory. Includes structured reports, reproducible contracts, SHA-256 checksums, and video evidence of 28-day semantic recall and affective priming. Licensed under CC-BY 4.0.
Analysis of a ChatGPT-5 reviewer failure: speculative reasoning disguised as certainty. Captures how an evidence-only review drifted into hypotheses, later admitted as a review-process failure. Includes logs, checksums, screenshots, and external video.
Extremely hard, multi-turn, open-source-grounded coding evaluations that reliably break every current frontier model (Claude, GPT, Grok, Gemini, Llama, etc.) on numerical stability, zero-allocation, autograd, SIMD, and long-chain correctness.
A multi-agent safety engineering framework that subjects systems to adversarial audit. Orchestrates specialized agents (Engineer, Psychologist, Physicist) to find process risks and human factors.
LLM-powered fuzzing and adversarial testing framework for Solana programs. Generates intelligent attack scenarios, builds real transactions, and reports vulnerabilities with CWE classifications.
A dependency-aware Bayesian belief gate that resists correlated evidence and yields only under true independent verification.
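A minimal sketch of that idea, assuming the gate behaves like a log-odds Bayes update that collapses evidence sharing a common source before updating; the function name, likelihood-ratio inputs, and thresholds below are illustrative, not the repository's API:

```python
import math
from collections import defaultdict

def belief_gate(prior: float, evidence: list[tuple[str, float]],
                min_independent_sources: int = 2,
                threshold: float = 0.95) -> tuple[float, bool]:
    """Illustrative belief gate over (source_id, likelihood_ratio) evidence.

    Items sharing a source_id are collapsed to one effective observation
    (their strongest likelihood ratio), so repeating a correlated signal
    cannot open the gate on its own.
    """
    per_source = defaultdict(list)
    for source_id, lr in evidence:
        per_source[source_id].append(lr)
    effective_lrs = [max(lrs) for lrs in per_source.values()]

    # Standard Bayes update in log-odds space over the de-duplicated evidence.
    log_odds = math.log(prior / (1.0 - prior))
    for lr in effective_lrs:
        log_odds += math.log(lr)
    posterior = 1.0 / (1.0 + math.exp(-log_odds))

    # The gate yields only when belief is high AND enough independent sources agree.
    opens = posterior >= threshold and len(per_source) >= min_independent_sources
    return posterior, opens

# Three repeats of one correlated feed vs. three independent feeds.
print(belief_gate(0.10, [("feedA", 8.0), ("feedA", 8.0), ("feedA", 8.0)]))  # ~0.47, stays shut
print(belief_gate(0.10, [("feedA", 8.0), ("feedB", 8.0), ("feedC", 8.0)]))  # ~0.98, opens
```

With correlated inputs the posterior is capped by a single effective observation, so only genuinely independent sources can push the gate open.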
Adversarial testing and robustness evaluation for the Crucible framework
A governance doctrine for AI systems based on explicit oversight. Externalizes trust and uncertainty into auditable, adversarial, and constrainable layers. A design framework, not an implementation guide.
Generates adversarial pytest tests using an LLM, trying to find edge cases in your Python code.
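A minimal sketch of the usual pattern such tools follow: prompt a model with the target function's source and write the returned cases to a test file. Here `llm_complete` is a stand-in for a real chat-completion client, and `parse_price` and the output path are made-up examples, not this project's API:

```python
import inspect
import pathlib

def llm_complete(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a canned test so the sketch runs."""
    return (
        "import pytest\n"
        "from mymodule import parse_price\n\n"
        "def test_parse_price_empty_string():\n"
        "    with pytest.raises(ValueError):\n"
        "        parse_price('')\n"
    )

def generate_adversarial_tests(func, out_dir: str = "tests") -> pathlib.Path:
    """Ask the model for pytest cases that target edge cases in `func`."""
    prompt = (
        "Write pytest tests that try to break this Python function.\n"
        "Focus on empty inputs, extreme values, unicode, and type confusion.\n\n"
        + inspect.getsource(func)
    )
    path = pathlib.Path(out_dir) / f"test_{func.__name__}_adversarial.py"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(llm_complete(prompt))
    return path

def parse_price(text: str) -> float:
    return float(text.strip().lstrip("$"))

print(generate_adversarial_tests(parse_price))  # tests/test_parse_price_adversarial.py
```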
Independent research on ChatGPT-5 reviewer bias. Documents how the AI carried assumptions across PDF versions (v15→v16), wrongly denying evidence despite instructions. Includes JSONL logs, screenshots, checksums, and video evidence. Author: Priyanshu Kumar.
🔒 Implements a security proxy for the Model Context Protocol, using ensemble anomaly detection to classify requests as benign or attacks for enhanced safety.
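A minimal sketch of what an ensemble classifier at such a proxy might look like, assuming requests are reduced to numeric features and several outlier detectors vote; the features, detectors, and thresholds below are illustrative, not the project's pipeline:

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

# Toy per-request features: [payload_length, n_tool_calls, n_shell_metachars]
benign_train = np.array([
    [120, 1, 0], [200, 2, 0], [80, 1, 1], [150, 1, 0], [300, 3, 1],
    [90, 1, 0], [250, 2, 0], [110, 1, 0], [175, 2, 1], [60, 1, 0],
])

# Ensemble of two anomaly detectors trained only on benign traffic.
iforest = IsolationForest(contamination=0.1, random_state=0).fit(benign_train)
lof = LocalOutlierFactor(n_neighbors=5, novelty=True).fit(benign_train)

def classify_request(features: list[float]) -> str:
    """Flag the request as an attack if either detector marks it anomalous."""
    x = np.array([features])
    votes = [iforest.predict(x)[0], lof.predict(x)[0]]  # 1 = inlier, -1 = outlier
    return "attack" if any(v == -1 for v in votes) else "benign"

print(classify_request([140, 1, 0]))     # resembles the benign baseline
print(classify_request([5000, 25, 40]))  # oversized, tool-spamming request
```

Flagging on any outlier vote keeps the proxy conservative; a stricter majority vote would trade recall for fewer false positives.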
🔒 Simulate adversarial behaviors to test and strengthen MCP defenses without real exploitation or risk, ensuring robust security evaluations.