diff --git a/.env.example b/.env.example index ed36af591..60c3ec58a 100644 --- a/.env.example +++ b/.env.example @@ -2,20 +2,15 @@ # This would be set to the production domain with an env var on deployment -# used by Traefik to transmit traffic and aqcuire TLS certificates - DOMAIN=localhost -# To test the local Traefik config - -# DOMAIN=localhost.tiangolo.com - # Environment: "development", "testing", "staging", "production" ENVIRONMENT=development PROJECT_NAME="Kaapi" STACK_NAME=Kaapi +API_VERSION=0.5.0 #Backend SECRET_KEY=changethis diff --git a/.github/ISSUE_TEMPLATE/enhancement_request.md b/.github/ISSUE_TEMPLATE/enhancement_request.md new file mode 100644 index 000000000..2b120f908 --- /dev/null +++ b/.github/ISSUE_TEMPLATE/enhancement_request.md @@ -0,0 +1,20 @@ +--- +name: Enhancement request +about: Suggest an improvement to existing codebase +title: '' +labels: '' +assignees: '' + +--- + +**Describe the current behavior** +A clear description of how it currently works and what the limitations are. + +**Describe the enhancement you'd like** +A clear and concise description of the improvement you want to see. + +**Why is this enhancement needed?** +Explain the benefits (e.g., performance, usability, maintainability, scalability). + +**Additional context** +Add any other context, metrics, screenshots, or examples about the enhancement here. diff --git a/.github/workflows/cd-production.yml b/.github/workflows/cd-production.yml index 3a1d0edfe..ee2f965c7 100644 --- a/.github/workflows/cd-production.yml +++ b/.github/workflows/cd-production.yml @@ -1,4 +1,4 @@ -name: Deploy AI Platform to ECS Production +name: Deploy Kaapi to ECS Production on: push: diff --git a/.github/workflows/cd-staging.yml b/.github/workflows/cd-staging.yml index 10d9ccde9..44743d660 100644 --- a/.github/workflows/cd-staging.yml +++ b/.github/workflows/cd-staging.yml @@ -1,4 +1,4 @@ -name: Deploy AI Platform to ECS +name: Deploy Kaapi to ECS on: push: diff --git a/.github/workflows/continuous_integration.yml b/.github/workflows/continuous_integration.yml index fb2d9c80d..dac6f791e 100644 --- a/.github/workflows/continuous_integration.yml +++ b/.github/workflows/continuous_integration.yml @@ -1,4 +1,4 @@ -name: AI Platform CI +name: Kaapi CI on: push: @@ -22,7 +22,7 @@ jobs: strategy: matrix: - python-version: ["3.11.7"] + python-version: ["3.12"] redis-version: [6] steps: diff --git a/CLAUDE.md b/CLAUDE.md index 615f19792..53d09c7e9 100644 --- a/CLAUDE.md +++ b/CLAUDE.md @@ -20,8 +20,8 @@ fastapi run --reload app/main.py # Run pre-commit hooks uv run pre-commit run --all-files -# Generate database migration -alembic revision --autogenerate -m 'Description' +# Generate database migration (rev-id should be latest existing revision ID + 1) +alembic revision --autogenerate -m "Description" --rev-id 040 # Seed database with test data uv run python -m app.seed_data.seed_data @@ -90,3 +90,50 @@ The application uses different environment files: - Python 3.11+ with type hints - Pre-commit hooks for linting and formatting + +## Coding Conventions + +### Type Hints + +Always add type hints to all function parameters and return values. + +### Logging Format + +Prefix all log messages with the function name in square brackets. + +```python +logger.info(f"[function_name] Message {mask_string(sensitive_value)}") +``` + +### Database Column Comments + +Use sa_column_kwargs["comment"] to describe database columns, especially when the purpose isn’t obvious. 
This helps non-developers understand column purposes directly from the database schema: + +```python +field_name: int = Field( + foreign_key="table.id", + nullable=False, + ondelete="CASCADE", + sa_column_kwargs={"comment": "What this column represents"} +) +``` + +Prioritize comments for: +- Columns with non-obvious purposes +- Status/type fields (document valid values) +- JSON/metadata columns (describe expected structure) +- Foreign keys (clarify the relationship) + +### Endpoint Documentation + +Load Swagger descriptions from external markdown files instead of inline strings: + +```python +@router.post( + "/endpoint", + description=load_description("domain/action.md"), + response_model=APIResponse[ResponseModel], +) +``` + +Store documentation files in `backend/app/api/docs/{domain}/{action}.md` diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md index 31515b205..26e535cb6 100644 --- a/CONTRIBUTING.md +++ b/CONTRIBUTING.md @@ -1,13 +1,13 @@ -# Contributing to Tech4Dev AI Platform +# Contributing to Kaapi -Thank you for considering contributing to **Tech4Dev AI Platform**! We welcome contributions of all kinds, including bug reports, feature requests, documentation improvements, and code contributions. +Thank you for considering contributing to **Kaapi**! We welcome contributions of all kinds, including bug reports, feature requests, documentation improvements, and code contributions. --- ## πŸ“Œ Getting Started To contribute successfully, you must first set up the project on your local machine. Please follow the instructions outlined in the project's README to configure the repository and begin your contributions. -Before you proceed, **make sure to check the repository's [README](https://github.com/ProjectTech4DevAI/ai-platform/blob/main/backend/README.md) for a comprehensive overview of the project's backend and detailed setup guidelines.** +Before you proceed, **make sure to check the repository's [README](https://github.com/ProjectTech4DevAI/kaapi-backend/blob/main/backend/README.md) for a comprehensive overview of the project's backend and detailed setup guidelines.** --- @@ -17,10 +17,18 @@ Before you proceed, **make sure to check the repository's [README](https://githu 1. Click the **Fork** button on the top right of this repository. 2. Clone your forked repository: ``` -git clone https://github.com/{username}/ai-platform.git -cd ai-platform +git clone https://github.com/{username}/kaapi-backend.git +cd kaapi-backend ``` +### Check for Existing Issues +Before you start working on a contribution: +1. **Check if an issue exists** for the bug or feature you want to work on in the [Issues](https://github.com/ProjectTech4DevAI/kaapi-backend/issues) section. +2. **If no issue exists**, create one first using the available issue templates: + - For bugs: Use the bug report template + - For enhancements: Use the enhancement request template + - For features: Create a feature request issue + ### Create a Branch β€’ Always work in a new branch based on main. β€’ For branch name, follow this convention: ``type/one-line-description`` @@ -30,12 +38,9 @@ cd ai-platform - enhancement - bugfix - feature - - ``` git checkout -b type/one-line-description ``` - ### Make and Test Changes 1. Adhere to the project's established coding style for consistency. 2. Make sure the code adheres to best practices. @@ -60,7 +65,7 @@ git commit -m "one liner for the commit" β€’ For PR name, follow this convention: ``Module Name: One liner of changes`` -β€’ Don't forget to link the PR to the issue if you are solving one.
+β€’ Don't forget to link the PR to the issue. β€’ Push your changes to GitHub: ``` diff --git a/README.md b/README.md index 316d490d8..8bb525c46 100644 --- a/README.md +++ b/README.md @@ -1,11 +1,11 @@ -# AI Platform +# Kaapi [![License: AGPL v3](https://img.shields.io/badge/License-AGPL%20v3-blue.svg)](https://www.gnu.org/licenses/agpl-3.0) -![](https://github.com/ProjectTech4DevAI/ai-platform/workflows/Continuous%20Integration/badge.svg) -[![Code coverage badge](https://img.shields.io/codecov/c/github/ProjectTech4DevAI/ai-platform/staging.svg)](https://codecov.io/gh/ProjectTech4DevAI/ai-platform/branch/staging) -![GitHub issues](https://img.shields.io/github/issues-raw/ProjectTech4DevAI/ai-platform) -[![codebeat badge](https://codebeat.co/badges/dd951390-5f51-4c98-bddc-0b618bdb43fd)](https://codebeat.co/projects/github-com-ProjectTech4DevAI/ai-platform-staging) -[![Commits](https://img.shields.io/github/commit-activity/m/ProjectTech4DevAI/ai-platform)](https://img.shields.io/github/commit-activity/m/ProjectTech4DevAI/ai-platform) +![](https://github.com/ProjectTech4DevAI/kaapi-backend/workflows/Continuous%20Integration/badge.svg) +[![Code coverage badge](https://img.shields.io/codecov/c/github/ProjectTech4DevAI/kaapi-backend/staging.svg)](https://codecov.io/gh/ProjectTech4DevAI/kaapi-backend/branch/staging) +![GitHub issues](https://img.shields.io/github/issues-raw/ProjectTech4DevAI/kaapi-backend) +[![codebeat badge](https://codebeat.co/badges/dd951390-5f51-4c98-bddc-0b618bdb43fd)](https://codebeat.co/projects/github-com-ProjectTech4DevAI/kaapi-backend-staging) +[![Commits](https://img.shields.io/github/commit-activity/m/ProjectTech4DevAI/kaapi-backend)](https://img.shields.io/github/commit-activity/m/ProjectTech4DevAI/kaapi-backend) ## Pre-requisites diff --git a/backend/README.md b/backend/README.md index 0bdbd8c75..27671dafe 100644 --- a/backend/README.md +++ b/backend/README.md @@ -1,4 +1,4 @@ -# FastAPI Project - Backend +# Kaapi - Backend ## Requirements @@ -27,13 +27,15 @@ $ source .venv/bin/activate Make sure your editor is using the correct Python virtual environment, with the interpreter at `backend/.venv/bin/python`. -Modify or add SQLModel models for data and SQL tables in `./backend/app/models.py`, API endpoints in `./backend/app/api/`, CRUD (Create, Read, Update, Delete) utils in `./backend/app/crud.py`. +Modify or add SQLModel models for data and SQL tables in `./backend/app/models/`, API endpoints in `./backend/app/api/`, CRUD (Create, Read, Update, Delete) utils in `./backend/app/crud/`. -## VS Code +## Seed Database (Optional) -There are already configurations in place to run the backend through the VS Code debugger, so that you can use breakpoints, pause and explore variables, etc. +For local development, seed the database with baseline data: -The setup is also already configured so you can run the tests through the VS Code Python tests tab. +```console +$ uv run python -m app.seed_data.seed_data +``` ## Docker Compose Override @@ -91,6 +93,22 @@ Nevertheless, if it doesn't detect a change but a syntax error, it will just sto ## Backend tests +### Setup Test Environment + +Before running tests, create a test environment configuration: + +1. Copy the test environment template: +```console +$ cp .env.test.example .env.test +``` + +2. Update `.env.test` with test-specific settings: + - Set `ENVIRONMENT=testing` + - Configure a separate test database (recommended). Using a separate + database prevents tests from affecting your development data. 
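+
+As an illustration, a minimal `.env.test` might look like the following. The exact variable names should match whatever `.env.test.example` provides; the database name shown here is only a placeholder for a separate test database:
+
+```
+ENVIRONMENT=testing
+# Illustrative value: any database other than your development one
+POSTGRES_DB=app_test
+```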
+ +### Run Tests + To test the backend run: ```console @@ -133,7 +151,7 @@ Make sure you create a "revision" of your models and that you "upgrade" your dat $ docker compose exec backend bash ``` -* Alembic is already configured to import your SQLModel models from `./backend/app/models.py`. +* Alembic is already configured to import your SQLModel models from `./backend/app/models/`. * After changing a model (for example, adding a column), inside the container, create a revision, e.g.: diff --git a/backend/app/api/deps.py b/backend/app/api/deps.py index 73cb77427..9f2c81a62 100644 --- a/backend/app/api/deps.py +++ b/backend/app/api/deps.py @@ -17,8 +17,6 @@ AuthContext, TokenPayload, User, - UserOrganization, - UserProjectOrg, ) @@ -37,113 +35,6 @@ def get_db() -> Generator[Session, None, None]: TokenDep = Annotated[str, Depends(reusable_oauth2)] -def get_current_user( - session: SessionDep, - token: TokenDep, - api_key: Annotated[str, Depends(api_key_header)], -) -> User: - """Authenticate user via API Key first, fallback to JWT token. Returns only User.""" - - if api_key: - api_key_record = api_key_manager.verify(session, api_key) - if not api_key_record: - raise HTTPException(status_code=401, detail="Invalid API Key") - - if not api_key_record.user.is_active: - raise HTTPException(status_code=403, detail="Inactive user") - - return api_key_record.user # Return only User object - - elif token: - try: - payload = jwt.decode( - token, settings.SECRET_KEY, algorithms=[security.ALGORITHM] - ) - token_data = TokenPayload(**payload) - except (InvalidTokenError, ValidationError): - raise HTTPException( - status_code=status.HTTP_403_FORBIDDEN, - detail="Could not validate credentials", - ) - - user = session.get(User, token_data.sub) - if not user: - raise HTTPException(status_code=404, detail="User not found") - if not user.is_active: - raise HTTPException(status_code=403, detail="Inactive user") - - return user # Return only User object - - raise HTTPException(status_code=401, detail="Invalid Authorization format") - - -CurrentUser = Annotated[User, Depends(get_current_user)] - - -def get_current_user_org( - current_user: CurrentUser, session: SessionDep, request: Request -) -> UserOrganization: - """Extend `User` with organization_id if available, otherwise return UserOrganization without it.""" - - organization_id = None - api_key = request.headers.get("X-API-KEY") - if api_key: - api_key_record = api_key_manager.verify(session, api_key) - if api_key_record: - validate_organization(session, api_key_record.organization.id) - organization_id = api_key_record.organization.id - - return UserOrganization( - **current_user.model_dump(), organization_id=organization_id - ) - - -CurrentUserOrg = Annotated[UserOrganization, Depends(get_current_user_org)] - - -def get_current_user_org_project( - current_user: CurrentUser, session: SessionDep, request: Request -) -> UserProjectOrg: - api_key = request.headers.get("X-API-KEY") - organization_id = None - project_id = None - - if api_key: - api_key_record = api_key_manager.verify(session, api_key) - if api_key_record: - validate_organization(session, api_key_record.organization.id) - organization_id = api_key_record.organization.id - project_id = api_key_record.project.id - - else: - raise HTTPException(status_code=401, detail="Invalid API Key") - - return UserProjectOrg( - **current_user.model_dump(), - organization_id=organization_id, - project_id=project_id, - ) - - -CurrentUserOrgProject = Annotated[UserProjectOrg, Depends(get_current_user_org_project)] 
- - -def get_current_active_superuser(current_user: CurrentUser) -> User: - if not current_user.is_superuser: - raise HTTPException( - status_code=403, detail="The user doesn't have enough privileges" - ) - return current_user - - -def get_current_active_superuser_org(current_user: CurrentUserOrg) -> User: - if not current_user.is_superuser: - raise HTTPException( - status_code=403, detail="The user doesn't have enough privileges" - ) - return current_user - - def get_auth_context( session: SessionDep, token: TokenDep, diff --git a/backend/app/api/docs/api_keys/create.md b/backend/app/api/docs/api_keys/create.md new file mode 100644 index 000000000..2c2317b96 --- /dev/null +++ b/backend/app/api/docs/api_keys/create.md @@ -0,0 +1,3 @@ +Create a new API key for programmatic access to the platform. + +The raw API key is returned **only once during creation**. Store it securely as it cannot be retrieved again. Only the key prefix will be visible in subsequent requests for security reasons. diff --git a/backend/app/api/docs/api_keys/delete.md b/backend/app/api/docs/api_keys/delete.md new file mode 100644 index 000000000..3b87b398c --- /dev/null +++ b/backend/app/api/docs/api_keys/delete.md @@ -0,0 +1,3 @@ +Delete an API key by its ID. + +Permanently revokes the API key. Any requests using this key will fail immediately after deletion. diff --git a/backend/app/api/docs/api_keys/list.md b/backend/app/api/docs/api_keys/list.md new file mode 100644 index 000000000..a4678e6f0 --- /dev/null +++ b/backend/app/api/docs/api_keys/list.md @@ -0,0 +1,3 @@ +List all API keys for the current project. + +Returns a paginated list of API keys with key prefix for security. The full key is only shown during creation and cannot be retrieved afterward. diff --git a/backend/app/api/docs/collections/create.md b/backend/app/api/docs/collections/create.md index aef52efaa..c3a5f4400 100644 --- a/backend/app/api/docs/collections/create.md +++ b/backend/app/api/docs/collections/create.md @@ -1,13 +1,9 @@ Setup and configure the document store that is pertinent to the RAG pipeline: -* Make OpenAI - [File](https://platform.openai.com/docs/api-reference/files)'s from - documents stored in the cloud (see the `documents` interface). -* Create an OpenAI [Vector - Store](https://platform.openai.com/docs/api-reference/vector-stores) - based on those file(s). -* [To be deprecated] Attach the Vector Store to an OpenAI +* Create a vector store from the document IDs you received after uploading your + documents through the Documents module. +* [Deprecated] Attach the Vector Store to an OpenAI [Assistant](https://platform.openai.com/docs/api-reference/assistants). Use parameters in the request body relevant to an Assistant to flesh out its configuration. Note that an assistant will only be created when you pass both diff --git a/backend/app/api/docs/collections/delete.md b/backend/app/api/docs/collections/delete.md index c7f0f2c7a..8cb213d51 100644 --- a/backend/app/api/docs/collections/delete.md +++ b/backend/app/api/docs/collections/delete.md @@ -1,6 +1,6 @@ Remove a collection from the platform. This is a two step process: -1. Delete all OpenAI resources that were allocated: file(s), the Vector +1. Delete all resources that were allocated: file(s), the Vector Store, and the Assistant. 2. Delete the collection entry from the kaapi database. 
diff --git a/backend/app/api/docs/collections/info.md b/backend/app/api/docs/collections/info.md index ad862ac52..576046bd0 100644 --- a/backend/app/api/docs/collections/info.md +++ b/backend/app/api/docs/collections/info.md @@ -1,4 +1,4 @@ Retrieve detailed information about a specific collection by its collection id. This endpoint returns the collection object including its project, organization, timestamps, and associated LLM service details (`llm_service_id` and `llm_service_name`). -Additionally, if the `include_docs` flag in the request body is true then you will get a list of document IDs associated with a given collection as well. Note that, documents returned are not only stored by the AI platform, but also by OpenAI. +Additionally, if the `include_docs` flag in the request body is true, you will also get a list of document IDs associated with the given collection. Note that the documents returned are stored not only by Kaapi but also by the vector store provider. diff --git a/backend/app/api/docs/collections/job_info.md b/backend/app/api/docs/collections/job_info.md index ef5589c2c..8ddbf0694 100644 --- a/backend/app/api/docs/collections/job_info.md +++ b/backend/app/api/docs/collections/job_info.md @@ -1,4 +1,4 @@ -Retrieve information about a collection job by the collection job ID. This endpoint provides detailed status and metadata for a specific collection job in the AI platform. It is especially useful for: +Retrieve information about a collection job by the collection job ID. This endpoint provides detailed status and metadata for a specific collection job in Kaapi. It is especially useful for: * Fetching the collection job object, including the collection job ID, the current status, and the associated collection details. diff --git a/backend/app/api/docs/collections/list.md b/backend/app/api/docs/collections/list.md index cabcd7c61..bb28e0b6a 100644 --- a/backend/app/api/docs/collections/list.md +++ b/backend/app/api/docs/collections/list.md @@ -1,6 +1,5 @@ -List _active_ collections -- collections that have been created but -not deleted +List all _active_ collections that have been created and are not deleted. -If a vector store was created - `llm_service_name` and `llm_service_id` in the response denote the name of the vector store (eg. 'openai vector store') and its id. +If a vector store was created, `llm_service_name` and `llm_service_id` in the response denote the name of the vector store (e.g. 'openai vector store') and its ID, respectively. -[To be deprecated] If an assistant was created, `llm_service_name` and `llm_service_id` in the response denotes the name of the model used in the assistant (eg. 'gpt-4o') and assistant id. +[Deprecated] If an assistant was created, `llm_service_name` and `llm_service_id` in the response denote the name of the model used in the assistant (e.g. 'gpt-4o') and the assistant ID, respectively.
diff --git a/backend/app/api/docs/config/create.md b/backend/app/api/docs/config/create.md index d3c8ff15e..fd193024a 100644 --- a/backend/app/api/docs/config/create.md +++ b/backend/app/api/docs/config/create.md @@ -11,7 +11,7 @@ Configurations allow you to store and manage reusable LLM parameters * Provider-agnostic storage - params are passed through to the provider as-is -**Example for the config blob: OpenAI Responses API with File Search** +**Example for the config blob: OpenAI Responses API with File Search -** ```json "config_blob": { diff --git a/backend/app/api/docs/credentials/create.md b/backend/app/api/docs/credentials/create.md new file mode 100644 index 000000000..139a3c85c --- /dev/null +++ b/backend/app/api/docs/credentials/create.md @@ -0,0 +1,3 @@ +Persist new credentials for the current organization and project. + +Credentials are encrypted and stored securely for provider integrations (OpenAI, Langfuse, etc.). Only one credential per provider is allowed per organization-project combination. diff --git a/backend/app/api/docs/credentials/delete_all.md b/backend/app/api/docs/credentials/delete_all.md new file mode 100644 index 000000000..c0eaa8a54 --- /dev/null +++ b/backend/app/api/docs/credentials/delete_all.md @@ -0,0 +1,3 @@ +Delete all credentials for current organization and project. + +Permanently removes all provider credentials from the current organization and project. diff --git a/backend/app/api/docs/credentials/delete_provider.md b/backend/app/api/docs/credentials/delete_provider.md new file mode 100644 index 000000000..fca18ea6b --- /dev/null +++ b/backend/app/api/docs/credentials/delete_provider.md @@ -0,0 +1,3 @@ +Delete credentials for a specific provider. + +Permanently removes credentials for a specific provider from the current organization and project. diff --git a/backend/app/api/docs/credentials/get_provider.md b/backend/app/api/docs/credentials/get_provider.md new file mode 100644 index 000000000..2f3a76920 --- /dev/null +++ b/backend/app/api/docs/credentials/get_provider.md @@ -0,0 +1,3 @@ +Get credentials for a specific provider. + +Retrieves decrypted credentials for a specific provider (e.g., `openai`, `langfuse`) for the current organization and project. diff --git a/backend/app/api/docs/credentials/list.md b/backend/app/api/docs/credentials/list.md new file mode 100644 index 000000000..c660229bc --- /dev/null +++ b/backend/app/api/docs/credentials/list.md @@ -0,0 +1,3 @@ +Get all credentials for current organization and project. + +Returns list of all provider credentials associated with your organization and project. diff --git a/backend/app/api/docs/credentials/update.md b/backend/app/api/docs/credentials/update.md new file mode 100644 index 000000000..0377f0e4b --- /dev/null +++ b/backend/app/api/docs/credentials/update.md @@ -0,0 +1,3 @@ +Update credentials for a specific provider. + +Updates existing provider credentials for the current organization and project. Provider and credential fields must be provided. diff --git a/backend/app/api/docs/documents/delete.md b/backend/app/api/docs/documents/delete.md index e62c95ad9..ff7af99b6 100644 --- a/backend/app/api/docs/documents/delete.md +++ b/backend/app/api/docs/documents/delete.md @@ -1,4 +1,4 @@ -Perform a soft delete of the document. A soft delete makes the +Perform a delete of the document. This makes the document invisible. It does not delete the document from cloud storage or its information from the database. 
diff --git a/backend/app/api/docs/documents/info.md b/backend/app/api/docs/documents/info.md index 527a2308d..c9b4b04dd 100644 --- a/backend/app/api/docs/documents/info.md +++ b/backend/app/api/docs/documents/info.md @@ -1 +1,3 @@ -Retrieve all information about a given document. If you set the ``include_url`` parameter to true, a signed URL will be included in the response, which is a clickable link to access the retrieved document. If you don't set it to true, the URL will not be included in the response. +Retrieve all information about a given document. + +If you set the ``include_url`` parameter to true, a signed URL will be included in the response, which is a clickable link to access the retrieved document. If you don't set it to true, the URL will not be included in the response. diff --git a/backend/app/api/docs/documents/job_info.md b/backend/app/api/docs/documents/job_info.md index c70e42bfa..387623bd4 100644 --- a/backend/app/api/docs/documents/job_info.md +++ b/backend/app/api/docs/documents/job_info.md @@ -1 +1,3 @@ -Get the status and details of a document transformation job. If you set the ``include_url`` parameter to true, a signed URL will be included in the response, which is a clickable link to access the transformed document if the job has been successful. If you don't set it to true, the URL will not be included in the response. +Get the status and details of a document transformation job. + +If you set the ``include_url`` parameter to true, a signed URL will be included in the response, which is a clickable link to access the transformed document if the job has been successful. If you don't set it to true, the URL will not be included in the response. diff --git a/backend/app/api/docs/documents/job_list.md b/backend/app/api/docs/documents/job_list.md index f85ca99ad..1b0a1e44a 100644 --- a/backend/app/api/docs/documents/job_list.md +++ b/backend/app/api/docs/documents/job_list.md @@ -1 +1,3 @@ -Get the status and details of multiple document transformation jobs by IDs. If you set the ``include_url`` parameter to true, a signed URL will be included in the response, which is a clickable link to access the transformed document for successful jobs. If you don't set it to true, the URL will not be included in the response. +Get the status and details of multiple document transformation jobs by IDs. + +If you set the ``include_url`` parameter to true, a signed URL will be included in the response, which is a clickable link to access the transformed document for successful jobs. If you don't set it to true, the URL will not be included in the response. diff --git a/backend/app/api/docs/documents/list.md b/backend/app/api/docs/documents/list.md index 110e931c9..b11d47866 100644 --- a/backend/app/api/docs/documents/list.md +++ b/backend/app/api/docs/documents/list.md @@ -1 +1,3 @@ -List documents uploaded to the AI platform. If you set the ``include_url`` parameter to true, a signed URL will be included in the response, which is a clickable link to access the retrieved documents. If you don't set it to true, the URL will not be included in the response. +List documents uploaded to Kaapi. + +If you set the ``include_url`` parameter to true, a signed URL will be included in the response, which is a clickable link to access the retrieved documents. If you don't set it to true, the URL will not be included in the response. 
diff --git a/backend/app/api/docs/documents/permanent_delete.md b/backend/app/api/docs/documents/permanent_delete.md index b179b1fe7..2a6479803 100644 --- a/backend/app/api/docs/documents/permanent_delete.md +++ b/backend/app/api/docs/documents/permanent_delete.md @@ -1,6 +1,7 @@ This operation marks the document as deleted in the database while retaining its metadata. However, the actual file is permanently deleted from cloud storage (e.g., S3) and cannot be recovered. Only the database record remains for reference purposes. + If the document is part of an active collection, those collections will be deleted using the collections delete interface. Noteably, this means all OpenAI Vector Store's and Assistant's to which this document diff --git a/backend/app/api/docs/documents/upload.md b/backend/app/api/docs/documents/upload.md index cc4ad9bf5..e667015f5 100644 --- a/backend/app/api/docs/documents/upload.md +++ b/backend/app/api/docs/documents/upload.md @@ -1,4 +1,4 @@ -Upload a document to the AI platform. +Upload a document to Kaapi. - If only a file is provided, the document will be uploaded and stored, and its ID will be returned. - If a target format is specified, a transformation job will also be created to transform document into target format in the background. The response will include both the uploaded document details and information about the transformation job. diff --git a/backend/app/api/docs/evaluation/create_evaluation.md b/backend/app/api/docs/evaluation/create_evaluation.md index 313ad0079..b0c2ba236 100644 --- a/backend/app/api/docs/evaluation/create_evaluation.md +++ b/backend/app/api/docs/evaluation/create_evaluation.md @@ -1,80 +1,46 @@ -Start an evaluation using OpenAI Batch API. +Start an evaluation run using the OpenAI Batch API. -This endpoint: -1. Fetches the dataset from database and validates it has Langfuse dataset ID -2. Creates an EvaluationRun record in the database -3. Fetches dataset items from Langfuse -4. Builds JSONL for batch processing (config is used as-is) -5. Creates a batch job via the generic batch infrastructure -6. Returns the evaluation run details with batch_job_id +Evaluations allow you to systematically test LLM configurations against +predefined datasets with automatic progress tracking and result collection. -The batch will be processed asynchronously by Celery Beat (every 60s). -Use GET /evaluations/{evaluation_id} to check progress. +**Key Features:** +* Fetches dataset items from Langfuse and creates batch processing job via OpenAI Batch API +* Asynchronous processing with automatic progress tracking (checks every 60s) +* Supports configuration from direct parameters or existing assistants +* Stores results for comparison and analysis +* Use `GET /evaluations/{evaluation_id}` to monitor progress and retrieve results of evaluation. -## Request Body - -- **dataset_id** (required): ID of the evaluation dataset (from /evaluations/datasets) -- **experiment_name** (required): Name for this evaluation experiment/run -- **config** (optional): Configuration dict that will be used as-is in JSONL generation. 
Can include any OpenAI Responses API parameters like: - - model: str (e.g., "gpt-4o", "gpt-5") - - instructions: str - - tools: list (e.g., [{"type": "file_search", "vector_store_ids": [...]}]) - - reasoning: dict (e.g., {"effort": "low"}) - - text: dict (e.g., {"verbosity": "low"}) - - temperature: float - - include: list (e.g., ["file_search_call.results"]) - - Note: "input" will be added automatically from the dataset -- **assistant_id** (optional): Assistant ID to fetch configuration from. If provided, configuration will be fetched from the assistant in the database. Config can be passed as empty dict {} when using assistant_id. - -## Example with config +**Example: Using Direct Configuration** ```json { - "dataset_id": 123, - "experiment_name": "test_run", - "config": { - "model": "gpt-4.1", - "instructions": "You are a helpful FAQ assistant.", - "tools": [ - { - "type": "file_search", - "vector_store_ids": ["vs_12345"], - "max_num_results": 3 - } - ], - "include": ["file_search_call.results"] - } + "dataset_id": 123, + "experiment_name": "gpt4_file_search_test", + "config": { + "model": "gpt-4o", + "instructions": "You are a helpful FAQ assistant for farmers.", + "tools": [ + { + "type": "file_search", + "vector_store_ids": ["vs_abc123"], + "max_num_results": 5 + } + ], + "temperature": 0.7, + "include": ["file_search_call.results"] + } } ``` -## Example with assistant_id +**Example: Using Existing Assistant** ```json { - "dataset_id": 123, - "experiment_name": "test_run", - "config": {}, - "assistant_id": "asst_xyz" + "dataset_id": 123, + "experiment_name": "production_assistant_eval", + "config": {}, + "assistant_id": "asst_xyz789" } ``` -## Returns - -EvaluationRunPublic with batch details and status: -- id: Evaluation run ID -- run_name: Name of the evaluation run -- dataset_name: Name of the dataset used -- dataset_id: ID of the dataset used -- config: Configuration used for the evaluation -- batch_job_id: ID of the batch job processing this evaluation -- status: Current status (pending, running, completed, failed) -- total_items: Total number of items being evaluated -- completed_items: Number of items completed so far -- results: Evaluation results (when completed) -- error_message: Error message if failed - -## Error Responses - -- **404**: Dataset or assistant not found or not accessible -- **400**: Missing required credentials (OpenAI or Langfuse), dataset missing Langfuse ID, or config missing required fields -- **500**: Failed to configure API clients or start batch evaluation +**Note:** When using `assistant_id`, configuration is fetched from the assistant in the database. You can pass `config` as an empty object `{}`. diff --git a/backend/app/api/docs/evaluation/delete_dataset.md b/backend/app/api/docs/evaluation/delete_dataset.md index 461c30fce..d50802e82 100644 --- a/backend/app/api/docs/evaluation/delete_dataset.md +++ b/backend/app/api/docs/evaluation/delete_dataset.md @@ -1,18 +1,3 @@ Delete a dataset by ID. -This will remove the dataset record from the database. The CSV file in object store (if exists) will remain for audit purposes, but the dataset will no longer be accessible for creating new evaluations. 
- -## Path Parameters - -- **dataset_id**: ID of the dataset to delete - -## Returns - -Success message with deleted dataset details: -- message: Confirmation message -- dataset_id: ID of the deleted dataset - -## Error Responses - -- **404**: Dataset not found or not accessible to your organization/project -- **400**: Dataset cannot be deleted (e.g., has active evaluation runs) +This will remove the dataset record from the database. The CSV file in object store (if exists) will remain there for audit purposes, but the dataset will no longer be accessible for creating new evaluations. diff --git a/backend/app/api/docs/evaluation/get_dataset.md b/backend/app/api/docs/evaluation/get_dataset.md index 02e1e73aa..a1a27276a 100644 --- a/backend/app/api/docs/evaluation/get_dataset.md +++ b/backend/app/api/docs/evaluation/get_dataset.md @@ -1,22 +1,3 @@ Get details of a specific dataset by ID. -Retrieves comprehensive information about a dataset including metadata, object store URL, and Langfuse integration details. - -## Path Parameters - -- **dataset_id**: ID of the dataset to retrieve - -## Returns - -DatasetUploadResponse with dataset details: -- dataset_id: Unique identifier for the dataset -- dataset_name: Name of the dataset (sanitized) -- total_items: Total number of items including duplication -- original_items: Number of original items before duplication -- duplication_factor: Factor by which items were duplicated -- langfuse_dataset_id: ID of the dataset in Langfuse -- object_store_url: URL to the CSV file in object storage - -## Error Responses - -- **404**: Dataset not found or not accessible to your organization/project +Returns comprehensive dataset information including metadata (ID, name, item counts, duplication factor), Langfuse integration details (dataset ID), and the object store URL for the CSV file. diff --git a/backend/app/api/docs/evaluation/get_evaluation.md b/backend/app/api/docs/evaluation/get_evaluation.md index 97b094497..1e3186d2c 100644 --- a/backend/app/api/docs/evaluation/get_evaluation.md +++ b/backend/app/api/docs/evaluation/get_evaluation.md @@ -1,37 +1,12 @@ -Get the current status of a specific evaluation run. +Get the current status and results of a specific evaluation run by the evaluation ID along with some optional query parameters listed below. -Retrieves comprehensive information about an evaluation run including its current processing status, results (if completed), and error details (if failed). +Returns comprehensive evaluation information including processing status, configuration, progress metrics, and detailed scores with Q&A context when requested. You can check this endpoint periodically to get to know the evaluation progress. Evaluations are processed asynchronously with status checks every 60 seconds. -## Path Parameters +**Query Parameters:** +* `get_trace_info` (optional, default: false) - Include Langfuse trace scores with Q&A context. Data is fetched from Langfuse on first request and cached for subsequent calls. Only available for completed evaluations. +* `resync_score` (optional, default: false) - Clear cached scores and re-fetch from Langfuse. Useful when evaluators have been updated. Requires `get_trace_info=true`. -- **evaluation_id**: ID of the evaluation run - -## Query Parameters - -- **get_trace_info** (optional, default: false): If true, fetch and include Langfuse trace scores with Q&A context. On first request, data is fetched from Langfuse and cached in the score column. Subsequent requests return cached data. 
Only available for completed evaluations. - -- **resync_score** (optional, default: false): If true, clear cached scores and re-fetch from Langfuse. Useful when new evaluators have been added or scores have been updated. Requires get_trace_info=true. - -## Returns - -EvaluationRunPublic with current status and results: -- id: Evaluation run ID -- run_name: Name of the evaluation run -- dataset_name: Name of the dataset used -- dataset_id: ID of the dataset used -- config: Configuration used for the evaluation -- batch_job_id: ID of the batch job processing this evaluation -- status: Current status (pending, running, completed, failed) -- total_items: Total number of items being evaluated -- completed_items: Number of items completed so far -- score: Evaluation scores (when get_trace_info=true and status=completed) -- error_message: Error message if failed -- created_at: Timestamp when the evaluation was created -- updated_at: Timestamp when the evaluation was last updated - -## Score Format - -When `get_trace_info=true` and evaluation is completed, the `score` field contains: +**Score Format** (`get_trace_info=true`): ```json { @@ -74,16 +49,8 @@ When `get_trace_info=true` and evaluation is completed, the `score` field contai } ``` -**Notes:** -- Only complete scores are included (scores where all traces have been rated) -- Numeric values are rounded to 2 decimal places -- NUMERIC scores show `avg` and `std` in summary -- CATEGORICAL scores show `distribution` counts in summary - -## Usage - -Use this endpoint to poll for evaluation progress. The evaluation is processed asynchronously by Celery Beat (every 60s), so you should poll periodically to check if the status has changed to "completed" or "failed". - -## Error Responses - -- **404**: Evaluation run not found or not accessible to this organization/project +**Score Details:** +* NUMERIC scores include average (`avg`) and standard deviation (`std`) in summary +* CATEGORICAL scores include distribution counts in summary +* Only complete scores are included (all traces have been rated) +* Numeric values are rounded to 2 decimal places diff --git a/backend/app/api/docs/evaluation/list_datasets.md b/backend/app/api/docs/evaluation/list_datasets.md index bd5576efc..e315db1d0 100644 --- a/backend/app/api/docs/evaluation/list_datasets.md +++ b/backend/app/api/docs/evaluation/list_datasets.md @@ -1,19 +1,3 @@ List all datasets for the current organization and project. -Returns a paginated list of dataset records ordered by most recent first. - -## Query Parameters - -- **limit**: Maximum number of datasets to return (default 50, max 100) -- **offset**: Number of datasets to skip for pagination (default 0) - -## Returns - -List of DatasetUploadResponse objects, each containing: -- dataset_id: Unique identifier for the dataset -- dataset_name: Name of the dataset (sanitized) -- total_items: Total number of items including duplication -- original_items: Number of original items before duplication -- duplication_factor: Factor by which items were duplicated -- langfuse_dataset_id: ID of the dataset in Langfuse -- object_store_url: URL to the CSV file in object storage +Returns a paginated list of datasets ordered by most recent first. Each dataset includes metadata (ID, name, item counts, duplication factor), Langfuse integration details, and object store URL. 
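Evaluation runs, as noted above, are processed asynchronously, so clients typically poll `GET /evaluations/{evaluation_id}` until the run finishes. A minimal polling sketch in Python follows; the base URL, the `data` envelope, and the use of the `X-API-KEY` header are assumptions about the deployed API, not taken from this diff:

```python
import time

import requests  # assumed HTTP client for the consuming application

BASE_URL = "https://kaapi.example.org/api/v1"  # illustrative; deployment-specific
HEADERS = {"X-API-KEY": "raw-key-returned-at-creation"}


def wait_for_evaluation(evaluation_id: int, poll_seconds: int = 60) -> dict:
    """Poll an evaluation run until it reaches a terminal status."""
    while True:
        resp = requests.get(
            f"{BASE_URL}/evaluations/{evaluation_id}", headers=HEADERS, timeout=30
        )
        resp.raise_for_status()
        run = resp.json()["data"]  # assumes the APIResponse success envelope
        if run["status"] in ("completed", "failed"):
            return run
        time.sleep(poll_seconds)  # batches are checked roughly every 60 seconds
```

Once the status is `completed`, a follow-up request with `?get_trace_info=true` returns the cached Langfuse scores described above.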
diff --git a/backend/app/api/docs/evaluation/list_evaluations.md b/backend/app/api/docs/evaluation/list_evaluations.md index 64c667726..24ab51623 100644 --- a/backend/app/api/docs/evaluation/list_evaluations.md +++ b/backend/app/api/docs/evaluation/list_evaluations.md @@ -1,25 +1,3 @@ List all evaluation runs for the current organization and project. -Returns a paginated list of evaluation runs ordered by most recent first. Each evaluation run represents a batch processing job evaluating a dataset against a specific configuration. - -## Query Parameters - -- **limit**: Maximum number of runs to return (default 50) -- **offset**: Number of runs to skip (for pagination, default 0) - -## Returns - -List of EvaluationRunPublic objects, each containing: -- id: Evaluation run ID -- run_name: Name of the evaluation run -- dataset_name: Name of the dataset used -- dataset_id: ID of the dataset used -- config: Configuration used for the evaluation -- batch_job_id: ID of the batch job processing this evaluation -- status: Current status (pending, running, completed, failed) -- total_items: Total number of items being evaluated -- completed_items: Number of items completed so far -- results: Evaluation results (when completed) -- error_message: Error message if failed -- created_at: Timestamp when the evaluation was created -- updated_at: Timestamp when the evaluation was last updated +Returns a paginated list of evaluation runs ordered by most recent first. Each run includes metadata (ID, name, dataset info, timestamps), configuration details, batch job ID, status tracking (pending/running/completed/failed), progress metrics (total/completed items), and results when available. diff --git a/backend/app/api/docs/evaluation/upload_dataset.md b/backend/app/api/docs/evaluation/upload_dataset.md index b73902860..f4dcae356 100644 --- a/backend/app/api/docs/evaluation/upload_dataset.md +++ b/backend/app/api/docs/evaluation/upload_dataset.md @@ -1,42 +1,33 @@ -Upload a CSV file containing Golden Q&A pairs. +Upload a CSV file containing golden Q&A pairs for evaluation. -This endpoint: -1. Sanitizes the dataset name (removes spaces, special characters) -2. Validates and parses the CSV file -3. Uploads CSV to object store (if credentials configured) -4. Uploads dataset to Langfuse (for immediate use) -5. Stores metadata in database +Datasets allow you to store reusable question-answer pairs for systematic LLM testing with automatic validation, duplication for statistical significance, and Langfuse integration. Response includes dataset ID, sanitized name, item counts, Langfuse dataset ID, and object store URL (the cloud storage location where your CSV file is stored). -## Dataset Name +**Key Features:** +* Validates CSV format and required columns (question, answer) +* Automatic dataset name sanitization for Langfuse compatibility +* Optional item duplication for statistical significance (1-5x, default: 1x) +* Uploads to object store and syncs with Langfuse +* Skips rows with missing values automatically -- Will be sanitized for Langfuse compatibility -- Spaces replaced with underscores -- Special characters removed -- Converted to lowercase -- Example: "My Dataset 01!" 
becomes "my_dataset_01" -## CSV Format +**CSV Format Requirements:** +* Required columns: `question`, `answer` +* Additional columns are allowed (will be ignored) +* Missing values in required columns are automatically skipped -- Must contain 'question' and 'answer' columns -- Can have additional columns (will be ignored) -- Missing values in 'question' or 'answer' rows will be skipped -## Duplication Factor +**Dataset Name Sanitization:** -- Minimum: 1 (no duplication) -- Maximum: 5 -- Default: 5 -- Each item in the dataset will be duplicated this many times -- Used to ensure statistical significance in evaluation results +Your dataset name will be automatically sanitized for Langfuse compatibility: +* Spaces β†’ underscores +* Special characters removed +* Converted to lowercase +* Example: `"My Dataset 01!"` β†’ `"my_dataset_01"` -## Example CSV -``` -question,answer -"What is the capital of France?","Paris" -"What is 2+2?","4" -``` +**Duplication Factor:** -## Returns - -DatasetUploadResponse with dataset_id, object_store_url, and Langfuse details (dataset_name in response will be the sanitized version) +Control how many times each Q&A pair is duplicated (1-5x, default: 1x): +* Higher duplication = better statistical significance +* Useful for batch evaluation reliability +* `1` = no duplication (original dataset only) diff --git a/backend/app/api/docs/fine_tuning/retrieve.md b/backend/app/api/docs/fine_tuning/retrieve.md index 8dd93a841..95710eaf5 100644 --- a/backend/app/api/docs/fine_tuning/retrieve.md +++ b/backend/app/api/docs/fine_tuning/retrieve.md @@ -2,4 +2,6 @@ Refreshes the status of a fine-tuning job by retrieving the latest information f If there are any changes in status, fine-tuned model, or error message, the local job record is updated accordingly. Returns the latest state of the job. -OpenAI’s job status is retrieved using their [Fine-tuning Job Retrieve API](https://platform.openai.com/docs/api-reference/fine_tuning/retrieve). +When a job is completed and updated in the database, model evaluation for that fine-tuned model will start automatically. + +OpenAI's job status is retrieved using their [Fine-tuning Job Retrieve API](https://platform.openai.com/docs/api-reference/fine_tuning/retrieve). diff --git a/backend/app/api/docs/llm/llm_call.md b/backend/app/api/docs/llm/llm_call.md index 86513bc06..fec4fbc49 100644 --- a/backend/app/api/docs/llm/llm_call.md +++ b/backend/app/api/docs/llm/llm_call.md @@ -3,7 +3,7 @@ Make an LLM API call using either a stored configuration or an ad-hoc configurat This endpoint initiates an asynchronous LLM call job. The request is queued for processing, and results are delivered via the callback URL when complete. -### Request Fields +### Key Parameters **`query`** (required) - Query parameters for this LLM call: - `input` (required, string, min 1 char): User question/prompt/query @@ -21,11 +21,15 @@ for processing, and results are delivered via the callback URL when complete. 
- **Note**: When using stored configuration, do not include the `blob` field in the request body - **Mode 2: Ad-hoc Configuration** - - `blob` (object): Complete configuration object (see Create Config endpoint documentation for examples) - - `completion` (required): - - `provider` (required, string): Currently only "openai" - - `params` (required, object): Provider-specific parameters (flexible JSON) - - **Note**: When using ad-hoc configuration, do not include `id` and `version` fields + - `blob` (object): Complete configuration object + - `completion` (required, object): Completion configuration + - `provider` (required, string): Provider type - either `"openai"` (Kaapi abstraction) or `"openai-native"` (pass-through) + - `params` (required, object): Parameters structure depends on provider type (see schema for detailed structure) + - **Note** + - When using ad-hoc configuration, do not include `id` and `version` fields + - When using the Kaapi abstraction, parameters that are not supported by the selected provider or model are automatically suppressed. If any parameters are ignored, a list of warnings is included in the metadata.warnings. For example, the GPT-5 model does not support the temperature parameter, so Kaapi will neither throw an error nor pass this parameter to the model; instead, it will return a warning in the metadata.warnings response. + - **Recommendation**: Use stored configs (Mode 1) for production; use ad-hoc configs only for testing/validation + - **Schema**: Check the API schema or examples below for the complete parameter structure for each provider type **`callback_url`** (optional, HTTPS URL): - Webhook endpoint to receive the response @@ -39,4 +43,7 @@ for processing, and results are delivered via the callback URL when complete. - Custom JSON metadata - Passed through unchanged in the response +### Note +- `warnings` list is automatically added in response metadata when using Kaapi configs if any parameters are suppressed or adjusted (e.g., temperature on reasoning models) + --- diff --git a/backend/app/api/docs/model_evaluation/evaluate.md b/backend/app/api/docs/model_evaluation/evaluate.md new file mode 100644 index 000000000..d4c276cec --- /dev/null +++ b/backend/app/api/docs/model_evaluation/evaluate.md @@ -0,0 +1,3 @@ +Start evaluations for one or more fine-tuned models. + +For each fine-tuning job ID provided, this endpoint fetches the fine-tuned model and test data, then queues a background task that runs predictions on the test set and computes evaluation scores (Matthews correlation coefficient). Returns created or active evaluation records. diff --git a/backend/app/api/docs/model_evaluation/get_top_model.md b/backend/app/api/docs/model_evaluation/get_top_model.md new file mode 100644 index 000000000..5b24e6988 --- /dev/null +++ b/backend/app/api/docs/model_evaluation/get_top_model.md @@ -0,0 +1,3 @@ +Get the top-performing model for a specific document. + +Returns the best model trained on the given document, ranked by Matthews correlation coefficient (MCC) across all evaluations. Includes prediction data file URL if available. diff --git a/backend/app/api/docs/model_evaluation/list_by_document.md b/backend/app/api/docs/model_evaluation/list_by_document.md new file mode 100644 index 000000000..2325d6a4e --- /dev/null +++ b/backend/app/api/docs/model_evaluation/list_by_document.md @@ -0,0 +1,3 @@ +Get all model evaluations for a specific document. 
+ +Returns list of all evaluation records for models trained on the given document within the current project, including prediction data file URLs. diff --git a/backend/app/api/docs/onboarding/onboarding.md b/backend/app/api/docs/onboarding/onboarding.md index b6816c60a..58eeb7379 100644 --- a/backend/app/api/docs/onboarding/onboarding.md +++ b/backend/app/api/docs/onboarding/onboarding.md @@ -20,9 +20,14 @@ ## πŸ”‘ Credentials (Optional) - If provided, the given credentials will be **encrypted** and stored as project credentials. -- The `credential` parameter accepts a list of one or more credentials (e.g., an OpenAI key, Langfuse credentials, etc.). +- The `credentials` parameter accepts a list of one or more credentials (e.g., an OpenAI key, Langfuse credentials, etc.). - If omitted, the project will be created **without credentials**. - We’ve also included a list of the providers currently supported by kaapi. + + ### Supported Providers + - **LLM:** openai + - **Observability:** langfuse + ### Example: For sending multiple credentials - ``` "credentials": [ @@ -40,9 +45,6 @@ } ] ``` - ### Supported Providers - - openai - - langfuse --- ## πŸ”„ Transactional Guarantee diff --git a/backend/app/api/docs/openai_conversation/delete.md b/backend/app/api/docs/openai_conversation/delete.md new file mode 100644 index 000000000..905c0b9d2 --- /dev/null +++ b/backend/app/api/docs/openai_conversation/delete.md @@ -0,0 +1,3 @@ +Delete a conversation by its ID. + +Performs a delete by marking the conversation as deleted. The conversation remains in the database but is hidden from listings. diff --git a/backend/app/api/docs/openai_conversation/get.md b/backend/app/api/docs/openai_conversation/get.md new file mode 100644 index 000000000..69b3413ad --- /dev/null +++ b/backend/app/api/docs/openai_conversation/get.md @@ -0,0 +1,3 @@ +Get a single conversation by its ID. + +Returns conversation details for the specified conversation ID within the current project. diff --git a/backend/app/api/docs/openai_conversation/get_by_ancestor_id.md b/backend/app/api/docs/openai_conversation/get_by_ancestor_id.md new file mode 100644 index 000000000..2b1aa1cc0 --- /dev/null +++ b/backend/app/api/docs/openai_conversation/get_by_ancestor_id.md @@ -0,0 +1,3 @@ +Get a conversation by its ancestor response ID. + +Retrieves conversation details using the ancestor response ID for conversation chain lookup. diff --git a/backend/app/api/docs/openai_conversation/get_by_response_id.md b/backend/app/api/docs/openai_conversation/get_by_response_id.md new file mode 100644 index 000000000..d75da7988 --- /dev/null +++ b/backend/app/api/docs/openai_conversation/get_by_response_id.md @@ -0,0 +1,3 @@ +Get a conversation by its OpenAI response ID. + +Retrieves conversation details using the OpenAI Responses API response ID for lookup. diff --git a/backend/app/api/docs/openai_conversation/list.md b/backend/app/api/docs/openai_conversation/list.md new file mode 100644 index 000000000..253cf3af7 --- /dev/null +++ b/backend/app/api/docs/openai_conversation/list.md @@ -0,0 +1,3 @@ +List all conversations in the current project. + +Returns paginated list of conversations with total count metadata for the current project. diff --git a/backend/app/api/docs/openapi_config.py b/backend/app/api/docs/openapi_config.py new file mode 100644 index 000000000..6de881b2d --- /dev/null +++ b/backend/app/api/docs/openapi_config.py @@ -0,0 +1,135 @@ +""" +OpenAPI schema customization for ReDoc documentation. 
+ +This module contains tag metadata and custom OpenAPI extensions +for organizing and enhancing the API documentation. +""" + +# Tag metadata for organizing endpoints in documentation +tags_metadata = [ + { + "name": "Onboarding", + "description": "Getting started with the platform", + }, + { + "name": "Documents", + "description": "Document upload, transformation, and management operations", + }, + { + "name": "Collections", + "description": "Collection creation, deletion, and management for vector stores and assistants", + }, + { + "name": "Config Management", + "description": "Configuration management operations", + }, + { + "name": "LLM", + "description": "Large Language Model inference and interaction endpoints", + }, + { + "name": "Evaluation", + "description": "Dataset upload, running evaluations, listing datasets as well as evaluations", + }, + { + "name": "Fine Tuning", + "description": "Fine tuning LLM for specific use cases by providing labelled dataset", + }, + { + "name": "Model Evaluation", + "description": "Fine tuned model performance evaluation and benchmarking", + }, + { + "name": "Responses", + "description": "OpenAI Responses API integration for managing LLM conversations", + }, + { + "name": "OpenAI Conversations", + "description": "OpenAI conversation management and interaction", + }, + { + "name": "Users", + "description": "User account management and operations", + }, + { + "name": "Organizations", + "description": "Organization management and settings", + }, + { + "name": "Projects", + "description": "Project management operations", + }, + { + "name": "API Keys", + "description": "API key generation and management", + }, + { + "name": "Credentials", + "description": "Credential management and authentication", + }, + {"name": "Login", "description": "User authentication and login operations"}, + { + "name": "Assistants", + "description": "[**Deprecated**] OpenAI Assistant creation and management. This feature will be removed in a future version.", + }, + { + "name": "Threads", + "description": "[**Deprecated**] Conversation thread management for assistants. This feature will be removed in a future version.", + }, +] + +# ReDoc-specific extension: x-tagGroups for hierarchical organization +# This creates collapsible groups in the ReDoc sidebar +tag_groups = [ + {"name": "Get Started", "tags": ["Onboarding"]}, + { + "name": "Capabilities", + "tags": [ + "Documents", + "Collections", + "Config Management", + "LLM", + "Evaluation", + "Fine Tuning", + "Model Evaluation", + "Responses", + "OpenAI Conversations", + "Assistants", + "Threads", + ], + }, + { + "name": "Administration", + "tags": [ + "Users", + "Organizations", + "Projects", + "API Keys", + "Credentials", + "Login", + ], + }, +] + + +def customize_openapi_schema(openapi_schema: dict) -> dict: + """ + Add custom OpenAPI extensions to the schema. 
+ + Args: + openapi_schema: The base OpenAPI schema from FastAPI + + Returns: + The customized OpenAPI schema with x-tagGroups and other extensions + """ + openapi_schema["x-tagGroups"] = tag_groups + deprecated_tags = ["Assistants", "Threads"] + + for _, path_item in openapi_schema.get("paths", {}).items(): + for method, operation in path_item.items(): + if method in ["get", "post", "put", "delete", "patch"]: + operation_tags = operation.get("tags", []) + if any(tag in deprecated_tags for tag in operation_tags): + operation["x-badges"] = [{"name": "Deprecated", "color": "orange"}] + + return openapi_schema diff --git a/backend/app/api/docs/organization/create.md b/backend/app/api/docs/organization/create.md new file mode 100644 index 000000000..7f7e284ad --- /dev/null +++ b/backend/app/api/docs/organization/create.md @@ -0,0 +1,3 @@ +Create a new organization. + +Creates a new organization with the specified name and details. diff --git a/backend/app/api/docs/organization/delete.md b/backend/app/api/docs/organization/delete.md new file mode 100644 index 000000000..e0841c04a --- /dev/null +++ b/backend/app/api/docs/organization/delete.md @@ -0,0 +1,3 @@ +Delete an organization. + +Permanently deletes an organization and all associated data. diff --git a/backend/app/api/docs/organization/get.md b/backend/app/api/docs/organization/get.md new file mode 100644 index 000000000..c64242d3e --- /dev/null +++ b/backend/app/api/docs/organization/get.md @@ -0,0 +1,3 @@ +Get organization details by ID. + +Returns details for a specific organization. diff --git a/backend/app/api/docs/organization/list.md b/backend/app/api/docs/organization/list.md new file mode 100644 index 000000000..95943bab2 --- /dev/null +++ b/backend/app/api/docs/organization/list.md @@ -0,0 +1,3 @@ +List all organizations. + +Returns paginated list of all organizations in the system. diff --git a/backend/app/api/docs/organization/update.md b/backend/app/api/docs/organization/update.md new file mode 100644 index 000000000..388c3eccd --- /dev/null +++ b/backend/app/api/docs/organization/update.md @@ -0,0 +1,3 @@ +Update organization details. + +Updates name and description for an existing organization. diff --git a/backend/app/api/docs/projects/create.md b/backend/app/api/docs/projects/create.md new file mode 100644 index 000000000..b7397baac --- /dev/null +++ b/backend/app/api/docs/projects/create.md @@ -0,0 +1,3 @@ +Create a new project. + +Creates a new project within an organization with the specified name and description. diff --git a/backend/app/api/docs/projects/delete.md b/backend/app/api/docs/projects/delete.md new file mode 100644 index 000000000..8afee4da9 --- /dev/null +++ b/backend/app/api/docs/projects/delete.md @@ -0,0 +1,3 @@ +Delete a project. + +Permanently deletes a project and all associated data including documents, collections, and configurations. diff --git a/backend/app/api/docs/projects/get.md b/backend/app/api/docs/projects/get.md new file mode 100644 index 000000000..69f7e1378 --- /dev/null +++ b/backend/app/api/docs/projects/get.md @@ -0,0 +1,3 @@ +Get project details by ID. + +Returns details for a specific project including name, organization, and description. diff --git a/backend/app/api/docs/projects/list.md b/backend/app/api/docs/projects/list.md new file mode 100644 index 000000000..911f7d1dc --- /dev/null +++ b/backend/app/api/docs/projects/list.md @@ -0,0 +1,3 @@ +List all projects. + +Returns paginated list of all projects across all organizations. 
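The `openapi_config` module shown earlier only defines tag metadata and the schema post-processing hook; this diff does not show how the application consumes it. A typical wiring in `app/main.py` would look roughly like the sketch below (the import path and app setup are assumptions, not part of this change):

```python
from fastapi import FastAPI
from fastapi.openapi.utils import get_openapi

from app.api.docs.openapi_config import customize_openapi_schema, tags_metadata

app = FastAPI(title="Kaapi", openapi_tags=tags_metadata)


def custom_openapi() -> dict:
    """Build the OpenAPI schema once, then apply the ReDoc x-tagGroups extensions."""
    if app.openapi_schema:
        return app.openapi_schema
    schema = get_openapi(
        title=app.title, version=app.version, routes=app.routes, tags=tags_metadata
    )
    app.openapi_schema = customize_openapi_schema(schema)
    return app.openapi_schema


app.openapi = custom_openapi  # override FastAPI's default schema generator
```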
diff --git a/backend/app/api/docs/projects/update.md b/backend/app/api/docs/projects/update.md new file mode 100644 index 000000000..021ae15ce --- /dev/null +++ b/backend/app/api/docs/projects/update.md @@ -0,0 +1,3 @@ +Update project details. + +Updates name and description for an existing project. diff --git a/backend/app/api/docs/responses/create_async.md b/backend/app/api/docs/responses/create_async.md new file mode 100644 index 000000000..f2dabda5c --- /dev/null +++ b/backend/app/api/docs/responses/create_async.md @@ -0,0 +1,3 @@ +Create an asynchronous OpenAI Responses API call. + +Processes requests with background execution. Returns job status immediately and delivers results via callback given in the request body when completed. diff --git a/backend/app/api/docs/responses/create_sync.md b/backend/app/api/docs/responses/create_sync.md new file mode 100644 index 000000000..11b4c7b60 --- /dev/null +++ b/backend/app/api/docs/responses/create_sync.md @@ -0,0 +1,3 @@ +Create a synchronous OpenAI Responses API call. + +Synchronous endpoint for immediate responses with Langfuse tracing integration. Useful for benchmarking and testing. diff --git a/backend/app/api/routes/api_keys.py b/backend/app/api/routes/api_keys.py index d1821a356..723eecc84 100644 --- a/backend/app/api/routes/api_keys.py +++ b/backend/app/api/routes/api_keys.py @@ -4,7 +4,7 @@ from app.api.deps import SessionDep, AuthContextDep from app.crud.api_key import APIKeyCrud from app.models import APIKeyPublic, APIKeyCreateResponse, Message -from app.utils import APIResponse +from app.utils import APIResponse, load_description from app.api.permissions import Permission, require_permission router = APIRouter(prefix="/apikeys", tags=["API Keys"]) @@ -15,6 +15,7 @@ response_model=APIResponse[APIKeyCreateResponse], status_code=201, dependencies=[Depends(require_permission(Permission.SUPERUSER))], + description=load_description("api_keys/create.md"), ) def create_api_key_route( project_id: int, @@ -22,12 +23,6 @@ def create_api_key_route( current_user: AuthContextDep, session: SessionDep, ): - """ - Create a new API key for the project and user, Restricted to Superuser. - - The raw API key is returned only once during creation. - Store it securely as it cannot be retrieved again. - """ api_key_crud = APIKeyCrud(session=session, project_id=project_id) raw_key, api_key = api_key_crud.create( user_id=user_id, @@ -47,6 +42,7 @@ def create_api_key_route( "/", response_model=APIResponse[list[APIKeyPublic]], dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], + description=load_description("api_keys/list.md"), ) def list_api_keys_route( current_user: AuthContextDep, @@ -54,13 +50,7 @@ def list_api_keys_route( skip: int = Query(0, ge=0, description="Number of records to skip"), limit: int = Query(100, ge=1, le=100, description="Maximum records to return"), ): - """ - List all API keys for the current project. - - Returns key prefix for security - the full key is only shown during creation. - Supports pagination via skip and limit parameters. 
- """ - crud = APIKeyCrud(session, current_user.project.id) + crud = APIKeyCrud(session, current_user.project_.id) api_keys = crud.read_all(skip=skip, limit=limit) return APIResponse.success_response(api_keys) @@ -70,16 +60,14 @@ def list_api_keys_route( "/{key_id}", response_model=APIResponse[Message], dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], + description=load_description("api_keys/delete.md"), ) def delete_api_key_route( key_id: UUID, current_user: AuthContextDep, session: SessionDep, ): - """ - Delete an API key by its ID. - """ - api_key_crud = APIKeyCrud(session=session, project_id=current_user.project.id) + api_key_crud = APIKeyCrud(session=session, project_id=current_user.project_.id) api_key_crud.delete(key_id=key_id) return APIResponse.success_response(Message(message="API Key deleted successfully")) diff --git a/backend/app/api/routes/assistants.py b/backend/app/api/routes/assistants.py index 3ed327f2d..24f7a502f 100644 --- a/backend/app/api/routes/assistants.py +++ b/backend/app/api/routes/assistants.py @@ -1,9 +1,8 @@ from typing import Annotated from fastapi import APIRouter, Depends, Path, HTTPException, Query -from sqlmodel import Session -from app.api.deps import get_db, get_current_user_org_project +from app.api.deps import AuthContextDep, SessionDep from app.crud import ( fetch_assistant_from_openai, sync_assistant, @@ -13,7 +12,8 @@ get_assistants_by_project, delete_assistant, ) -from app.models import UserProjectOrg, AssistantCreate, AssistantUpdate, Assistant +from app.models import AssistantCreate, AssistantUpdate, Assistant +from app.api.permissions import Permission, require_permission from app.utils import APIResponse, get_openai_client router = APIRouter(prefix="/assistant", tags=["Assistants"]) @@ -23,71 +23,81 @@ "/{assistant_id}/ingest", response_model=APIResponse[Assistant], status_code=201, + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def ingest_assistant_route( assistant_id: Annotated[str, Path(description="The ID of the assistant to ingest")], - session: Session = Depends(get_db), - current_user: UserProjectOrg = Depends(get_current_user_org_project), + session: SessionDep, + current_user: AuthContextDep, ): """ Ingest an assistant from OpenAI and store it in the platform. """ client = get_openai_client( - session, current_user.organization_id, current_user.project_id + session, current_user.organization_.id, current_user.project_.id ) openai_assistant = fetch_assistant_from_openai(assistant_id, client) assistant = sync_assistant( session=session, - organization_id=current_user.organization_id, - project_id=current_user.project_id, + organization_id=current_user.organization_.id, + project_id=current_user.project_.id, openai_assistant=openai_assistant, ) return APIResponse.success_response(assistant) -@router.post("/", response_model=APIResponse[Assistant], status_code=201) +@router.post( + "/", + response_model=APIResponse[Assistant], + status_code=201, + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], +) def create_assistant_route( assistant_in: AssistantCreate, - session: Session = Depends(get_db), - current_user: UserProjectOrg = Depends(get_current_user_org_project), + session: SessionDep, + current_user: AuthContextDep, ): """ Create a new assistant in the local DB, checking that vector store IDs exist in OpenAI first. 
""" client = get_openai_client( - session, current_user.organization_id, current_user.project_id + session, current_user.organization_.id, current_user.project_.id ) assistant = create_assistant( session=session, openai_client=client, assistant=assistant_in, - project_id=current_user.project_id, - organization_id=current_user.organization_id, + project_id=current_user.project_.id, + organization_id=current_user.organization_.id, ) return APIResponse.success_response(assistant) -@router.patch("/{assistant_id}", response_model=APIResponse[Assistant]) +@router.patch( + "/{assistant_id}", + response_model=APIResponse[Assistant], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], +) def update_assistant_route( assistant_id: Annotated[str, Path(description="Assistant ID to update")], assistant_update: AssistantUpdate, - session: Session = Depends(get_db), - current_user: UserProjectOrg = Depends(get_current_user_org_project), + session: SessionDep, + current_user: AuthContextDep, ): """ Update an existing assistant with provided fields. Supports replacing, adding, or removing vector store IDs. """ client = get_openai_client( - session, current_user.organization_id, current_user.project_id + session, current_user.organization_.id, current_user.project_.id ) updated_assistant = update_assistant( session=session, assistant_id=assistant_id, openai_client=client, - project_id=current_user.project_id, + project_id=current_user.project_.id, assistant_update=assistant_update, ) return APIResponse.success_response(updated_assistant) @@ -97,16 +107,17 @@ def update_assistant_route( "/{assistant_id}", response_model=APIResponse[Assistant], summary="Get a single assistant by its ID", + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def get_assistant_route( - assistant_id: str = Path(..., description="The assistant_id to fetch"), - session: Session = Depends(get_db), - current_user: UserProjectOrg = Depends(get_current_user_org_project), + assistant_id: Annotated[str, Path(description="The assistant_id to fetch")], + session: SessionDep, + current_user: AuthContextDep, ): """ Fetch a single assistant by its assistant_id. """ - assistant = get_assistant_by_id(session, assistant_id, current_user.project_id) + assistant = get_assistant_by_id(session, assistant_id, current_user.project_.id) if not assistant: raise HTTPException( status_code=404, detail=f"Assistant with ID {assistant_id} not found." 
@@ -118,10 +129,11 @@ def get_assistant_route( "/", response_model=APIResponse[list[Assistant]], summary="List all assistants in the current project", + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def list_assistants_route( - session: Session = Depends(get_db), - current_user: UserProjectOrg = Depends(get_current_user_org_project), + session: SessionDep, + current_user: AuthContextDep, skip: int = Query(0, ge=0, description="How many items to skip"), limit: int = Query(100, ge=1, le=100, description="Maximum items to return"), ): @@ -130,16 +142,20 @@ def list_assistants_route( """ assistants = get_assistants_by_project( - session=session, project_id=current_user.project_id, skip=skip, limit=limit + session=session, project_id=current_user.project_.id, skip=skip, limit=limit ) return APIResponse.success_response(assistants) -@router.delete("/{assistant_id}", response_model=APIResponse) +@router.delete( + "/{assistant_id}", + response_model=APIResponse, + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], +) def delete_assistant_route( assistant_id: Annotated[str, Path(description="Assistant ID to delete")], - session: Session = Depends(get_db), - current_user: UserProjectOrg = Depends(get_current_user_org_project), + session: SessionDep, + current_user: AuthContextDep, ): """ Soft delete an assistant by marking it as deleted. @@ -147,7 +163,7 @@ def delete_assistant_route( delete_assistant( session=session, assistant_id=assistant_id, - project_id=current_user.project_id, + project_id=current_user.project_.id, ) return APIResponse.success_response( data={"message": "Assistant deleted successfully."} diff --git a/backend/app/api/routes/collection_job.py b/backend/app/api/routes/collection_job.py index 5636ed8f4..31686c83e 100644 --- a/backend/app/api/routes/collection_job.py +++ b/backend/app/api/routes/collection_job.py @@ -1,11 +1,12 @@ import logging from uuid import UUID -from fastapi import APIRouter +from fastapi import APIRouter, Depends from fastapi import Path as FastPath -from app.api.deps import SessionDep, CurrentUserOrgProject +from app.api.deps import AuthContextDep, SessionDep +from app.api.permissions import Permission, require_permission from app.crud import ( CollectionCrud, CollectionJobCrud, @@ -22,20 +23,21 @@ logger = logging.getLogger(__name__) -router = APIRouter(prefix="/collections", tags=["collections"]) +router = APIRouter(prefix="/collections", tags=["Collections"]) @router.get( "/jobs/{job_id}", description=load_description("collections/job_info.md"), response_model=APIResponse[CollectionJobPublic], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def collection_job_info( session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, job_id: UUID = FastPath(description="Collection job to retrieve"), ): - collection_job_crud = CollectionJobCrud(session, current_user.project_id) + collection_job_crud = CollectionJobCrud(session, current_user.project_.id) collection_job = collection_job_crud.read_one(job_id) job_out = CollectionJobPublic.model_validate(collection_job) @@ -45,7 +47,7 @@ def collection_job_info( collection_job.action_type == CollectionActionType.CREATE and collection_job.status == CollectionJobStatus.SUCCESSFUL ): - collection_crud = CollectionCrud(session, current_user.project_id) + collection_crud = CollectionCrud(session, current_user.project_.id) collection = collection_crud.read_one(collection_job.collection_id) job_out.collection = 
CollectionPublic.model_validate(collection) diff --git a/backend/app/api/routes/collections.py b/backend/app/api/routes/collections.py index ed66cd04d..d19fad31f 100644 --- a/backend/app/api/routes/collections.py +++ b/backend/app/api/routes/collections.py @@ -2,10 +2,11 @@ from uuid import UUID from typing import List -from fastapi import APIRouter, Query, Body +from fastapi import APIRouter, Query, Body, Depends from fastapi import Path as FastPath -from app.api.deps import SessionDep, CurrentUserOrgProject +from app.api.deps import SessionDep, AuthContextDep +from app.api.permissions import Permission, require_permission from app.crud import ( CollectionCrud, CollectionJobCrud, @@ -35,7 +36,7 @@ logger = logging.getLogger(__name__) -router = APIRouter(prefix="/collections", tags=["collections"]) +router = APIRouter(prefix="/collections", tags=["Collections"]) collection_callback_router = APIRouter() @@ -59,12 +60,13 @@ def collection_callback_notification(body: APIResponse[CollectionJobPublic]): "/", description=load_description("collections/list.md"), response_model=APIResponse[List[CollectionPublic]], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def list_collections( session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, ): - collection_crud = CollectionCrud(session, current_user.project_id) + collection_crud = CollectionCrud(session, current_user.project_.id) rows = collection_crud.read_all() return APIResponse.success_response(rows) @@ -75,20 +77,21 @@ def list_collections( description=load_description("collections/create.md"), response_model=APIResponse[CollectionJobImmediatePublic], callbacks=collection_callback_router.routes, + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def create_collection( session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, request: CreationRequest, ): if request.callback_url: validate_callback_url(str(request.callback_url)) - collection_job_crud = CollectionJobCrud(session, current_user.project_id) + collection_job_crud = CollectionJobCrud(session, current_user.project_.id) collection_job = collection_job_crud.create( CollectionJobCreate( action_type=CollectionActionType.CREATE, - project_id=current_user.project_id, + project_id=current_user.project_.id, status=CollectionJobStatus.PENDING, ) ) @@ -102,8 +105,8 @@ def create_collection( db=session, request=request, collection_job_id=collection_job.id, - project_id=current_user.project_id, - organization_id=current_user.organization_id, + project_id=current_user.project_.id, + organization_id=current_user.organization_.id, with_assistant=with_assistant, ) @@ -126,28 +129,29 @@ def create_collection( description=load_description("collections/delete.md"), response_model=APIResponse[CollectionJobImmediatePublic], callbacks=collection_callback_router.routes, + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def delete_collection( session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, collection_id: UUID = FastPath(description="Collection to delete"), request: CallbackRequest | None = Body(default=None), ): if request and request.callback_url: validate_callback_url(str(request.callback_url)) - _ = CollectionCrud(session, current_user.project_id).read_one(collection_id) + _ = CollectionCrud(session, current_user.project_.id).read_one(collection_id) deletion_request = DeletionRequest( collection_id=collection_id, 
callback_url=request.callback_url if request else None, ) - collection_job_crud = CollectionJobCrud(session, current_user.project_id) + collection_job_crud = CollectionJobCrud(session, current_user.project_.id) collection_job = collection_job_crud.create( CollectionJobCreate( action_type=CollectionActionType.DELETE, - project_id=current_user.project_id, + project_id=current_user.project_.id, status=CollectionJobStatus.PENDING, collection_id=collection_id, ) @@ -157,8 +161,8 @@ def delete_collection( db=session, request=deletion_request, collection_job_id=collection_job.id, - project_id=current_user.project_id, - organization_id=current_user.organization_id, + project_id=current_user.project_.id, + organization_id=current_user.organization_.id, ) return APIResponse.success_response( @@ -170,10 +174,11 @@ def delete_collection( "/{collection_id}", description=load_description("collections/info.md"), response_model=APIResponse[CollectionWithDocsPublic], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def collection_info( session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, collection_id: UUID = FastPath(description="Collection to retrieve"), include_docs: bool = Query( True, @@ -182,7 +187,7 @@ def collection_info( skip: int = Query(0, ge=0), limit: int = Query(100, gt=0, le=100), ): - collection_crud = CollectionCrud(session, current_user.project_id) + collection_crud = CollectionCrud(session, current_user.project_.id) collection = collection_crud.read_one(collection_id) collection_with_docs = CollectionWithDocsPublic.model_validate(collection) diff --git a/backend/app/api/routes/config/config.py b/backend/app/api/routes/config/config.py index 18c3ca84e..6d2629442 100644 --- a/backend/app/api/routes/config/config.py +++ b/backend/app/api/routes/config/config.py @@ -33,7 +33,7 @@ def create_config( """ create new config along with initial version """ - config_crud = ConfigCrud(session=session, project_id=current_user.project.id) + config_crud = ConfigCrud(session=session, project_id=current_user.project_.id) config, version = config_crud.create_or_raise(config_create) response = ConfigWithVersion(**config.model_dump(), version=version) @@ -60,7 +60,7 @@ def list_configs( List all configurations for the current project. Ordered by updated_at in descending order. """ - config_crud = ConfigCrud(session=session, project_id=current_user.project.id) + config_crud = ConfigCrud(session=session, project_id=current_user.project_.id) configs = config_crud.read_all(skip=skip, limit=limit) return APIResponse.success_response( data=configs, @@ -82,7 +82,7 @@ def get_config( """ Get a specific configuration by its ID. """ - config_crud = ConfigCrud(session=session, project_id=current_user.project.id) + config_crud = ConfigCrud(session=session, project_id=current_user.project_.id) config = config_crud.exists_or_raise(config_id=config_id) return APIResponse.success_response( data=config, @@ -105,7 +105,7 @@ def update_config( """ Update a specific configuration. """ - config_crud = ConfigCrud(session=session, project_id=current_user.project.id) + config_crud = ConfigCrud(session=session, project_id=current_user.project_.id) config = config_crud.update_or_raise( config_id=config_id, config_update=config_update ) @@ -130,7 +130,7 @@ def delete_config( """ Delete a specific configuration. 
""" - config_crud = ConfigCrud(session=session, project_id=current_user.project.id) + config_crud = ConfigCrud(session=session, project_id=current_user.project_.id) config_crud.delete_or_raise(config_id=config_id) return APIResponse.success_response( diff --git a/backend/app/api/routes/config/version.py b/backend/app/api/routes/config/version.py index 48246dc85..5f3e8626a 100644 --- a/backend/app/api/routes/config/version.py +++ b/backend/app/api/routes/config/version.py @@ -33,7 +33,7 @@ def create_version( The version number is automatically incremented. """ version_crud = ConfigVersionCrud( - session=session, project_id=current_user.project.id, config_id=config_id + session=session, project_id=current_user.project_.id, config_id=config_id ) version = version_crud.create_or_raise(version_create=version_create) @@ -61,7 +61,7 @@ def list_versions( Ordered by version number in descending order. """ version_crud = ConfigVersionCrud( - session=session, project_id=current_user.project.id, config_id=config_id + session=session, project_id=current_user.project_.id, config_id=config_id ) versions = version_crud.read_all( skip=skip, @@ -91,7 +91,7 @@ def get_version( Get a specific version of a config. """ version_crud = ConfigVersionCrud( - session=session, project_id=current_user.project.id, config_id=config_id + session=session, project_id=current_user.project_.id, config_id=config_id ) version = version_crud.exists_or_raise(version_number=version_number) return APIResponse.success_response( @@ -118,7 +118,7 @@ def delete_version( Delete a specific version of a config. """ version_crud = ConfigVersionCrud( - session=session, project_id=current_user.project.id, config_id=config_id + session=session, project_id=current_user.project_.id, config_id=config_id ) version_crud.delete_or_raise(version_number=version_number) diff --git a/backend/app/api/routes/credentials.py b/backend/app/api/routes/credentials.py index e2f43cd85..74e7d9622 100644 --- a/backend/app/api/routes/credentials.py +++ b/backend/app/api/routes/credentials.py @@ -2,7 +2,8 @@ from fastapi import APIRouter, Depends -from app.api.deps import SessionDep, get_current_user_org_project +from app.api.deps import AuthContextDep, SessionDep +from app.api.permissions import Permission, require_permission from app.core.exception_handlers import HTTPException from app.core.providers import validate_provider from app.crud.credentials import ( @@ -13,24 +14,24 @@ set_creds_for_org, update_creds_for_org, ) -from app.models import CredsCreate, CredsPublic, CredsUpdate, UserProjectOrg -from app.utils import APIResponse +from app.models import CredsCreate, CredsPublic, CredsUpdate +from app.utils import APIResponse, load_description logger = logging.getLogger(__name__) -router = APIRouter(prefix="/credentials", tags=["credentials"]) +router = APIRouter(prefix="/credentials", tags=["Credentials"]) @router.post( "/", response_model=APIResponse[list[CredsPublic]], - summary="Create new credentials for the current organization and project", - description="Creates new credentials for the caller's organization and project. Each organization can have different credentials for different providers and projects. 
Only one credential per provider is allowed per organization-project combination.", + description=load_description("credentials/create.md"), + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def create_new_credential( *, session: SessionDep, creds_in: CredsCreate, - _current_user: UserProjectOrg = Depends(get_current_user_org_project), + _current_user: AuthContextDep, ): # Project comes from API key context; no cross-org check needed here # Database unique constraint ensures no duplicate credentials per provider-org-project combination @@ -38,12 +39,12 @@ def create_new_credential( created_creds = set_creds_for_org( session=session, creds_add=creds_in, - organization_id=_current_user.organization_id, - project_id=_current_user.project_id, + organization_id=_current_user.organization_.id, + project_id=_current_user.project_.id, ) if not created_creds: logger.error( - f"[create_new_credential] Failed to create credentials | organization_id: {_current_user.organization_id}, project_id: {_current_user.project_id}" + f"[create_new_credential] Failed to create credentials | organization_id: {_current_user.organization_.id}, project_id: {_current_user.project_.id}" ) raise HTTPException(status_code=500, detail="Failed to create credentials") @@ -53,18 +54,18 @@ def create_new_credential( @router.get( "/", response_model=APIResponse[list[CredsPublic]], - summary="Get all credentials for current org and project", - description="Retrieves all provider credentials associated with the caller's organization and project.", + description=load_description("credentials/list.md"), + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def read_credential( *, session: SessionDep, - _current_user: UserProjectOrg = Depends(get_current_user_org_project), + _current_user: AuthContextDep, ): creds = get_creds_by_org( session=session, - org_id=_current_user.organization_id, - project_id=_current_user.project_id, + org_id=_current_user.organization_.id, + project_id=_current_user.project_.id, ) if not creds: raise HTTPException(status_code=404, detail="Credentials not found") @@ -75,21 +76,21 @@ def read_credential( @router.get( "/provider/{provider}", response_model=APIResponse[dict], - summary="Get specific provider credentials for current org and project", - description="Retrieves credentials for a specific provider (e.g., 'openai', 'anthropic') for the caller's organization and project.", + description=load_description("credentials/get_provider.md"), + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def read_provider_credential( *, session: SessionDep, provider: str, - _current_user: UserProjectOrg = Depends(get_current_user_org_project), + _current_user: AuthContextDep, ): provider_enum = validate_provider(provider) credential = get_provider_credential( session=session, - org_id=_current_user.organization_id, + org_id=_current_user.organization_.id, provider=provider_enum, - project_id=_current_user.project_id, + project_id=_current_user.project_.id, ) if credential is None: raise HTTPException(status_code=404, detail="Provider credentials not found") @@ -100,18 +101,18 @@ def read_provider_credential( @router.patch( "/", response_model=APIResponse[list[CredsPublic]], - summary="Update credentials for current org and project", - description="Updates credentials for a specific provider of the caller's organization and project.", + description=load_description("credentials/update.md"), + 
dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def update_credential( *, session: SessionDep, creds_in: CredsUpdate, - _current_user: UserProjectOrg = Depends(get_current_user_org_project), + _current_user: AuthContextDep, ): if not creds_in or not creds_in.provider or not creds_in.credential: logger.error( - f"[update_credential] Invalid input | organization_id: {_current_user.organization_id}, project_id: {_current_user.project_id}" + f"[update_credential] Invalid input | organization_id: {_current_user.organization_.id}, project_id: {_current_user.project_.id}" ) raise HTTPException( status_code=400, detail="Provider and credential must be provided" @@ -120,9 +121,9 @@ def update_credential( # Pass project_id directly to the CRUD function since CredsUpdate no longer has this field updated_credential = update_creds_for_org( session=session, - org_id=_current_user.organization_id, + org_id=_current_user.organization_.id, creds_in=creds_in, - project_id=_current_user.project_id, + project_id=_current_user.project_.id, ) return APIResponse.success_response( @@ -133,20 +134,21 @@ def update_credential( @router.delete( "/provider/{provider}", response_model=APIResponse[dict], - summary="Delete specific provider credentials for current org and project", + description=load_description("credentials/delete_provider.md"), + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def delete_provider_credential( *, session: SessionDep, provider: str, - _current_user: UserProjectOrg = Depends(get_current_user_org_project), + _current_user: AuthContextDep, ): provider_enum = validate_provider(provider) remove_provider_credential( session=session, - org_id=_current_user.organization_id, + org_id=_current_user.organization_.id, provider=provider_enum, - project_id=_current_user.project_id, + project_id=_current_user.project_.id, ) return APIResponse.success_response( @@ -157,18 +159,18 @@ def delete_provider_credential( @router.delete( "/", response_model=APIResponse[dict], - summary="Delete all credentials for current org and project", - description="Removes all credentials for the caller's organization and project. 
This is a hard delete operation that permanently removes credentials from the database.", + description=load_description("credentials/delete_all.md"), + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def delete_all_credentials( *, session: SessionDep, - _current_user: UserProjectOrg = Depends(get_current_user_org_project), + _current_user: AuthContextDep, ): remove_creds_for_org( session=session, - org_id=_current_user.organization_id, - project_id=_current_user.project_id, + org_id=_current_user.organization_.id, + project_id=_current_user.project_.id, ) return APIResponse.success_response( diff --git a/backend/app/api/routes/cron.py b/backend/app/api/routes/cron.py index a9e7b66ed..f04e11885 100644 --- a/backend/app/api/routes/cron.py +++ b/backend/app/api/routes/cron.py @@ -10,7 +10,7 @@ logger = logging.getLogger(__name__) -router = APIRouter(tags=["cron"]) +router = APIRouter(tags=["Cron"]) @router.get( diff --git a/backend/app/api/routes/doc_transformation_job.py b/backend/app/api/routes/doc_transformation_job.py index dd5f98827..7af491af9 100644 --- a/backend/app/api/routes/doc_transformation_job.py +++ b/backend/app/api/routes/doc_transformation_job.py @@ -1,9 +1,10 @@ from uuid import UUID import logging -from fastapi import APIRouter, Query, Path +from fastapi import APIRouter, Depends, Query, Path -from app.api.deps import CurrentUserOrgProject, SessionDep +from app.api.deps import AuthContextDep, SessionDep +from app.api.permissions import Permission, require_permission from app.crud import DocTransformationJobCrud, DocumentCrud from app.models import ( DocTransformationJobPublic, @@ -15,28 +16,29 @@ logger = logging.getLogger(__name__) -router = APIRouter(prefix="/documents/transformation", tags=["documents"]) +router = APIRouter(prefix="/documents/transformation", tags=["Documents"]) @router.get( "/{job_id}", description=load_description("documents/job_info.md"), response_model=APIResponse[DocTransformationJobPublic], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def get_transformation_job( session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, job_id: UUID = Path(..., description="Transformation job ID"), include_url: bool = Query( False, description="Include a signed URL for the transformed document" ), ): - job_crud = DocTransformationJobCrud(session, current_user.project_id) - doc_crud = DocumentCrud(session, current_user.project_id) + job_crud = DocTransformationJobCrud(session, current_user.project_.id) + doc_crud = DocumentCrud(session, current_user.project_.id) job = job_crud.read_one(job_id) storage = ( - get_cloud_storage(session=session, project_id=current_user.project_id) + get_cloud_storage(session=session, project_id=current_user.project_.id) if include_url else None ) @@ -54,10 +56,11 @@ def get_transformation_job( "/", description=load_description("documents/job_list.md"), response_model=APIResponse[DocTransformationJobsPublic], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def get_multiple_transformation_jobs( session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, job_ids: list[UUID] = Query( ..., description="List of transformation job IDs", @@ -68,15 +71,15 @@ def get_multiple_transformation_jobs( False, description="Include a signed URL for each transformed document" ), ): - job_crud = DocTransformationJobCrud(session, project_id=current_user.project_id) - doc_crud = DocumentCrud(session, 
project_id=current_user.project_id) + job_crud = DocTransformationJobCrud(session, project_id=current_user.project_.id) + doc_crud = DocumentCrud(session, project_id=current_user.project_.id) jobs = job_crud.read_each(set(job_ids)) jobs_found_ids = {job.id for job in jobs} jobs_not_found = set(job_ids) - jobs_found_ids storage = ( - get_cloud_storage(session=session, project_id=current_user.project_id) + get_cloud_storage(session=session, project_id=current_user.project_.id) if include_url else None ) diff --git a/backend/app/api/routes/documents.py b/backend/app/api/routes/documents.py index 27671ef3c..16fdcfc64 100644 --- a/backend/app/api/routes/documents.py +++ b/backend/app/api/routes/documents.py @@ -5,6 +5,7 @@ from fastapi import ( APIRouter, + Depends, File, Form, Query, @@ -13,7 +14,8 @@ from pydantic import HttpUrl from fastapi import Path as FastPath -from app.api.deps import CurrentUserOrgProject, SessionDep +from app.api.deps import AuthContextDep, SessionDep +from app.api.permissions import Permission, require_permission from app.core.cloud import get_cloud_storage from app.crud import CollectionCrud, DocumentCrud from app.crud.rag import OpenAIAssistantCrud, OpenAIVectorStoreCrud @@ -42,7 +44,7 @@ logger = logging.getLogger(__name__) -router = APIRouter(prefix="/documents", tags=["documents"]) +router = APIRouter(prefix="/documents", tags=["Documents"]) doctransformation_callback_router = APIRouter() @@ -68,21 +70,22 @@ def doctransformation_callback_notification( "/", description=load_description("documents/list.md"), response_model=APIResponse[list[Union[DocumentPublic, TransformedDocumentPublic]]], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def list_docs( session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, skip: int = Query(0, ge=0), limit: int = Query(100, gt=0, le=100), include_url: bool = Query( False, description="Include a signed URL to access each document" ), ): - crud = DocumentCrud(session, current_user.project_id) + crud = DocumentCrud(session, current_user.project_.id) documents = crud.read_many(skip, limit) storage = ( - get_cloud_storage(session=session, project_id=current_user.project_id) + get_cloud_storage(session=session, project_id=current_user.project_.id) if include_url and documents else None ) @@ -100,10 +103,11 @@ def list_docs( description=load_description("documents/upload.md"), response_model=APIResponse[DocumentUploadResponse], callbacks=doctransformation_callback_router.routes, + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) async def upload_doc( session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, src: UploadFile = File(...), target_format: str | None = Form( @@ -126,11 +130,11 @@ async def upload_doc( transformer=transformer, ) - storage = get_cloud_storage(session=session, project_id=current_user.project_id) + storage = get_cloud_storage(session=session, project_id=current_user.project_.id) document_id = uuid4() object_store_url = storage.put(src, Path(str(document_id))) - crud = DocumentCrud(session, current_user.project_id) + crud = DocumentCrud(session, current_user.project_.id) document = Document( id=document_id, fname=src.filename, @@ -140,8 +144,7 @@ async def upload_doc( job_info: TransformationJobInfo | None = schedule_transformation( session=session, - project_id=current_user.project_id, - current_user=current_user, + project_id=current_user.project_.id, source_format=source_format, 
target_format=target_format, actual_transformer=actual_transformer, @@ -167,20 +170,21 @@ async def upload_doc( "/{doc_id}", description=load_description("documents/delete.md"), response_model=APIResponse[Message], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def remove_doc( session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, doc_id: UUID = FastPath(description="Document to delete"), ): client = get_openai_client( - session, current_user.organization_id, current_user.project_id + session, current_user.organization_.id, current_user.project_.id ) a_crud = OpenAIAssistantCrud(client) v_crud = OpenAIVectorStoreCrud(client) - d_crud = DocumentCrud(session, current_user.project_id) - c_crud = CollectionCrud(session, current_user.project_id) + d_crud = DocumentCrud(session, current_user.project_.id) + c_crud = CollectionCrud(session, current_user.project_.id) document = d_crud.read_one(doc_id) remote = pick_service_for_documennt( @@ -198,20 +202,21 @@ def remove_doc( "/{doc_id}/permanent", description=load_description("documents/permanent_delete.md"), response_model=APIResponse[Message], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def permanent_delete_doc( session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, doc_id: UUID = FastPath(description="Document to permanently delete"), ): client = get_openai_client( - session, current_user.organization_id, current_user.project_id + session, current_user.organization_.id, current_user.project_.id ) a_crud = OpenAIAssistantCrud(client) v_crud = OpenAIVectorStoreCrud(client) - d_crud = DocumentCrud(session, current_user.project_id) - c_crud = CollectionCrud(session, current_user.project_id) - storage = get_cloud_storage(session=session, project_id=current_user.project_id) + d_crud = DocumentCrud(session, current_user.project_.id) + c_crud = CollectionCrud(session, current_user.project_.id) + storage = get_cloud_storage(session=session, project_id=current_user.project_.id) document = d_crud.read_one(doc_id) @@ -232,20 +237,21 @@ def permanent_delete_doc( "/{doc_id}", description=load_description("documents/info.md"), response_model=APIResponse[Union[DocumentPublic, TransformedDocumentPublic]], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def doc_info( session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, doc_id: UUID = FastPath(description="Document to retrieve"), include_url: bool = Query( False, description="Include a signed URL to access the document" ), ): - crud = DocumentCrud(session, current_user.project_id) + crud = DocumentCrud(session, current_user.project_.id) document = crud.read_one(doc_id) storage = ( - get_cloud_storage(session=session, project_id=current_user.project_id) + get_cloud_storage(session=session, project_id=current_user.project_.id) if include_url else None ) diff --git a/backend/app/api/routes/evaluation.py b/backend/app/api/routes/evaluation.py index 058950d65..6175476db 100644 --- a/backend/app/api/routes/evaluation.py +++ b/backend/app/api/routes/evaluation.py @@ -4,9 +4,19 @@ import re from pathlib import Path -from fastapi import APIRouter, Body, File, Form, HTTPException, Query, UploadFile +from fastapi import ( + APIRouter, + Body, + File, + Form, + HTTPException, + Query, + UploadFile, + Depends, +) from app.api.deps import AuthContextDep, SessionDep +from app.api.permissions import Permission, require_permission 
from app.core.cloud import get_cloud_storage from app.crud.assistants import get_assistant_by_id from app.crud.evaluations import ( @@ -45,7 +55,7 @@ "text/plain", # Some systems report CSV as text/plain } -router = APIRouter(tags=["evaluation"]) +router = APIRouter(tags=["Evaluation"]) def _dataset_to_response(dataset) -> DatasetUploadResponse: @@ -113,6 +123,7 @@ def sanitize_dataset_name(name: str) -> str: "/evaluations/datasets", description=load_description("evaluation/upload_dataset.md"), response_model=APIResponse[DatasetUploadResponse], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) async def upload_dataset( _session: SessionDep, @@ -143,8 +154,8 @@ async def upload_dataset( logger.info( f"[upload_dataset] Uploading dataset | dataset={dataset_name} | " - f"duplication_factor={duplication_factor} | org_id={auth_context.organization.id} | " - f"project_id={auth_context.project.id}" + f"duplication_factor={duplication_factor} | org_id={auth_context.organization_.id} | " + f"project_id={auth_context.project_.id}" ) # Security validation: Check file extension @@ -234,7 +245,7 @@ async def upload_dataset( object_store_url = None try: storage = get_cloud_storage( - session=_session, project_id=auth_context.project.id + session=_session, project_id=auth_context.project_.id ) object_store_url = upload_csv_to_object_store( storage=storage, csv_content=csv_content, dataset_name=dataset_name @@ -260,8 +271,8 @@ async def upload_dataset( # Get Langfuse client langfuse = get_langfuse_client( session=_session, - org_id=auth_context.organization.id, - project_id=auth_context.project.id, + org_id=auth_context.organization_.id, + project_id=auth_context.project_.id, ) # Upload to Langfuse @@ -300,8 +311,8 @@ async def upload_dataset( dataset_metadata=metadata, object_store_url=object_store_url, langfuse_dataset_id=langfuse_dataset_id, - organization_id=auth_context.organization.id, - project_id=auth_context.project.id, + organization_id=auth_context.organization_.id, + project_id=auth_context.project_.id, ) logger.info( @@ -327,6 +338,7 @@ async def upload_dataset( "/evaluations/datasets", description=load_description("evaluation/list_datasets.md"), response_model=APIResponse[list[DatasetUploadResponse]], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def list_datasets_endpoint( _session: SessionDep, @@ -340,8 +352,8 @@ def list_datasets_endpoint( datasets = list_datasets( session=_session, - organization_id=auth_context.organization.id, - project_id=auth_context.project.id, + organization_id=auth_context.organization_.id, + project_id=auth_context.project_.id, limit=limit, offset=offset, ) @@ -355,6 +367,7 @@ def list_datasets_endpoint( "/evaluations/datasets/{dataset_id}", description=load_description("evaluation/get_dataset.md"), response_model=APIResponse[DatasetUploadResponse], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def get_dataset( dataset_id: int, @@ -363,15 +376,15 @@ def get_dataset( ) -> APIResponse[DatasetUploadResponse]: logger.info( f"[get_dataset] Fetching dataset | id={dataset_id} | " - f"org_id={auth_context.organization.id} | " - f"project_id={auth_context.project.id}" + f"org_id={auth_context.organization_.id} | " + f"project_id={auth_context.project_.id}" ) dataset = get_dataset_by_id( session=_session, dataset_id=dataset_id, - organization_id=auth_context.organization.id, - project_id=auth_context.project.id, + organization_id=auth_context.organization_.id, + 
project_id=auth_context.project_.id, ) if not dataset: @@ -386,6 +399,7 @@ def get_dataset( "/evaluations/datasets/{dataset_id}", description=load_description("evaluation/delete_dataset.md"), response_model=APIResponse[dict], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def delete_dataset( dataset_id: int, @@ -394,15 +408,15 @@ def delete_dataset( ) -> APIResponse[dict]: logger.info( f"[delete_dataset] Deleting dataset | id={dataset_id} | " - f"org_id={auth_context.organization.id} | " - f"project_id={auth_context.project.id}" + f"org_id={auth_context.organization_.id} | " + f"project_id={auth_context.project_.id}" ) success, message = delete_dataset_crud( session=_session, dataset_id=dataset_id, - organization_id=auth_context.organization.id, - project_id=auth_context.project.id, + organization_id=auth_context.organization_.id, + project_id=auth_context.project_.id, ) if not success: @@ -422,6 +436,7 @@ def delete_dataset( "/evaluations", description=load_description("evaluation/create_evaluation.md"), response_model=APIResponse[EvaluationRunPublic], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def evaluate( _session: SessionDep, @@ -439,7 +454,7 @@ def evaluate( logger.info( f"[evaluate] Starting evaluation | experiment_name={experiment_name} | " f"dataset_id={dataset_id} | " - f"org_id={auth_context.organization.id} | " + f"org_id={auth_context.organization_.id} | " f"assistant_id={assistant_id} | " f"config_keys={list(config.keys())}" ) @@ -448,8 +463,8 @@ def evaluate( dataset = get_dataset_by_id( session=_session, dataset_id=dataset_id, - organization_id=auth_context.organization.id, - project_id=auth_context.project.id, + organization_id=auth_context.organization_.id, + project_id=auth_context.project_.id, ) if not dataset: @@ -470,13 +485,13 @@ def evaluate( # Get API clients openai_client = get_openai_client( session=_session, - org_id=auth_context.organization.id, - project_id=auth_context.project.id, + org_id=auth_context.organization_.id, + project_id=auth_context.project_.id, ) langfuse = get_langfuse_client( session=_session, - org_id=auth_context.organization.id, - project_id=auth_context.project.id, + org_id=auth_context.organization_.id, + project_id=auth_context.project_.id, ) # Validate dataset has Langfuse ID (should have been set during dataset creation) @@ -493,7 +508,7 @@ def evaluate( assistant = get_assistant_by_id( session=_session, assistant_id=assistant_id, - project_id=auth_context.project.id, + project_id=auth_context.project_.id, ) if not assistant: @@ -544,8 +559,8 @@ def evaluate( dataset_name=dataset_name, dataset_id=dataset_id, config=config, - organization_id=auth_context.organization.id, - project_id=auth_context.project.id, + organization_id=auth_context.organization_.id, + project_id=auth_context.project_.id, ) # Start the batch evaluation @@ -579,6 +594,7 @@ def evaluate( "/evaluations", description=load_description("evaluation/list_evaluations.md"), response_model=APIResponse[list[EvaluationRunPublic]], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def list_evaluation_runs( _session: SessionDep, @@ -588,15 +604,15 @@ def list_evaluation_runs( ) -> APIResponse[list[EvaluationRunPublic]]: logger.info( f"[list_evaluation_runs] Listing evaluation runs | " - f"org_id={auth_context.organization.id} | " - f"project_id={auth_context.project.id} | limit={limit} | offset={offset}" + f"org_id={auth_context.organization_.id} | " + f"project_id={auth_context.project_.id} 
| limit={limit} | offset={offset}" ) return APIResponse.success_response( data=list_evaluation_runs_crud( session=_session, - organization_id=auth_context.organization.id, - project_id=auth_context.project.id, + organization_id=auth_context.organization_.id, + project_id=auth_context.project_.id, limit=limit, offset=offset, ) @@ -607,6 +623,7 @@ def list_evaluation_runs( "/evaluations/{evaluation_id}", description=load_description("evaluation/get_evaluation.md"), response_model=APIResponse[EvaluationRunPublic], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def get_evaluation_run_status( evaluation_id: int, @@ -632,8 +649,8 @@ def get_evaluation_run_status( logger.info( f"[get_evaluation_run_status] Fetching status for evaluation run | " f"evaluation_id={evaluation_id} | " - f"org_id={auth_context.organization.id} | " - f"project_id={auth_context.project.id} | " + f"org_id={auth_context.organization_.id} | " + f"project_id={auth_context.project_.id} | " f"get_trace_info={get_trace_info} | " f"resync_score={resync_score}" ) @@ -647,8 +664,8 @@ def get_evaluation_run_status( eval_run = get_evaluation_run_by_id( session=_session, evaluation_id=evaluation_id, - organization_id=auth_context.organization.id, - project_id=auth_context.project.id, + organization_id=auth_context.organization_.id, + project_id=auth_context.project_.id, ) if not eval_run: @@ -677,16 +694,16 @@ def get_evaluation_run_status( # Get Langfuse client (needs session for credentials lookup) langfuse = get_langfuse_client( session=_session, - org_id=auth_context.organization.id, - project_id=auth_context.project.id, + org_id=auth_context.organization_.id, + project_id=auth_context.project_.id, ) # Capture data needed for Langfuse fetch and DB update dataset_name = eval_run.dataset_name run_name = eval_run.run_name eval_run_id = eval_run.id - org_id = auth_context.organization.id - project_id = auth_context.project.id + org_id = auth_context.organization_.id + project_id = auth_context.project_.id # Session is no longer needed - slow Langfuse API calls happen here # without holding the DB connection diff --git a/backend/app/api/routes/fine_tuning.py b/backend/app/api/routes/fine_tuning.py index 33e14cbe7..69f8951df 100644 --- a/backend/app/api/routes/fine_tuning.py +++ b/backend/app/api/routes/fine_tuning.py @@ -6,7 +6,15 @@ import openai from sqlmodel import Session -from fastapi import APIRouter, HTTPException, BackgroundTasks, File, Form, UploadFile +from fastapi import ( + APIRouter, + HTTPException, + BackgroundTasks, + File, + Form, + UploadFile, + Depends, +) from app.models import ( FineTuningJobCreate, @@ -35,13 +43,14 @@ fetch_active_model_evals, ) from app.core.db import engine -from app.api.deps import CurrentUserOrgProject, SessionDep +from app.api.deps import AuthContextDep, SessionDep +from app.api.permissions import Permission, require_permission from app.core.finetune.preprocessing import DataPreprocessor from app.api.routes.model_evaluation import run_model_evaluation logger = logging.getLogger(__name__) -router = APIRouter(prefix="/fine_tuning", tags=["fine_tuning"]) +router = APIRouter(prefix="/fine_tuning", tags=["Fine Tuning"]) OPENAI_TO_INTERNAL_STATUS = { @@ -57,11 +66,11 @@ def process_fine_tuning_job( job_id: int, ratio: float, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, request: FineTuningJobCreate, ): start_time = time.time() - project_id = current_user.project_id + project_id = current_user.project_.id fine_tune = None logger.info( @@ -72,12 
+81,12 @@ def process_fine_tuning_job( fine_tune = fetch_by_id(session, job_id, project_id) client = get_openai_client( - session, current_user.organization_id, project_id + session, current_user.organization_.id, project_id ) storage = get_cloud_storage( - session=session, project_id=current_user.project_id + session=session, project_id=current_user.project_.id ) - document_crud = DocumentCrud(session, current_user.project_id) + document_crud = DocumentCrud(session, current_user.project_.id) document = document_crud.read_one(request.document_id) preprocessor = DataPreprocessor( document, storage, ratio, request.system_prompt @@ -184,10 +193,11 @@ def process_fine_tuning_job( "/fine_tune", description=load_description("fine_tuning/create.md"), response_model=APIResponse, + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) async def fine_tune_from_CSV( session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, background_tasks: BackgroundTasks, file: UploadFile = File(..., description="CSV file to use for fine-tuning"), base_model: str = Form(...), @@ -207,18 +217,18 @@ async def fine_tune_from_CSV( get_openai_client( # Used here only to validate the user's OpenAI key; # the actual client is re-initialized separately inside the background task session, - current_user.organization_id, - current_user.project_id, + current_user.organization_.id, + current_user.project_.id, ) # Upload the file to storage and create document # ToDo: create a helper function and then use it rather than doing things in router - storage = get_cloud_storage(session=session, project_id=current_user.project_id) + storage = get_cloud_storage(session=session, project_id=current_user.project_.id) document_id = uuid4() object_store_url = storage.put(file, Path(str(document_id))) # Create document in database - document_crud = DocumentCrud(session, current_user.project_id) + document_crud = DocumentCrud(session, current_user.project_.id) document = Document( id=document_id, fname=file.filename, @@ -241,8 +251,8 @@ async def fine_tune_from_CSV( session=session, request=request, split_ratio=ratio, - organization_id=current_user.organization_id, - project_id=current_user.project_id, + organization_id=current_user.organization_.id, + project_id=current_user.project_.id, ) results.append((job, created)) @@ -253,7 +263,7 @@ async def fine_tune_from_CSV( if not results: logger.error( - f"[fine_tune_from_CSV]All fine-tuning job creations failed for document_id={request.document_id}, project_id={current_user.project_id}" + f"[fine_tune_from_CSV]All fine-tuning job creations failed for document_id={request.document_id}, project_id={current_user.project_.id}" ) raise HTTPException( status_code=500, detail="Failed to create or fetch any fine-tuning jobs." 
@@ -286,17 +296,18 @@ async def fine_tune_from_CSV( "/{fine_tuning_id}/refresh", description=load_description("fine_tuning/retrieve.md"), response_model=APIResponse[FineTuningJobPublic], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def refresh_fine_tune_status( fine_tuning_id: int, background_tasks: BackgroundTasks, session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, ): - project_id = current_user.project_id + project_id = current_user.project_.id job = fetch_by_id(session, fine_tuning_id, project_id) - client = get_openai_client(session, current_user.organization_id, project_id) - storage = get_cloud_storage(session=session, project_id=current_user.project_id) + client = get_openai_client(session, current_user.organization_.id, project_id) + storage = get_cloud_storage(session=session, project_id=current_user.project_.id) if job.provider_job_id is not None: try: @@ -358,7 +369,7 @@ def refresh_fine_tune_status( session=session, request=ModelEvaluationBase(fine_tuning_id=fine_tuning_id), project_id=project_id, - organization_id=current_user.organization_id, + organization_id=current_user.organization_.id, status=ModelEvaluationStatus.pending, ) @@ -395,12 +406,13 @@ def refresh_fine_tune_status( "/{document_id}", description="Retrieves all fine-tuning jobs associated with the given document ID for the current project", response_model=APIResponse[list[FineTuningJobPublic]], + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def retrieve_jobs_by_document( - document_id: UUID, session: SessionDep, current_user: CurrentUserOrgProject + document_id: UUID, session: SessionDep, current_user: AuthContextDep ): - storage = get_cloud_storage(session=session, project_id=current_user.project_id) - project_id = current_user.project_id + storage = get_cloud_storage(session=session, project_id=current_user.project_.id) + project_id = current_user.project_.id jobs = fetch_by_document_id(session, document_id, project_id) if not jobs: logger.warning( diff --git a/backend/app/api/routes/llm.py b/backend/app/api/routes/llm.py index e244b2258..93b542216 100644 --- a/backend/app/api/routes/llm.py +++ b/backend/app/api/routes/llm.py @@ -1,8 +1,9 @@ import logging -from fastapi import APIRouter +from fastapi import APIRouter, Depends from app.api.deps import AuthContextDep, SessionDep +from app.api.permissions import Permission, require_permission from app.models import LLMCallRequest, LLMCallResponse, Message from app.services.llm.jobs import start_job from app.utils import APIResponse, validate_callback_url, load_description @@ -35,6 +36,7 @@ def llm_callback_notification(body: APIResponse[LLMCallResponse]): description=load_description("llm/llm_call.md"), response_model=APIResponse[Message], callbacks=llm_callback_router.routes, + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def llm_call( _current_user: AuthContextDep, _session: SessionDep, request: LLMCallRequest @@ -42,8 +44,8 @@ def llm_call( """ Endpoint to initiate an LLM call as a background job. 
""" - project_id = _current_user.project.id - organization_id = _current_user.organization.id + project_id = _current_user.project_.id + organization_id = _current_user.organization_.id if request.callback_url: validate_callback_url(str(request.callback_url)) diff --git a/backend/app/api/routes/login.py b/backend/app/api/routes/login.py index 357285c07..704a5e8d7 100644 --- a/backend/app/api/routes/login.py +++ b/backend/app/api/routes/login.py @@ -5,7 +5,8 @@ from fastapi.responses import HTMLResponse from fastapi.security import OAuth2PasswordRequestForm -from app.api.deps import CurrentUser, SessionDep, get_current_active_superuser +from app.api.deps import AuthContextDep, SessionDep +from app.api.permissions import Permission, require_permission from app.core import security from app.core.config import settings from app.core.security import get_password_hash @@ -18,7 +19,7 @@ verify_password_reset_token, ) -router = APIRouter(tags=["login"]) +router = APIRouter(tags=["Login"]) @router.post("/login/access-token") @@ -51,11 +52,11 @@ def login_access_token( @router.post("/login/test-token", response_model=UserPublic, include_in_schema=False) -def test_token(current_user: CurrentUser) -> Any: +def test_token(current_user: AuthContextDep) -> Any: """ Test access token """ - return current_user + return current_user.user @router.post("/password-recovery/{email}", include_in_schema=False) @@ -107,7 +108,7 @@ def reset_password(session: SessionDep, body: NewPassword) -> Message: @router.post( "/password-recovery-html-content/{email}", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], response_class=HTMLResponse, include_in_schema=False, ) diff --git a/backend/app/api/routes/model_evaluation.py b/backend/app/api/routes/model_evaluation.py index b38172ddf..c22c78508 100644 --- a/backend/app/api/routes/model_evaluation.py +++ b/backend/app/api/routes/model_evaluation.py @@ -2,9 +2,8 @@ import time from uuid import UUID -from fastapi import APIRouter, HTTPException, BackgroundTasks +from fastapi import APIRouter, HTTPException, BackgroundTasks, Depends from sqlmodel import Session -from openai import OpenAI from app.crud import ( fetch_by_id, @@ -24,13 +23,14 @@ from app.core.db import engine from app.core.cloud import get_cloud_storage from app.core.finetune.evaluation import ModelEvaluator -from app.utils import get_openai_client, APIResponse -from app.api.deps import CurrentUserOrgProject, SessionDep +from app.utils import get_openai_client, APIResponse, load_description +from app.api.deps import AuthContextDep, SessionDep +from app.api.permissions import Permission, require_permission logger = logging.getLogger(__name__) -router = APIRouter(prefix="/model_evaluation", tags=["model_evaluation"]) +router = APIRouter(prefix="/model_evaluation", tags=["Model Evaluation"]) def attach_prediction_file_url(model_obj, storage): @@ -48,24 +48,24 @@ def attach_prediction_file_url(model_obj, storage): def run_model_evaluation( eval_id: int, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, ): start_time = time.time() logger.info( - f"[run_model_evaluation] Starting | eval_id={eval_id}, project_id={current_user.project_id}" + f"[run_model_evaluation] Starting | eval_id={eval_id}, project_id={current_user.project_.id}" ) with Session(engine) as db: client = get_openai_client( - db, current_user.organization_id, current_user.project_id + db, current_user.organization_.id, current_user.project_.id ) - storage = 
get_cloud_storage(session=db, project_id=current_user.project_id) + storage = get_cloud_storage(session=db, project_id=current_user.project_.id) try: model_eval = update_model_eval( session=db, eval_id=eval_id, - project_id=current_user.project_id, + project_id=current_user.project_.id, update=ModelEvaluationUpdate(status=ModelEvaluationStatus.running), ) @@ -81,7 +81,7 @@ def run_model_evaluation( update_model_eval( session=db, eval_id=eval_id, - project_id=current_user.project_id, + project_id=current_user.project_.id, update=ModelEvaluationUpdate( score=result["evaluation_score"], prediction_data_s3_object=result["prediction_data_s3_object"], @@ -91,19 +91,19 @@ def run_model_evaluation( elapsed = time.time() - start_time logger.info( - f"[run_model_evaluation] Completed | eval_id={eval_id}, project_id={current_user.project_id}, elapsed={elapsed:.2f}s" + f"[run_model_evaluation] Completed | eval_id={eval_id}, project_id={current_user.project_.id}, elapsed={elapsed:.2f}s" ) except Exception as e: error_msg = str(e) logger.error( - f"[run_model_evaluation] Failed | eval_id={eval_id}, project_id={current_user.project_id}: {e}" + f"[run_model_evaluation] Failed | eval_id={eval_id}, project_id={current_user.project_.id}: {e}" ) db.rollback() update_model_eval( session=db, eval_id=eval_id, - project_id=current_user.project_id, + project_id=current_user.project_.id, update=ModelEvaluationUpdate( status=ModelEvaluationStatus.failed, error_message="failed during background job processing:" @@ -112,12 +112,17 @@ def run_model_evaluation( ) -@router.post("/evaluate_models/", response_model=APIResponse) +@router.post( + "/evaluate_models/", + response_model=APIResponse, + description=load_description("model_evaluation/evaluate.md"), + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], +) def evaluate_models( request: ModelEvaluationCreate, background_tasks: BackgroundTasks, session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, ): """ Start evaluations for one or more fine-tuning jobs. @@ -133,27 +138,27 @@ def evaluate_models( APIResponse with the created/active evaluation records and a success message. """ client = get_openai_client( - session, current_user.organization_id, current_user.project_id + session, current_user.organization_.id, current_user.project_.id ) # keeping this here for checking if the user's validated OpenAI key is present or not, # even though the client will be initialized separately inside the background task if not request.fine_tuning_ids: logger.error( - f"[evaluate_model] No fine tuning IDs provided | project_id:{current_user.project_id}" + f"[evaluate_model] No fine tuning IDs provided | project_id:{current_user.project_.id}" ) raise HTTPException(status_code=400, detail="No fine-tuned job IDs provided") evaluations: list[ModelEvaluationPublic] = [] for job_id in request.fine_tuning_ids: - fine_tuning_job = fetch_by_id(session, job_id, current_user.project_id) + fine_tuning_job = fetch_by_id(session, job_id, current_user.project_.id) active_evaluations = fetch_active_model_evals( - session, job_id, current_user.project_id + session, job_id, current_user.project_.id ) if active_evaluations: logger.info( - f"[evaluate_model] Skipping creation for {job_id}. Active evaluation exists, project_id:{current_user.project_id}" + f"[evaluate_model] Skipping creation for {job_id}. 
Active evaluation exists, project_id:{current_user.project_.id}" ) evaluations.extend( ModelEvaluationPublic.model_validate(ev) for ev in active_evaluations @@ -163,15 +168,15 @@ def evaluate_models( model_eval = create_model_evaluation( session=session, request=ModelEvaluationBase(fine_tuning_id=fine_tuning_job.id), - project_id=current_user.project_id, - organization_id=current_user.organization_id, + project_id=current_user.project_.id, + organization_id=current_user.organization_.id, status=ModelEvaluationStatus.pending, ) evaluations.append(ModelEvaluationPublic.model_validate(model_eval)) logger.info( - f"[evaluate_model] Created evaluation for fine_tuning_id {job_id} with eval ID={model_eval.id}, project_id:{current_user.project_id}" + f"[evaluate_model] Created evaluation for fine_tuning_id {job_id} with eval ID={model_eval.id}, project_id:{current_user.project_.id}" ) background_tasks.add_task(run_model_evaluation, model_eval.id, current_user) @@ -196,11 +201,13 @@ def evaluate_models( "/{document_id}/top_model", response_model=APIResponse[ModelEvaluationPublic], response_model_exclude_none=True, + description=load_description("model_evaluation/get_top_model.md"), + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def get_top_model_by_doc_id( document_id: UUID, session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, ): """ Return the top model trained on the given document_id, ranked by @@ -208,11 +215,13 @@ def get_top_model_by_doc_id( """ logger.info( f"[get_top_model_by_doc_id] Fetching top model for document_id={document_id}, " - f"project_id={current_user.project_id}" + f"project_id={current_user.project_.id}" ) - top_model = fetch_top_model_by_doc_id(session, document_id, current_user.project_id) - storage = get_cloud_storage(session=session, project_id=current_user.project_id) + top_model = fetch_top_model_by_doc_id( + session, document_id, current_user.project_.id + ) + storage = get_cloud_storage(session=session, project_id=current_user.project_.id) top_model = attach_prediction_file_url(top_model, storage) @@ -223,22 +232,24 @@ def get_top_model_by_doc_id( "/{document_id}", response_model=APIResponse[list[ModelEvaluationPublic]], response_model_exclude_none=True, + description=load_description("model_evaluation/list_by_document.md"), + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def get_evaluations_by_doc_id( document_id: UUID, session: SessionDep, - current_user: CurrentUserOrgProject, + current_user: AuthContextDep, ): """ Return all model evaluations for the given document_id within the current project. 
""" logger.info( f"[get_evaluations_by_doc_id] Fetching evaluations for document_id={document_id}, " - f"project_id={current_user.project_id}" + f"project_id={current_user.project_.id}" ) - evaluations = fetch_eval_by_doc_id(session, document_id, current_user.project_id) - storage = get_cloud_storage(session=session, project_id=current_user.project_id) + evaluations = fetch_eval_by_doc_id(session, document_id, current_user.project_.id) + storage = get_cloud_storage(session=session, project_id=current_user.project_.id) updated_evaluations = [ attach_prediction_file_url(ev, storage) for ev in evaluations diff --git a/backend/app/api/routes/onboarding.py b/backend/app/api/routes/onboarding.py index f081c4010..c5f7f1231 100644 --- a/backend/app/api/routes/onboarding.py +++ b/backend/app/api/routes/onboarding.py @@ -1,14 +1,12 @@ from fastapi import APIRouter, Depends -from app.api.deps import ( - SessionDep, - get_current_active_superuser, -) +from app.api.deps import SessionDep +from app.api.permissions import Permission, require_permission from app.crud import onboard_project from app.models import OnboardingRequest, OnboardingResponse, User from app.utils import APIResponse, load_description -router = APIRouter(tags=["onboarding"]) +router = APIRouter(tags=["Onboarding"]) @router.post( @@ -16,11 +14,11 @@ response_model=APIResponse[OnboardingResponse], status_code=201, description=load_description("onboarding/onboarding.md"), + dependencies=[Depends(require_permission(Permission.SUPERUSER))], ) def onboard_project_route( onboard_in: OnboardingRequest, session: SessionDep, - current_user: User = Depends(get_current_active_superuser), ): response = onboard_project(session=session, onboard_in=onboard_in) diff --git a/backend/app/api/routes/openai_conversation.py b/backend/app/api/routes/openai_conversation.py index 71f0c7304..de0be838f 100644 --- a/backend/app/api/routes/openai_conversation.py +++ b/backend/app/api/routes/openai_conversation.py @@ -1,9 +1,9 @@ from typing import Annotated from fastapi import APIRouter, Depends, Path, HTTPException, Query -from sqlmodel import Session -from app.api.deps import get_db, get_current_user_org_project +from app.api.deps import AuthContextDep, SessionDep +from app.api.permissions import Permission, require_permission from app.crud import ( get_conversation_by_id, get_conversation_by_response_id, @@ -13,10 +13,9 @@ delete_conversation, ) from app.models import ( - UserProjectOrg, OpenAIConversationPublic, ) -from app.utils import APIResponse +from app.utils import APIResponse, load_description router = APIRouter(prefix="/openai-conversation", tags=["OpenAI Conversations"]) @@ -25,17 +24,19 @@ "/{conversation_id}", response_model=APIResponse[OpenAIConversationPublic], summary="Get a single conversation by its ID", + description=load_description("openai_conversation/get.md"), + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def get_conversation_route( - conversation_id: int = Path(..., description="The conversation ID to fetch"), - session: Session = Depends(get_db), - current_user: UserProjectOrg = Depends(get_current_user_org_project), + conversation_id: Annotated[int, Path(description="The conversation ID to fetch")], + session: SessionDep, + current_user: AuthContextDep, ): """ Fetch a single conversation by its ID. 
""" conversation = get_conversation_by_id( - session, conversation_id, current_user.project_id + session, conversation_id, current_user.project_.id ) if not conversation: raise HTTPException( @@ -48,17 +49,19 @@ def get_conversation_route( "/response/{response_id}", response_model=APIResponse[OpenAIConversationPublic], summary="Get a conversation by its OpenAI response ID", + description=load_description("openai_conversation/get_by_response_id.md"), + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def get_conversation_by_response_id_route( - response_id: str = Path(..., description="The OpenAI response ID to fetch"), - session: Session = Depends(get_db), - current_user: UserProjectOrg = Depends(get_current_user_org_project), + response_id: Annotated[str, Path(description="The OpenAI response ID to fetch")], + session: SessionDep, + current_user: AuthContextDep, ): """ Fetch a conversation by its OpenAI response ID. """ conversation = get_conversation_by_response_id( - session, response_id, current_user.project_id + session, response_id, current_user.project_.id ) if not conversation: raise HTTPException( @@ -72,19 +75,21 @@ def get_conversation_by_response_id_route( "/ancestor/{ancestor_response_id}", response_model=APIResponse[OpenAIConversationPublic], summary="Get a conversation by its ancestor response ID", + description=load_description("openai_conversation/get_by_ancestor_id.md"), + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def get_conversation_by_ancestor_id_route( - ancestor_response_id: str = Path( - ..., description="The ancestor response ID to fetch" - ), - session: Session = Depends(get_db), - current_user: UserProjectOrg = Depends(get_current_user_org_project), + ancestor_response_id: Annotated[ + str, Path(description="The ancestor response ID to fetch") + ], + session: SessionDep, + current_user: AuthContextDep, ): """ Fetch a conversation by its ancestor response ID. 
""" conversation = get_conversation_by_ancestor_id( - session, ancestor_response_id, current_user.project_id + session, ancestor_response_id, current_user.project_.id ) if not conversation: raise HTTPException( @@ -98,10 +103,12 @@ def get_conversation_by_ancestor_id_route( "/", response_model=APIResponse[list[OpenAIConversationPublic]], summary="List all conversations in the current project", + description=load_description("openai_conversation/list.md"), + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], ) def list_conversations_route( - session: Session = Depends(get_db), - current_user: UserProjectOrg = Depends(get_current_user_org_project), + session: SessionDep, + current_user: AuthContextDep, skip: int = Query(0, ge=0, description="How many items to skip"), limit: int = Query(100, ge=1, le=100, description="Maximum items to return"), ): @@ -110,7 +117,7 @@ def list_conversations_route( """ conversations = get_conversations_by_project( session=session, - project_id=current_user.project_id, + project_id=current_user.project_.id, skip=skip, # ← Pagination offset limit=limit, # ← Page size ) @@ -118,7 +125,7 @@ def list_conversations_route( # Get total count for pagination metadata total = get_conversations_count_by_project( session=session, - project_id=current_user.project_id, + project_id=current_user.project_.id, ) return APIResponse.success_response( @@ -126,11 +133,16 @@ def list_conversations_route( ) -@router.delete("/{conversation_id}", response_model=APIResponse) +@router.delete( + "/{conversation_id}", + response_model=APIResponse, + description=load_description("openai_conversation/delete.md"), + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], +) def delete_conversation_route( conversation_id: Annotated[int, Path(description="Conversation ID to delete")], - session: Session = Depends(get_db), - current_user: UserProjectOrg = Depends(get_current_user_org_project), + session: SessionDep, + current_user: AuthContextDep, ): """ Soft delete a conversation by marking it as deleted. 
@@ -138,7 +150,7 @@ def delete_conversation_route( deleted_conversation = delete_conversation( session=session, conversation_id=conversation_id, - project_id=current_user.project_id, + project_id=current_user.project_.id, ) if not deleted_conversation: diff --git a/backend/app/api/routes/organization.py b/backend/app/api/routes/organization.py index 29f6f62aa..eb853921b 100644 --- a/backend/app/api/routes/organization.py +++ b/backend/app/api/routes/organization.py @@ -1,9 +1,9 @@ import logging -from typing import Any, List +from typing import List from fastapi import APIRouter, Depends, HTTPException from sqlalchemy import func -from sqlmodel import Session, select +from sqlmodel import select from app.models import ( Organization, @@ -11,23 +11,21 @@ OrganizationUpdate, OrganizationPublic, ) -from app.api.deps import ( - CurrentUser, - SessionDep, - get_current_active_superuser, -) +from app.api.deps import SessionDep +from app.api.permissions import Permission, require_permission from app.crud.organization import create_organization, get_organization_by_id -from app.utils import APIResponse +from app.utils import APIResponse, load_description logger = logging.getLogger(__name__) -router = APIRouter(prefix="/organizations", tags=["organizations"]) +router = APIRouter(prefix="/organizations", tags=["Organizations"]) # Retrieve organizations @router.get( "/", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], response_model=APIResponse[List[OrganizationPublic]], + description=load_description("organization/list.md"), ) def read_organizations(session: SessionDep, skip: int = 0, limit: int = 100): count_statement = select(func.count()).select_from(Organization) @@ -42,8 +40,9 @@ def read_organizations(session: SessionDep, skip: int = 0, limit: int = 100): # Create a new organization @router.post( "/", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], response_model=APIResponse[OrganizationPublic], + description=load_description("organization/create.md"), ) def create_new_organization(*, session: SessionDep, org_in: OrganizationCreate): new_org = create_organization(session=session, org_create=org_in) @@ -52,8 +51,9 @@ def create_new_organization(*, session: SessionDep, org_in: OrganizationCreate): @router.get( "/{org_id}", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], response_model=APIResponse[OrganizationPublic], + description=load_description("organization/get.md"), ) def read_organization(*, session: SessionDep, org_id: int): """ @@ -69,8 +69,9 @@ def read_organization(*, session: SessionDep, org_id: int): # Update an organization @router.patch( "/{org_id}", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], response_model=APIResponse[OrganizationPublic], + description=load_description("organization/update.md"), ) def update_organization( *, session: SessionDep, org_id: int, org_in: OrganizationUpdate @@ -97,9 +98,10 @@ def update_organization( # Delete an organization @router.delete( "/{org_id}", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], response_model=APIResponse[None], include_in_schema=False, + description=load_description("organization/delete.md"), ) def delete_organization(session: SessionDep, org_id: int): 
org = get_organization_by_id(session=session, org_id=org_id) diff --git a/backend/app/api/routes/project.py b/backend/app/api/routes/project.py index 7be763a75..8a114930d 100644 --- a/backend/app/api/routes/project.py +++ b/backend/app/api/routes/project.py @@ -1,30 +1,29 @@ import logging -from typing import Any, List +from typing import List from fastapi import APIRouter, Depends, HTTPException, Query from sqlalchemy import func -from sqlmodel import Session, select +from sqlmodel import select from app.models import Project, ProjectCreate, ProjectUpdate, ProjectPublic -from app.api.deps import ( - SessionDep, - get_current_active_superuser, -) +from app.api.deps import SessionDep +from app.api.permissions import Permission, require_permission from app.crud.project import ( create_project, get_project_by_id, ) -from app.utils import APIResponse +from app.utils import APIResponse, load_description logger = logging.getLogger(__name__) -router = APIRouter(prefix="/projects", tags=["projects"]) +router = APIRouter(prefix="/projects", tags=["Projects"]) # Retrieve projects @router.get( "/", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], response_model=APIResponse[List[ProjectPublic]], + description=load_description("projects/list.md"), ) def read_projects( session: SessionDep, @@ -43,8 +42,9 @@ def read_projects( # Create a new project @router.post( "/", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], response_model=APIResponse[ProjectPublic], + description=load_description("projects/create.md"), ) def create_new_project(*, session: SessionDep, project_in: ProjectCreate): project = create_project(session=session, project_create=project_in) @@ -53,8 +53,9 @@ def create_new_project(*, session: SessionDep, project_in: ProjectCreate): @router.get( "/{project_id}", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], response_model=APIResponse[ProjectPublic], + description=load_description("projects/get.md"), ) def read_project(*, session: SessionDep, project_id: int): """ @@ -70,8 +71,9 @@ def read_project(*, session: SessionDep, project_id: int): # Update a project @router.patch( "/{project_id}", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], response_model=APIResponse[ProjectPublic], + description=load_description("projects/update.md"), ) def update_project(*, session: SessionDep, project_id: int, project_in: ProjectUpdate): project = get_project_by_id(session=session, project_id=project_id) @@ -94,8 +96,9 @@ def update_project(*, session: SessionDep, project_id: int, project_in: ProjectU # Delete a project @router.delete( "/{project_id}", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], include_in_schema=False, + description=load_description("projects/delete.md"), ) def delete_project(session: SessionDep, project_id: int): project = get_project_by_id(session=session, project_id=project_id) diff --git a/backend/app/api/routes/responses.py b/backend/app/api/routes/responses.py index c84659ee0..6426270bc 100644 --- a/backend/app/api/routes/responses.py +++ b/backend/app/api/routes/responses.py @@ -3,9 +3,9 @@ import openai from fastapi import APIRouter, Depends, HTTPException from fastapi.responses import JSONResponse -from 
sqlmodel import Session -from app.api.deps import get_db, get_current_user_org_project +from app.api.deps import AuthContextDep, SessionDep +from app.api.permissions import Permission, require_permission from app.core.langfuse.langfuse import LangfuseTracer from app.crud.credentials import get_provider_credential from app.models import ( @@ -14,28 +14,37 @@ ResponsesAPIRequest, ResponseJobStatus, ResponsesSyncAPIRequest, - UserProjectOrg, ) from app.services.response.jobs import start_job from app.services.response.response import get_file_search_results from app.services.response.callbacks import get_additional_data -from app.utils import APIResponse, get_openai_client, handle_openai_error, mask_string +from app.utils import ( + APIResponse, + get_openai_client, + handle_openai_error, + load_description, +) logger = logging.getLogger(__name__) -router = APIRouter(tags=["responses"]) +router = APIRouter(tags=["Responses"]) -@router.post("/responses", response_model=APIResponse[ResponseJobStatus]) +@router.post( + "/responses", + response_model=APIResponse[ResponseJobStatus], + description=load_description("responses/create_async.md"), + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], +) async def responses( request: ResponsesAPIRequest, - _session: Session = Depends(get_db), - _current_user: UserProjectOrg = Depends(get_current_user_org_project), + _session: SessionDep, + _current_user: AuthContextDep, ): """Asynchronous endpoint that processes requests using Celery.""" project_id, organization_id = ( - _current_user.project_id, - _current_user.organization_id, + _current_user.project_.id, + _current_user.organization_.id, ) start_job( @@ -56,16 +65,21 @@ async def responses( return APIResponse.success_response(data=response) -@router.post("/responses/sync", response_model=APIResponse[CallbackResponse]) +@router.post( + "/responses/sync", + response_model=APIResponse[CallbackResponse], + description=load_description("responses/create_sync.md"), + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], +) async def responses_sync( request: ResponsesSyncAPIRequest, - _session: Session = Depends(get_db), - _current_user: UserProjectOrg = Depends(get_current_user_org_project), + _session: SessionDep, + _current_user: AuthContextDep, ): """Synchronous endpoint for benchmarking OpenAI responses API with Langfuse tracing.""" project_id, organization_id = ( - _current_user.project_id, - _current_user.organization_id, + _current_user.project_.id, + _current_user.organization_.id, ) try: diff --git a/backend/app/api/routes/threads.py b/backend/app/api/routes/threads.py index 9a38b8637..2ea34be61 100644 --- a/backend/app/api/routes/threads.py +++ b/backend/app/api/routes/threads.py @@ -5,12 +5,12 @@ from fastapi import APIRouter, BackgroundTasks, Depends, HTTPException from openai import OpenAI from pydantic import BaseModel, Field -from sqlmodel import Session from typing import Optional -from app.api.deps import get_current_user_org, get_db, get_current_user_org_project +from app.api.deps import AuthContextDep, SessionDep +from app.api.permissions import Permission, require_permission from app.core import logging, settings -from app.models import UserOrganization, OpenAIThreadCreate, UserProjectOrg +from app.models import OpenAIThreadCreate from app.crud import upsert_thread_result, get_thread_result from app.utils import APIResponse, mask_string from app.crud.credentials import get_provider_credential @@ -18,7 +18,7 @@ from app.core.langfuse.langfuse import 
LangfuseTracer logger = logging.getLogger(__name__) -router = APIRouter(tags=["threads"]) +router = APIRouter(tags=["Threads"]) class StartThreadRequest(BaseModel): @@ -242,7 +242,7 @@ def process_run(request: dict, client: OpenAI, tracer: LangfuseTracer): send_callback(request["callback_url"], response) -def poll_run_and_prepare_response(request: dict, client: OpenAI, db: Session): +def poll_run_and_prepare_response(request: dict, client: OpenAI, db: SessionDep): """Handles a thread run, processes the response, and upserts the result to the database.""" thread_id = request["thread_id"] prompt = request["question"] @@ -286,24 +286,27 @@ def poll_run_and_prepare_response(request: dict, client: OpenAI, db: Session): ) -@router.post("/threads") +@router.post( + "/threads", + dependencies=[Depends(require_permission(Permission.REQUIRE_ORGANIZATION))], +) async def threads( request: dict, background_tasks: BackgroundTasks, - _session: Session = Depends(get_db), - _current_user: UserOrganization = Depends(get_current_user_org), + _session: SessionDep, + _current_user: AuthContextDep, ): """Asynchronous endpoint that processes requests in background.""" credentials = get_provider_credential( session=_session, - org_id=_current_user.organization_id, + org_id=_current_user.organization_.id, provider="openai", project_id=request.get("project_id"), ) client, success = configure_openai(credentials) if not success: logger.error( - f"[threads] OpenAI API key not configured for this organization. | organization_id: {_current_user.organization_id}, project_id: {request.get('project_id')}" + f"[threads] OpenAI API key not configured for this organization. | organization_id: {_current_user.organization_.id}, project_id: {request.get('project_id')}" ) return APIResponse.failure_response( error="OpenAI API key not configured for this organization." @@ -311,7 +314,7 @@ async def threads( langfuse_credentials = get_provider_credential( session=_session, - org_id=_current_user.organization_id, + org_id=_current_user.organization_.id, provider="langfuse", project_id=request.get("project_id"), ) @@ -351,21 +354,24 @@ async def threads( # Schedule background task background_tasks.add_task(process_run, request, client, tracer) logger.info( - f"[threads] Background task scheduled for thread ID: {mask_string(request.get('thread_id'))} | organization_id: {_current_user.organization_id}, project_id: {request.get('project_id')}" + f"[threads] Background task scheduled for thread ID: {mask_string(request.get('thread_id'))} | organization_id: {_current_user.organization_.id}, project_id: {request.get('project_id')}" ) return initial_response -@router.post("/threads/sync") +@router.post( + "/threads/sync", + dependencies=[Depends(require_permission(Permission.REQUIRE_ORGANIZATION))], +) async def threads_sync( request: dict, - _session: Session = Depends(get_db), - _current_user: UserOrganization = Depends(get_current_user_org), + _session: SessionDep, + _current_user: AuthContextDep, ): """Synchronous endpoint that processes requests immediately.""" credentials = get_provider_credential( session=_session, - org_id=_current_user.organization_id, + org_id=_current_user.organization_.id, provider="openai", project_id=request.get("project_id"), ) @@ -374,7 +380,7 @@ async def threads_sync( client, success = configure_openai(credentials) if not success: logger.error( - f"[threads_sync] OpenAI API key not configured for this organization. 
| organization_id: {_current_user.organization_id}, project_id: {request.get('project_id')}" + f"[threads_sync] OpenAI API key not configured for this organization. | organization_id: {_current_user.organization_.id}, project_id: {request.get('project_id')}" ) return APIResponse.failure_response( error="OpenAI API key not configured for this organization." @@ -383,7 +389,7 @@ async def threads_sync( # Get Langfuse credentials langfuse_credentials = get_provider_credential( session=_session, - org_id=_current_user.organization_id, + org_id=_current_user.organization_.id, provider="langfuse", project_id=request.get("project_id"), ) @@ -416,12 +422,15 @@ async def threads_sync( return response -@router.post("/threads/start") +@router.post( + "/threads/start", + dependencies=[Depends(require_permission(Permission.REQUIRE_PROJECT))], +) async def start_thread( request: StartThreadRequest, background_tasks: BackgroundTasks, - db: Session = Depends(get_db), - _current_user: UserProjectOrg = Depends(get_current_user_org_project), + db: SessionDep, + _current_user: AuthContextDep, ): """ Create a new OpenAI thread for the given question and start polling in the background. @@ -430,16 +439,16 @@ async def start_thread( prompt = request["question"] credentials = get_provider_credential( session=db, - org_id=_current_user.organization_id, + org_id=_current_user.organization_.id, provider="openai", - project_id=_current_user.project_id, + project_id=_current_user.project_.id, ) # Configure OpenAI client client, success = configure_openai(credentials) if not success: logger.error( - f"[start_thread] OpenAI API key not configured for this organization. | project_id: {_current_user.project_id}" + f"[start_thread] OpenAI API key not configured for this organization. | project_id: {_current_user.project_.id}" ) return APIResponse.failure_response( error="OpenAI API key not configured for this organization." @@ -465,7 +474,7 @@ async def start_thread( background_tasks.add_task(poll_run_and_prepare_response, request, client, db) logger.info( - f"[start_thread] Background task scheduled to process response for thread ID: {mask_string(thread_id)} | project_id: {_current_user.project_id}" + f"[start_thread] Background task scheduled to process response for thread ID: {mask_string(thread_id)} | project_id: {_current_user.project_.id}" ) return APIResponse.success_response( data={ @@ -477,11 +486,14 @@ async def start_thread( ) -@router.get("/threads/result/{thread_id}") +@router.get( + "/threads/result/{thread_id}", + dependencies=[Depends(require_permission(Permission.REQUIRE_ORGANIZATION))], +) async def get_thread( thread_id: str, - db: Session = Depends(get_db), - _current_user: UserOrganization = Depends(get_current_user_org), + db: SessionDep, + _current_user: AuthContextDep, ): """ Retrieve the result of a previously started OpenAI thread using its thread ID. 
@@ -490,7 +502,7 @@ async def get_thread( if not result: logger.error( - f"[get_thread] Thread result not found for ID: {mask_string(thread_id)} | org_id: {_current_user.organization_id}" + f"[get_thread] Thread result not found for ID: {mask_string(thread_id)} | org_id: {_current_user.organization_.id}" ) raise HTTPException(404, "thread not found") diff --git a/backend/app/api/routes/users.py b/backend/app/api/routes/users.py index 5dd83fa34..ba13a6c1c 100644 --- a/backend/app/api/routes/users.py +++ b/backend/app/api/routes/users.py @@ -5,10 +5,10 @@ from sqlmodel import func, select from app.api.deps import ( - CurrentUser, + AuthContextDep, SessionDep, - get_current_active_superuser, ) +from app.api.permissions import Permission, require_permission from app.core.config import settings from app.core.security import get_password_hash, verify_password from app.crud import create_user, get_user_by_email, update_user @@ -27,12 +27,12 @@ from app.core.exception_handlers import HTTPException logger = logging.getLogger(__name__) -router = APIRouter(prefix="/users", tags=["users"]) +router = APIRouter(prefix="/users", tags=["Users"]) @router.get( "/", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], response_model=UsersPublic, include_in_schema=False, ) @@ -44,7 +44,7 @@ def read_users(session: SessionDep, skip: int = 0, limit: int = 100) -> Any: @router.post( "/", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], response_model=UserPublic, include_in_schema=False, ) @@ -74,8 +74,9 @@ def create_user_endpoint(*, session: SessionDep, user_in: UserCreate) -> Any: @router.patch("/me", response_model=UserPublic) def update_user_me( - *, session: SessionDep, user_in: UserUpdateMe, current_user: CurrentUser + *, session: SessionDep, user_in: UserUpdateMe, current_user_dep: AuthContextDep ) -> Any: + current_user = current_user_dep.user if user_in.email: existing_user = get_user_by_email(session=session, email=user_in.email) if existing_user and existing_user.id != current_user.id: @@ -96,8 +97,9 @@ def update_user_me( @router.patch("/me/password", response_model=Message) def update_password_me( - *, session: SessionDep, body: UpdatePassword, current_user: CurrentUser + *, session: SessionDep, body: UpdatePassword, current_user_dep: AuthContextDep ) -> Any: + current_user = current_user_dep.user if not verify_password(body.current_password, current_user.hashed_password): raise HTTPException(status_code=400, detail="Incorrect password") @@ -115,12 +117,13 @@ def update_password_me( @router.get("/me", response_model=UserPublic) -def read_user_me(current_user: CurrentUser) -> Any: - return current_user +def read_user_me(current_user_dep: AuthContextDep) -> Any: + return current_user_dep.user @router.delete("/me", response_model=Message) -def delete_user_me(session: SessionDep, current_user: CurrentUser) -> Any: +def delete_user_me(session: SessionDep, current_user_dep: AuthContextDep) -> Any: + current_user = current_user_dep.user if current_user.is_superuser: logger.error( f"[delete_user_me] Attempting to delete superuser account by itself | user_id: {current_user.id}" @@ -136,7 +139,7 @@ def delete_user_me(session: SessionDep, current_user: CurrentUser) -> Any: @router.post( "/signup", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], response_model=UserPublic, ) def 
register_user(session: SessionDep, user_in: UserRegister) -> Any: @@ -158,13 +161,13 @@ def register_user(session: SessionDep, user_in: UserRegister) -> Any: @router.get("/{user_id}", response_model=UserPublic, include_in_schema=False) def read_user_by_id( - user_id: int, session: SessionDep, current_user: CurrentUser + user_id: int, session: SessionDep, current_user: AuthContextDep ) -> Any: user = session.get(User, user_id) - if user == current_user: + if user == current_user.user: return user - if not current_user.is_superuser: + if not current_user.user.is_superuser: raise HTTPException( status_code=403, detail="The user doesn't have enough privileges", @@ -175,7 +178,7 @@ def read_user_by_id( @router.patch( "/{user_id}", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], response_model=UserPublic, include_in_schema=False, ) @@ -208,20 +211,20 @@ def update_user_endpoint( @router.delete( "/{user_id}", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], include_in_schema=False, ) def delete_user( - session: SessionDep, current_user: CurrentUser, user_id: int + session: SessionDep, current_user: AuthContextDep, user_id: int ) -> Message: user = session.get(User, user_id) if not user: logger.error(f"[delete_user] User not found | user_id: {user_id}") raise HTTPException(status_code=404, detail="User not found") - if user == current_user: + if user == current_user.user: logger.error( - f"[delete_user] Attempting to delete self by superuser | user_id: {current_user.id}" + f"[delete_user] Attempting to delete self by superuser | user_id: {current_user.user.id}" ) raise HTTPException( status_code=403, detail="Super users are not allowed to delete themselves" diff --git a/backend/app/api/routes/utils.py b/backend/app/api/routes/utils.py index 294f729c7..56247b304 100644 --- a/backend/app/api/routes/utils.py +++ b/backend/app/api/routes/utils.py @@ -1,16 +1,16 @@ from fastapi import APIRouter, Depends from pydantic.networks import EmailStr -from app.api.deps import get_current_active_superuser from app.models import Message from app.utils import generate_test_email, send_email +from app.api.permissions import Permission, require_permission router = APIRouter(prefix="/utils", tags=["utils"]) @router.post( "/test-email/", - dependencies=[Depends(get_current_active_superuser)], + dependencies=[Depends(require_permission(Permission.SUPERUSER))], status_code=201, include_in_schema=False, ) diff --git a/backend/app/core/cloud/storage.py b/backend/app/core/cloud/storage.py index 95c7f0ddf..a3247b74b 100644 --- a/backend/app/core/cloud/storage.py +++ b/backend/app/core/cloud/storage.py @@ -14,7 +14,6 @@ from botocore.response import StreamingBody from app.crud import get_project_by_id -from app.models import UserProjectOrg from app.core.config import settings from app.utils import mask_string diff --git a/backend/app/core/config.py b/backend/app/core/config.py index d318ce98f..40c770541 100644 --- a/backend/app/core/config.py +++ b/backend/app/core/config.py @@ -42,6 +42,7 @@ class Settings(BaseSettings): ] = "development" PROJECT_NAME: str + API_VERSION: str = "0.5.0" SENTRY_DSN: HttpUrl | None = None POSTGRES_SERVER: str POSTGRES_PORT: int = 5432 diff --git a/backend/app/core/langfuse/langfuse.py b/backend/app/core/langfuse/langfuse.py index 8fd0aa8bd..287790131 100644 --- a/backend/app/core/langfuse/langfuse.py +++ b/backend/app/core/langfuse/langfuse.py @@ 
-1,10 +1,12 @@ import uuid import logging -from typing import Any, Dict, Optional +from typing import Any, Callable, Dict, Optional +from functools import wraps from asgi_correlation_id import correlation_id from langfuse import Langfuse from langfuse.client import StatefulGenerationClient, StatefulTraceClient +from app.models.llm import NativeCompletionConfig, QueryParams, LLMCallResponse logger = logging.getLogger(__name__) @@ -107,3 +109,104 @@ def log_error(self, error_message: str, response_id: Optional[str] = None): def flush(self): self.langfuse.flush() + + +def observe_llm_execution( + session_id: str | None = None, + credentials: dict | None = None, +): + """Decorator to add Langfuse observability to LLM provider execute methods. + + Args: + credentials: Langfuse credentials with public_key, secret_key, and host + session_id: Session ID for grouping traces (conversation_id) + + Usage: + decorated_execute = observe_llm_execution( + credentials=langfuse_creds, + session_id=conversation_id + )(provider_instance.execute) + """ + + def decorator(func: Callable) -> Callable: + @wraps(func) + def wrapper( + completion_config: NativeCompletionConfig, query: QueryParams, **kwargs + ): + # Skip observability if no credentials provided + if not credentials: + logger.info("[Langfuse] No credentials - skipping observability") + return func(completion_config, query, **kwargs) + + try: + langfuse = Langfuse( + public_key=credentials.get("public_key"), + secret_key=credentials.get("secret_key"), + host=credentials.get("host"), + ) + except Exception as e: + logger.warning(f"[Langfuse] Failed to initialize client: {e}") + return func(completion_config, query, **kwargs) + + trace = langfuse.trace( + name="unified-llm-call", + input=query.input, + tags=[completion_config.provider], + ) + + generation = trace.generation( + name=f"{completion_config.provider}-completion", + input=query.input, + model=completion_config.params.get("model"), + ) + + try: + # Execute the actual LLM call + response: LLMCallResponse | None + error: str | None + response, error = func(completion_config, query, **kwargs) + + if response: + generation.end( + output={ + "status": "success", + "output": response.response.output.text, + }, + usage_details={ + "input": response.usage.input_tokens, + "output": response.usage.output_tokens, + }, + model=response.response.model, + ) + + trace.update( + output={ + "status": "success", + "output": response.response.output.text, + }, + session_id=session_id or response.response.conversation_id, + ) + else: + error_msg = error or "Unknown error" + generation.end(output={"error": error_msg}) + trace.update( + output={"status": "failure", "error": error_msg}, + session_id=session_id, + ) + + langfuse.flush() + return response, error + + except Exception as e: + error_msg = str(e) + generation.end(output={"error": error_msg}) + trace.update( + output={"status": "failure", "error": error_msg}, + session_id=session_id, + ) + langfuse.flush() + raise + + return wrapper + + return decorator diff --git a/backend/app/core/security.py b/backend/app/core/security.py index 8807d7052..8cee6e982 100644 --- a/backend/app/core/security.py +++ b/backend/app/core/security.py @@ -110,44 +110,6 @@ def get_password_hash(password: str) -> str: return pwd_context.hash(password) -def encrypt_api_key(api_key: str) -> str: - """ - Encrypt an API key before storage. 
- - Args: - api_key: The plain text API key to encrypt - - Returns: - str: The encrypted API key - - Raises: - ValueError: If encryption fails - """ - try: - return get_fernet().encrypt(api_key.encode()).decode() - except Exception as e: - raise ValueError(f"Failed to encrypt API key: {e}") - - -def decrypt_api_key(encrypted_api_key: str) -> str: - """ - Decrypt an API key when retrieving it. - - Args: - encrypted_api_key: The encrypted API key to decrypt - - Returns: - str: The decrypted API key - - Raises: - ValueError: If decryption fails - """ - try: - return get_fernet().decrypt(encrypted_api_key.encode()).decode() - except Exception as e: - raise ValueError(f"Failed to decrypt API key: {e}") - - def encrypt_credentials(credentials: dict) -> str: """ Encrypt the entire credentials object before storage. diff --git a/backend/app/crud/evaluations/langfuse.py b/backend/app/crud/evaluations/langfuse.py index 1dfdd21d4..01a1104ae 100644 --- a/backend/app/crud/evaluations/langfuse.py +++ b/backend/app/crud/evaluations/langfuse.py @@ -9,6 +9,7 @@ """ import logging +from concurrent.futures import ThreadPoolExecutor, as_completed from typing import Any import numpy as np @@ -247,42 +248,55 @@ def upload_dataset_to_langfuse( f"duplication_factor={duplication_factor}" ) + def upload_item(item: dict[str, str], duplicate_num: int) -> bool: + try: + langfuse.create_dataset_item( + dataset_name=dataset_name, + input={"question": item["question"]}, + expected_output={"answer": item["answer"]}, + metadata={ + "original_question": item["question"], + "duplicate_number": duplicate_num + 1, + "duplication_factor": duplication_factor, + }, + ) + return True + except Exception as e: + logger.error( + f"[upload_dataset_to_langfuse] Failed to upload item | " + f"duplicate={duplicate_num + 1} | " + f"question={item['question'][:50]}... | {e}" + ) + return False + try: # Create or get dataset in Langfuse dataset = langfuse.create_dataset(name=dataset_name) - # Upload items with duplication + upload_tasks = [ + (item, duplicate_num) + for item in items + for duplicate_num in range(duplication_factor) + ] + + # Upload items concurrently using ThreadPoolExecutor total_uploaded = 0 - for item in items: - # Duplicate each item N times - for duplicate_num in range(duplication_factor): - try: - langfuse.create_dataset_item( - dataset_name=dataset_name, - input={"question": item["question"]}, - expected_output={"answer": item["answer"]}, - metadata={ - "original_question": item["question"], - "duplicate_number": duplicate_num + 1, - "duplication_factor": duplication_factor, - }, - ) + with ThreadPoolExecutor(max_workers=4) as executor: + # Submit all upload tasks and collect the futures + futures = [] + for item, dup_num in upload_tasks: + future = executor.submit(upload_item, item, dup_num) + futures.append(future) + + for future in as_completed(futures): + upload_successful = future.result() + if upload_successful: total_uploaded += 1 - except Exception as e: - logger.error( - f"[upload_dataset_to_langfuse] Failed to upload item | " - f"duplicate={duplicate_num + 1} | " - f"question={item['question'][:50]}... 
| {e}" - ) - - # Flush after each original item's duplicates to prevent race conditions - # in Langfuse SDK's internal batching that could mix up Q&A pairs - langfuse.flush() # Final flush to ensure all items are uploaded langfuse.flush() - langfuse_dataset_id = dataset.id if hasattr(dataset, "id") else None + langfuse_dataset_id = dataset.id logger.info( f"[upload_dataset_to_langfuse] Successfully uploaded to Langfuse | " diff --git a/backend/app/main.py b/backend/app/main.py index 0b4734845..47a27f371 100644 --- a/backend/app/main.py +++ b/backend/app/main.py @@ -2,8 +2,10 @@ from fastapi import FastAPI from fastapi.routing import APIRoute +from fastapi.openapi.utils import get_openapi from asgi_correlation_id.middleware import CorrelationIdMiddleware from app.api.main import api_router +from app.api.docs.openapi_config import tags_metadata, customize_openapi_schema from app.core.config import settings from app.core.exception_handlers import register_exception_handlers from app.core.middleware import http_request_logger @@ -25,8 +27,31 @@ def custom_generate_unique_id(route: APIRoute) -> str: title=settings.PROJECT_NAME, openapi_url=f"{settings.API_V1_STR}/openapi.json", generate_unique_id_function=custom_generate_unique_id, + description="**Responsible AI for the development sector**", ) + +def custom_openapi(): + if app.openapi_schema: + return app.openapi_schema + + openapi_schema = get_openapi( + title=app.title, + version=settings.API_VERSION, + openapi_version=app.openapi_version, + description=app.description, + routes=app.routes, + tags=tags_metadata, + ) + + openapi_schema = customize_openapi_schema(openapi_schema) + + app.openapi_schema = openapi_schema + return app.openapi_schema + + +app.openapi = custom_openapi + app.middleware("http")(http_request_logger) app.add_middleware(CorrelationIdMiddleware) diff --git a/backend/app/models/__init__.py b/backend/app/models/__init__.py index 9a3518251..ac7e89d6c 100644 --- a/backend/app/models/__init__.py +++ b/backend/app/models/__init__.py @@ -142,8 +142,6 @@ NewPassword, User, UserCreate, - UserOrganization, - UserProjectOrg, UserPublic, UserRegister, UserUpdate, diff --git a/backend/app/models/auth.py b/backend/app/models/auth.py index adb93aeb9..26b42ef8a 100644 --- a/backend/app/models/auth.py +++ b/backend/app/models/auth.py @@ -2,6 +2,7 @@ from app.models.user import User from app.models.organization import Organization from app.models.project import Project +from typing import TYPE_CHECKING # JSON payload containing access token @@ -19,3 +20,17 @@ class AuthContext(SQLModel): user: User organization: Organization | None = None project: Project | None = None + + @property + def organization_(self) -> Organization: + """Non-optional organization - raises if None""" + if self.organization is None: + raise ValueError("Organization is required but was None") + return self.organization + + @property + def project_(self) -> Project: + """Non-optional project - raises if None""" + if self.project is None: + raise ValueError("Project is required but was None") + return self.project diff --git a/backend/app/models/collection.py b/backend/app/models/collection.py index 353deef00..57e5a17bb 100644 --- a/backend/app/models/collection.py +++ b/backend/app/models/collection.py @@ -1,5 +1,5 @@ from datetime import datetime -from typing import Any +from typing import Any, Literal from uuid import UUID, uuid4 from pydantic import HttpUrl, model_validator @@ -28,7 +28,7 @@ class Collection(SQLModel, table=True): ) llm_service_name: str = Field( 
nullable=False, - sa_column_kwargs={"comment": "Name of the LLM service provider"}, + sa_column_kwargs={"comment": "Name of the LLM service"}, ) # Foreign keys @@ -89,7 +89,7 @@ class AssistantOptions(SQLModel): model: str | None = Field( default=None, description=( - "**[To Be Deprecated]** " + "**[Deprecated]** " "OpenAI model to attach to this assistant. The model " "must be compatable with the assistants API; see the " "OpenAI [model documentation](https://platform.openai.com/docs/models/compare) for more." @@ -99,7 +99,7 @@ class AssistantOptions(SQLModel): instructions: str | None = Field( default=None, description=( - "**[To Be Deprecated]** " + "**[Deprecated]** " "Assistant instruction. Sometimes referred to as the " '"system" prompt.' ), @@ -107,7 +107,7 @@ class AssistantOptions(SQLModel): temperature: float = Field( default=1e-6, description=( - "**[To Be Deprecated]** " + "**[Deprecated]** " "Model temperature. The default is slightly " "greater-than zero because it is [unknown how OpenAI " "handles zero](https://community.openai.com/t/clarifications-on-setting-temperature-0/886447/5)." @@ -145,8 +145,17 @@ class CallbackRequest(SQLModel): ) +class ProviderOptions(SQLModel): + """LLM provider configuration.""" + + provider: Literal["openai"] = Field( + default="openai", description="LLM provider to use for this collection" + ) + + class CreationRequest( DocumentOptions, + ProviderOptions, AssistantOptions, CallbackRequest, ): diff --git a/backend/app/models/llm/__init__.py b/backend/app/models/llm/__init__.py index f06954de5..8738e2126 100644 --- a/backend/app/models/llm/__init__.py +++ b/backend/app/models/llm/__init__.py @@ -3,5 +3,8 @@ CompletionConfig, QueryParams, ConfigBlob, + KaapiLLMParams, + KaapiCompletionConfig, + NativeCompletionConfig, ) from app.models.llm.response import LLMCallResponse, LLMResponse, LLMOutput, Usage diff --git a/backend/app/models/llm/request.py b/backend/app/models/llm/request.py index a63de1ebc..fc44235f9 100644 --- a/backend/app/models/llm/request.py +++ b/backend/app/models/llm/request.py @@ -1,8 +1,44 @@ -from typing import Any, Literal +from typing import Annotated, Any, Literal, Union from uuid import UUID from sqlmodel import Field, SQLModel -from pydantic import model_validator, HttpUrl +from pydantic import Discriminator, model_validator, HttpUrl + + +class KaapiLLMParams(SQLModel): + """ + Kaapi-abstracted parameters for LLM providers. + These parameters are mapped internally to provider-specific API parameters. + Provides a unified contract across all LLM providers (OpenAI, Claude, Gemini, etc.). + Provider-specific mappings are handled at the mapper level. 
+ """ + + model: str = Field( + description="Model identifier to use for completion (e.g., 'gpt-4o', 'gpt-5')", + ) + instructions: str | None = Field( + default=None, + description="System instructions to guide the model's behavior", + ) + knowledge_base_ids: list[str] | None = Field( + default=None, + description="List of vector store IDs to use for knowledge retrieval", + ) + reasoning: Literal["low", "medium", "high"] | None = Field( + default=None, + description="Reasoning configuration or instructions", + ) + temperature: float | None = Field( + default=None, + ge=0.0, + le=2.0, + description="Sampling temperature between 0 and 2", + ) + max_num_results: int | None = Field( + default=None, + ge=1, + description="Maximum number of results to return", + ) class ConversationConfig(SQLModel): @@ -46,11 +82,16 @@ class QueryParams(SQLModel): ) -class CompletionConfig(SQLModel): - """Completion configuration with provider and parameters.""" +class NativeCompletionConfig(SQLModel): + """ + Native provider configuration (pass-through). + All parameters are forwarded as-is to the provider's API without transformation. + Supports any LLM provider's native API format. + """ - provider: Literal["openai"] = Field( - default="openai", description="LLM provider to use" + provider: Literal["openai-native"] = Field( + default="openai-native", + description="Native provider type (e.g., openai-native)", ) params: dict[str, Any] = Field( ..., @@ -58,6 +99,27 @@ class CompletionConfig(SQLModel): ) +class KaapiCompletionConfig(SQLModel): + """ + Kaapi abstraction for LLM completion providers. + Uses standardized Kaapi parameters that are mapped to provider-specific APIs internally. + Supports multiple providers: OpenAI, Claude, Gemini, etc. + """ + + provider: Literal["openai"] = Field(..., description="LLM provider (openai)") + params: KaapiLLMParams = Field( + ..., + description="Kaapi-standardized parameters mapped to provider-specific API", + ) + + +# Discriminated union for completion configs based on provider field +CompletionConfig = Annotated[ + Union[NativeCompletionConfig, KaapiCompletionConfig], + Field(discriminator="provider"), +] + + class ConfigBlob(SQLModel): """Raw JSON blob of config.""" diff --git a/backend/app/models/user.py b/backend/app/models/user.py index b3d309741..f38aafc2a 100644 --- a/backend/app/models/user.py +++ b/backend/app/models/user.py @@ -75,15 +75,6 @@ class User(UserBase, table=True): ) -class UserOrganization(UserBase): - id: int - organization_id: int | None - - -class UserProjectOrg(UserOrganization): - project_id: int - - # Properties to return via API, id is always required class UserPublic(UserBase): id: int diff --git a/backend/app/services/doctransform/job.py b/backend/app/services/doctransform/job.py index 8245b213b..3018ffc4b 100644 --- a/backend/app/services/doctransform/job.py +++ b/backend/app/services/doctransform/job.py @@ -22,7 +22,6 @@ DocTransformationJob, ) from app.core.cloud import get_cloud_storage -from app.api.deps import CurrentUserOrgProject from app.celery.utils import start_low_priority_job from app.utils import send_callback, APIResponse from app.services.doctransform.registry import convert_document, FORMAT_TO_EXTENSION @@ -33,19 +32,17 @@ def start_job( db: Session, - current_user: CurrentUserOrgProject, + project_id: int, job_id: UUID, transformer_name: str, target_format: str, callback_url: str | None, ) -> str: trace_id = correlation_id.get() or "N/A" - job_crud = DocTransformationJobCrud(db, project_id=current_user.project_id) + 
job_crud = DocTransformationJobCrud(db, project_id=project_id) job_crud.update(job_id, DocTransformJobUpdate(trace_id=trace_id)) job = job_crud.read_one(job_id) - project_id = current_user.project_id - task_id = start_low_priority_job( function_path="app.services.doctransform.job.execute_job", project_id=project_id, diff --git a/backend/app/services/documents/helpers.py b/backend/app/services/documents/helpers.py index 7319830bf..cd941eb55 100644 --- a/backend/app/services/documents/helpers.py +++ b/backend/app/services/documents/helpers.py @@ -70,7 +70,6 @@ def schedule_transformation( *, session, project_id: int, - current_user, source_format: str, target_format: str | None, actual_transformer: str | None, @@ -92,7 +91,7 @@ def schedule_transformation( transformation_job_id = transformation_job.start_job( db=session, job_id=job.id, - current_user=current_user, + project_id=project_id, transformer_name=actual_transformer, target_format=target_format, callback_url=callback_url, diff --git a/backend/app/services/llm/jobs.py b/backend/app/services/llm/jobs.py index a8ad9d83c..773fe7fc3 100644 --- a/backend/app/services/llm/jobs.py +++ b/backend/app/services/llm/jobs.py @@ -7,12 +7,15 @@ from app.core.db import engine from app.crud.config import ConfigVersionCrud +from app.crud.credentials import get_provider_credential from app.crud.jobs import JobCrud from app.models import JobStatus, JobType, JobUpdate, LLMCallRequest -from app.models.llm.request import ConfigBlob, LLMCallConfig +from app.models.llm.request import ConfigBlob, LLMCallConfig, KaapiCompletionConfig from app.utils import APIResponse, send_callback from app.celery.utils import start_high_priority_job +from app.core.langfuse.langfuse import observe_llm_execution from app.services.llm.providers.registry import get_llm_provider +from app.services.llm.mappers import transform_kaapi_config_to_native logger = logging.getLogger(__name__) @@ -168,10 +171,27 @@ def execute_job( else: config_blob = config.blob + try: + # Transform Kaapi config to native config if needed (before getting provider) + completion_config = config_blob.completion + if isinstance(completion_config, KaapiCompletionConfig): + completion_config, warnings = transform_kaapi_config_to_native( + completion_config + ) + if request.request_metadata is None: + request.request_metadata = {} + request.request_metadata.setdefault("warnings", []).extend(warnings) + except Exception as e: + callback_response = APIResponse.failure_response( + error=f"Error processing configuration: {str(e)}", + metadata=request.request_metadata, + ) + return handle_job_error(job_id, request.callback_url, callback_response) + try: provider_instance = get_llm_provider( session=session, - provider_type=config_blob.completion.provider, + provider_type=completion_config.provider, # Now always native provider type project_id=project_id, organization_id=organization_id, ) @@ -182,8 +202,26 @@ def execute_job( ) return handle_job_error(job_id, request.callback_url, callback_response) - response, error = provider_instance.execute( - completion_config=config_blob.completion, + langfuse_credentials = get_provider_credential( + session=session, + org_id=organization_id, + project_id=project_id, + provider="langfuse", + ) + + # Extract conversation_id for langfuse session grouping + conversation_id = None + if request.query.conversation and request.query.conversation.id: + conversation_id = request.query.conversation.id + + # Apply Langfuse observability decorator to provider execute method + 
decorated_execute = observe_llm_execution( + credentials=langfuse_credentials, + session_id=conversation_id, + )(provider_instance.execute) + + response, error = decorated_execute( + completion_config=completion_config, query=request.query, include_provider_raw_response=request.include_provider_raw_response, ) diff --git a/backend/app/services/llm/mappers.py b/backend/app/services/llm/mappers.py new file mode 100644 index 000000000..9e076aa9a --- /dev/null +++ b/backend/app/services/llm/mappers.py @@ -0,0 +1,94 @@ +"""Parameter mappers for converting Kaapi-abstracted parameters to provider-specific formats.""" + +import litellm +from app.models.llm import KaapiLLMParams, KaapiCompletionConfig, NativeCompletionConfig + + +def map_kaapi_to_openai_params(kaapi_params: KaapiLLMParams) -> tuple[dict, list[str]]: + """Map Kaapi-abstracted parameters to OpenAI API parameters. + + This mapper transforms standardized Kaapi parameters into OpenAI-specific + parameter format, enabling provider-agnostic interface design. + + Args: + kaapi_params: KaapiLLMParams instance with standardized parameters + + Supported Mapping: + - model β†’ model + - instructions β†’ instructions + - knowledge_base_ids β†’ tools[file_search].vector_store_ids + - max_num_results β†’ tools[file_search].max_num_results (fallback default) + - reasoning β†’ reasoning.effort (if reasoning supported by model else suppressed) + - temperature β†’ temperature (if reasoning not supported by model else suppressed) + + Returns: + Tuple of: + - Dictionary of OpenAI API parameters ready to be passed to the API + - List of warnings describing suppressed or ignored parameters + """ + openai_params = {} + warnings = [] + + support_reasoning = litellm.supports_reasoning( + model="openai/" + f"{kaapi_params.model}" + ) + + # Handle reasoning vs temperature mutual exclusivity + if support_reasoning: + if kaapi_params.reasoning is not None: + openai_params["reasoning"] = {"effort": kaapi_params.reasoning} + + if kaapi_params.temperature is not None: + warnings.append( + "Parameter 'temperature' was suppressed because the selected model " + "supports reasoning, and temperature is ignored when reasoning is enabled." + ) + else: + if kaapi_params.reasoning is not None: + warnings.append( + "Parameter 'reasoning' was suppressed because the selected model " + "does not support reasoning." + ) + + if kaapi_params.temperature is not None: + openai_params["temperature"] = kaapi_params.temperature + + if kaapi_params.model: + openai_params["model"] = kaapi_params.model + + if kaapi_params.instructions: + openai_params["instructions"] = kaapi_params.instructions + + if kaapi_params.knowledge_base_ids: + openai_params["tools"] = [ + { + "type": "file_search", + "vector_store_ids": kaapi_params.knowledge_base_ids, + "max_num_results": kaapi_params.max_num_results or 20, + } + ] + + return openai_params, warnings + + +def transform_kaapi_config_to_native( + kaapi_config: KaapiCompletionConfig, +) -> tuple[NativeCompletionConfig, list[str]]: + """Transform Kaapi completion config to native provider config with mapped parameters. + + Currently supports OpenAI. Future: Claude, Gemini mappers. 
+ + Args: + kaapi_config: KaapiCompletionConfig with abstracted parameters + + Returns: + NativeCompletionConfig with provider-native parameters ready for API + """ + if kaapi_config.provider == "openai": + mapped_params, warnings = map_kaapi_to_openai_params(kaapi_config.params) + return ( + NativeCompletionConfig(provider="openai-native", params=mapped_params), + warnings, + ) + + raise ValueError(f"Unsupported provider: {kaapi_config.provider}") diff --git a/backend/app/services/llm/providers/base.py b/backend/app/services/llm/providers/base.py index ecca36fcc..827f25910 100644 --- a/backend/app/services/llm/providers/base.py +++ b/backend/app/services/llm/providers/base.py @@ -7,7 +7,7 @@ from abc import ABC, abstractmethod from typing import Any -from app.models.llm import CompletionConfig, LLMCallResponse, QueryParams +from app.models.llm import NativeCompletionConfig, LLMCallResponse, QueryParams class BaseProvider(ABC): @@ -34,7 +34,7 @@ def __init__(self, client: Any): @abstractmethod def execute( self, - completion_config: CompletionConfig, + completion_config: NativeCompletionConfig, query: QueryParams, include_provider_raw_response: bool = False, ) -> tuple[LLMCallResponse | None, str | None]: @@ -43,7 +43,7 @@ def execute( Directly passes the user's config params to provider API along with input. Args: - completion_config: LLM completion configuration + completion_config: LLM completion configuration, pass params as-is to provider API query: Query parameters including input and conversation_id include_provider_raw_response: Whether to include the raw LLM provider response in the output diff --git a/backend/app/services/llm/providers/openai.py b/backend/app/services/llm/providers/openai.py index f24094a86..34e35e17e 100644 --- a/backend/app/services/llm/providers/openai.py +++ b/backend/app/services/llm/providers/openai.py @@ -5,7 +5,7 @@ from openai.types.responses.response import Response from app.models.llm import ( - CompletionConfig, + NativeCompletionConfig, LLMCallResponse, QueryParams, LLMOutput, @@ -30,7 +30,7 @@ def __init__(self, client: OpenAI): def execute( self, - completion_config: CompletionConfig, + completion_config: NativeCompletionConfig, query: QueryParams, include_provider_raw_response: bool = False, ) -> tuple[LLMCallResponse | None, str | None]: diff --git a/backend/app/services/llm/providers/registry.py b/backend/app/services/llm/providers/registry.py index 64b32c436..a5cfb4bb8 100644 --- a/backend/app/services/llm/providers/registry.py +++ b/backend/app/services/llm/providers/registry.py @@ -12,15 +12,16 @@ class LLMProvider: - OPENAI = "openai" - # Future constants: - # ANTHROPIC = "anthropic" - # GOOGLE = "google" + OPENAI_NATIVE = "openai-native" + # Future constants for native providers: + # CLAUDE_NATIVE = "claude-native" + # GEMINI_NATIVE = "gemini-native" _registry: dict[str, type[BaseProvider]] = { - OPENAI: OpenAIProvider, - # ANTHROPIC: AnthropicProvider, - # GOOGLE: GoogleProvider, + OPENAI_NATIVE: OpenAIProvider, + # Future native providers: + # CLAUDE_NATIVE: ClaudeProvider, + # GEMINI_NATIVE: GeminiProvider, } @classmethod @@ -45,19 +46,22 @@ def get_llm_provider( ) -> BaseProvider: provider_class = LLMProvider.get(provider_type) + # e.g., "openai-native" β†’ "openai", "claude-native" β†’ "claude" + credential_provider = provider_type.replace("-native", "") + credentials = get_provider_credential( session=session, - provider=provider_type, + provider=credential_provider, project_id=project_id, org_id=organization_id, ) if not credentials: 
raise ValueError( - f"Credentials for provider '{provider_type}' not configured for this project." + f"Credentials for provider '{credential_provider}' not configured for this project." ) - if provider_type == LLMProvider.OPENAI: + if provider_type == LLMProvider.OPENAI_NATIVE: if "api_key" not in credentials: raise ValueError("OpenAI credentials not configured for this project.") client = OpenAI(api_key=credentials["api_key"]) diff --git a/backend/app/tests/api/routes/configs/test_config.py b/backend/app/tests/api/routes/configs/test_config.py index 8f094f538..a30d162a0 100644 --- a/backend/app/tests/api/routes/configs/test_config.py +++ b/backend/app/tests/api/routes/configs/test_config.py @@ -19,7 +19,7 @@ def test_create_config_success( "description": "A test LLM configuration", "config_blob": { "completion": { - "provider": "openai", + "provider": "openai-native", "params": { "model": "gpt-4", "temperature": 0.8, diff --git a/backend/app/tests/api/routes/configs/test_version.py b/backend/app/tests/api/routes/configs/test_version.py index acb9f2526..27fb14ba2 100644 --- a/backend/app/tests/api/routes/configs/test_version.py +++ b/backend/app/tests/api/routes/configs/test_version.py @@ -10,7 +10,8 @@ create_test_project, create_test_version, ) -from app.models import ConfigBlob, CompletionConfig +from app.models import ConfigBlob +from app.models.llm.request import NativeCompletionConfig def test_create_version_success( @@ -28,7 +29,7 @@ def test_create_version_success( version_data = { "config_blob": { "completion": { - "provider": "openai", + "provider": "openai-native", "params": { "model": "gpt-4-turbo", "temperature": 0.9, @@ -303,8 +304,8 @@ def test_get_version_by_number( config_id=config.id, project_id=user_api_key.project_id, config_blob=ConfigBlob( - completion=CompletionConfig( - provider="openai", + completion=NativeCompletionConfig( + provider="openai-native", params={"model": "gpt-4-turbo", "temperature": 0.5}, ) ), diff --git a/backend/app/tests/api/routes/test_llm.py b/backend/app/tests/api/routes/test_llm.py index 430ca77c0..78adffebf 100644 --- a/backend/app/tests/api/routes/test_llm.py +++ b/backend/app/tests/api/routes/test_llm.py @@ -6,6 +6,9 @@ LLMCallConfig, CompletionConfig, ConfigBlob, + KaapiLLMParams, + KaapiCompletionConfig, + NativeCompletionConfig, ) @@ -18,8 +21,8 @@ def test_llm_call_success(client: TestClient, user_api_key_header: dict[str, str query=QueryParams(input="What is the capital of France?"), config=LLMCallConfig( blob=ConfigBlob( - completion=CompletionConfig( - provider="openai", + completion=NativeCompletionConfig( + provider="openai-native", params={ "model": "gpt-4", "temperature": 0.7, @@ -43,3 +46,117 @@ def test_llm_call_success(client: TestClient, user_api_key_header: dict[str, str assert "response is being generated" in response_data["data"]["message"] mock_start_job.assert_called_once() + + +def test_llm_call_with_kaapi_config( + client: TestClient, user_api_key_header: dict[str, str] +): + """Test LLM call with Kaapi abstracted config.""" + with patch("app.services.llm.jobs.start_high_priority_job") as mock_start_job: + mock_start_job.return_value = "test-task-id" + + payload = LLMCallRequest( + query=QueryParams(input="Explain quantum computing"), + config=LLMCallConfig( + blob=ConfigBlob( + completion=KaapiCompletionConfig( + provider="openai", + params=KaapiLLMParams( + model="gpt-4o", + instructions="You are a physics expert", + temperature=0.5, + ), + ) + ) + ), + ) + + response = client.post( + "api/v1/llm/call", + 
json=payload.model_dump(mode="json"), + headers=user_api_key_header, + ) + + assert response.status_code == 200 + response_data = response.json() + assert response_data["success"] is True + mock_start_job.assert_called_once() + + +def test_llm_call_with_native_config( + client: TestClient, user_api_key_header: dict[str, str] +): + """Test LLM call with native OpenAI config (pass-through mode).""" + with patch("app.services.llm.jobs.start_high_priority_job") as mock_start_job: + mock_start_job.return_value = "test-task-id" + + payload = LLMCallRequest( + query=QueryParams(input="Native API call test"), + config=LLMCallConfig( + blob=ConfigBlob( + completion=NativeCompletionConfig( + provider="openai-native", + params={ + "model": "gpt-4", + "temperature": 0.9, + "max_tokens": 500, + "top_p": 1.0, + }, + ) + ) + ), + ) + + response = client.post( + "api/v1/llm/call", + json=payload.model_dump(mode="json"), + headers=user_api_key_header, + ) + + assert response.status_code == 200 + response_data = response.json() + assert response_data["success"] is True + mock_start_job.assert_called_once() + + +def test_llm_call_missing_config( + client: TestClient, user_api_key_header: dict[str, str] +): + """Test LLM call with missing config fails validation.""" + payload = { + "query": {"input": "Test query"}, + # Missing config field + } + + response = client.post( + "api/v1/llm/call", + json=payload, + headers=user_api_key_header, + ) + + assert response.status_code == 422 # Validation error + + +def test_llm_call_invalid_provider( + client: TestClient, user_api_key_header: dict[str, str] +): + """Test LLM call with invalid provider type.""" + payload = { + "query": {"input": "Test query"}, + "config": { + "blob": { + "completion": { + "provider": "invalid-provider", + "params": {"model": "gpt-4"}, + } + } + }, + } + + response = client.post( + "api/v1/llm/call", + json=payload, + headers=user_api_key_header, + ) + + assert response.status_code == 422 # Validation error diff --git a/backend/app/tests/api/routes/test_users.py b/backend/app/tests/api/routes/test_users.py index 4b7c3fdee..d8d936e46 100644 --- a/backend/app/tests/api/routes/test_users.py +++ b/backend/app/tests/api/routes/test_users.py @@ -470,4 +470,4 @@ def test_delete_user_without_privileges( headers=normal_user_token_headers, ) assert r.status_code == 403 - assert r.json()["error"] == "The user doesn't have enough privileges" + assert r.json()["error"] == "Insufficient permissions - require superuser access." 
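Reviewer note on the `CompletionConfig` discriminated union introduced earlier in this patch: the tests above rely on Pydantic routing the raw `completion` dict to either the native or the Kaapi model based on the `provider` field. A minimal sketch of that behavior is shown below — it assumes the models are importable as in this branch (`app.models` / `app.models.llm.request`, mirroring the test imports) and is an illustration only, not part of the patch.

```python
from pydantic import ValidationError

# Assumed importable from this branch, matching the imports used in the tests above.
from app.models import ConfigBlob
from app.models.llm.request import KaapiCompletionConfig, NativeCompletionConfig

# "openai-native" selects the pass-through config; params are forwarded verbatim.
native_blob = ConfigBlob.model_validate(
    {
        "completion": {
            "provider": "openai-native",
            "params": {"model": "gpt-4", "max_tokens": 500},
        }
    }
)
assert isinstance(native_blob.completion, NativeCompletionConfig)

# "openai" selects the Kaapi abstraction; params are validated as KaapiLLMParams.
kaapi_blob = ConfigBlob.model_validate(
    {
        "completion": {
            "provider": "openai",
            "params": {"model": "gpt-4o", "temperature": 0.5},
        }
    }
)
assert isinstance(kaapi_blob.completion, KaapiCompletionConfig)

# Any other provider value fails validation, which is what the 422 tests above rely on.
try:
    ConfigBlob.model_validate(
        {"completion": {"provider": "invalid-provider", "params": {"model": "gpt-4"}}}
    )
except ValidationError:
    print("rejected unknown provider")
```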
diff --git a/backend/app/tests/core/test_security.py b/backend/app/tests/core/test_security.py index 591013754..b21b82cec 100644 --- a/backend/app/tests/core/test_security.py +++ b/backend/app/tests/core/test_security.py @@ -4,8 +4,6 @@ from app.core.security import ( get_password_hash, verify_password, - encrypt_api_key, - decrypt_api_key, get_encryption_key, APIKeyManager, ) @@ -13,107 +11,6 @@ from app.tests.utils.test_data import create_test_api_key -def test_encrypt_decrypt_api_key(): - """Test that API key encryption and decryption works correctly.""" - # Test data - test_key = "ApiKey test123456789" - - # Encrypt the key - encrypted_key = encrypt_api_key(test_key) - - # Verify encryption worked - assert encrypted_key is not None - assert encrypted_key != test_key - assert isinstance(encrypted_key, str) - - # Decrypt the key - decrypted_key = decrypt_api_key(encrypted_key) - - # Verify decryption worked - assert decrypted_key is not None - assert decrypted_key == test_key - - -def test_api_key_format_validation(): - """Test that API key format is validated correctly.""" - # Test valid API key format - valid_key = "ApiKey test123456789" - encrypted_valid = encrypt_api_key(valid_key) - assert encrypted_valid is not None - assert decrypt_api_key(encrypted_valid) == valid_key - - # Test invalid API key format (missing prefix) - invalid_key = "test123456789" - encrypted_invalid = encrypt_api_key(invalid_key) - assert encrypted_invalid is not None - assert decrypt_api_key(encrypted_invalid) == invalid_key - - -def test_encrypt_api_key_edge_cases(): - """Test edge cases for API key encryption.""" - # Test empty string - empty_key = "" - encrypted_empty = encrypt_api_key(empty_key) - assert encrypted_empty is not None - assert decrypt_api_key(encrypted_empty) == empty_key - - # Test whitespace only - whitespace_key = " " - encrypted_whitespace = encrypt_api_key(whitespace_key) - assert encrypted_whitespace is not None - assert decrypt_api_key(encrypted_whitespace) == whitespace_key - - # Test very long input - long_key = "ApiKey " + "a" * 1000 - encrypted_long = encrypt_api_key(long_key) - assert encrypted_long is not None - assert decrypt_api_key(encrypted_long) == long_key - - -def test_encrypt_api_key_type_validation(): - """Test type validation for API key encryption.""" - # Test non-string inputs - invalid_inputs = [123, [], {}, True] - for invalid_input in invalid_inputs: - with pytest.raises(ValueError, match="Failed to encrypt API key"): - encrypt_api_key(invalid_input) - - -def test_encrypt_api_key_security(): - """Test security properties of API key encryption.""" - # Test that same input produces different encrypted output - test_key = "ApiKey test123456789" - encrypted1 = encrypt_api_key(test_key) - encrypted2 = encrypt_api_key(test_key) - assert encrypted1 != encrypted2 # Different encrypted outputs for same input - - -def test_encrypt_api_key_error_handling(): - """Test error handling in encrypt_api_key.""" - # Test with invalid input - with pytest.raises(ValueError, match="Failed to encrypt API key"): - encrypt_api_key(None) - - -def test_decrypt_api_key_error_handling(): - """Test error handling in decrypt_api_key.""" - # Test with invalid input - with pytest.raises(ValueError, match="Failed to decrypt API key"): - decrypt_api_key(None) - - # Test with various invalid encrypted data formats - invalid_encrypted_data = [ - "invalid_encrypted_data", # Not base64 - "not_a_base64_string", # Not base64 - "a" * 44, # Wrong length - "!" 
* 44, # Invalid base64 chars - "aGVsbG8=", # Valid base64 but not encrypted - ] - for invalid_data in invalid_encrypted_data: - with pytest.raises(ValueError, match="Failed to decrypt API key"): - decrypt_api_key(invalid_data) - - def test_get_encryption_key(): """Test that encryption key generation works correctly.""" # Get the encryption key diff --git a/backend/app/tests/crud/config/test_config.py b/backend/app/tests/crud/config/test_config.py index e7837b98a..e4f90e25e 100644 --- a/backend/app/tests/crud/config/test_config.py +++ b/backend/app/tests/crud/config/test_config.py @@ -10,6 +10,7 @@ ConfigCreate, ConfigUpdate, ) +from app.models.llm.request import NativeCompletionConfig from app.crud.config import ConfigCrud from app.tests.utils.test_data import create_test_project, create_test_config from app.tests.utils.utils import random_lower_string @@ -18,8 +19,8 @@ @pytest.fixture def example_config_blob(): return ConfigBlob( - completion=CompletionConfig( - provider="openai", + completion=NativeCompletionConfig( + provider="openai-native", params={ "model": "gpt-4", "temperature": 0.8, diff --git a/backend/app/tests/crud/config/test_version.py b/backend/app/tests/crud/config/test_version.py index c3c4bd582..a24940b7d 100644 --- a/backend/app/tests/crud/config/test_version.py +++ b/backend/app/tests/crud/config/test_version.py @@ -3,7 +3,8 @@ from sqlmodel import Session from fastapi import HTTPException -from app.models import ConfigVersionCreate, ConfigBlob, CompletionConfig +from app.models import ConfigVersionCreate, ConfigBlob +from app.models.llm.request import NativeCompletionConfig from app.crud.config import ConfigVersionCrud from app.tests.utils.test_data import ( create_test_project, @@ -15,8 +16,8 @@ @pytest.fixture def example_config_blob(): return ConfigBlob( - completion=CompletionConfig( - provider="openai", + completion=NativeCompletionConfig( + provider="openai-native", params={ "model": "gpt-4", "temperature": 0.8, diff --git a/backend/app/tests/crud/evaluations/test_langfuse.py b/backend/app/tests/crud/evaluations/test_langfuse.py index bd0a46606..f7713ba2d 100644 --- a/backend/app/tests/crud/evaluations/test_langfuse.py +++ b/backend/app/tests/crud/evaluations/test_langfuse.py @@ -415,8 +415,8 @@ def test_upload_dataset_to_langfuse_success(self, valid_items): # Verify dataset items were created (3 original * 5 duplicates = 15) assert mock_langfuse.create_dataset_item.call_count == 15 - # Verify flush was called (once per original item + final flush = 4 times for 3 items) - assert mock_langfuse.flush.call_count == 4 # 3 items + 1 final + # Verify flush was called once (final flush) + assert mock_langfuse.flush.call_count == 1 def test_upload_dataset_to_langfuse_duplication_metadata(self, valid_items): """Test that duplication metadata is included.""" @@ -483,8 +483,8 @@ def test_upload_dataset_to_langfuse_single_duplication(self, valid_items): assert total_items == 3 # 3 items * 1 duplication assert mock_langfuse.create_dataset_item.call_count == 3 - # 3 items + 1 final flush - assert mock_langfuse.flush.call_count == 4 + # final flush once + assert mock_langfuse.flush.call_count == 1 def test_upload_dataset_to_langfuse_item_creation_error(self, valid_items): """Test that item creation errors are logged but don't stop processing.""" diff --git a/backend/app/tests/services/doctransformer/test_job/conftest.py b/backend/app/tests/services/doctransformer/test_job/conftest.py index edba7ec9d..8787db17a 100644 --- 
a/backend/app/tests/services/doctransformer/test_job/conftest.py +++ b/backend/app/tests/services/doctransformer/test_job/conftest.py @@ -13,7 +13,7 @@ from app.crud import get_project_by_id from app.services.doctransform import job from app.core.config import settings -from app.models import Document, Project, UserProjectOrg +from app.models import Document, Project, AuthContext from app.tests.utils.document import DocumentStore from app.tests.utils.auth import TestAuthContext @@ -68,14 +68,12 @@ def fast_execute_job_func( @pytest.fixture -def current_user(db: Session, user_api_key: TestAuthContext) -> UserProjectOrg: +def current_user(db: Session, user_api_key: TestAuthContext) -> AuthContext: """Create a test user for testing.""" - api_key = user_api_key - user = api_key.user - return UserProjectOrg( - **user.model_dump(), - project_id=api_key.project_id, - organization_id=api_key.organization_id + return AuthContext( + user=user_api_key.user, + organization=user_api_key.organization, + project=user_api_key.project, ) @@ -86,10 +84,8 @@ def background_tasks() -> BackgroundTasks: @pytest.fixture -def test_document( - db: Session, current_user: UserProjectOrg -) -> Tuple[Document, Project]: +def test_document(db: Session, current_user: AuthContext) -> Tuple[Document, Project]: """Create a test document for the current user's project.""" - store = DocumentStore(db, current_user.project_id) - project = get_project_by_id(session=db, project_id=current_user.project_id) + store = DocumentStore(db, current_user.project.id) + project = get_project_by_id(session=db, project_id=current_user.project.id) return store.put(), project diff --git a/backend/app/tests/services/doctransformer/test_job/test_integration.py b/backend/app/tests/services/doctransformer/test_job/test_integration.py index 3282b772c..936bf6efb 100644 --- a/backend/app/tests/services/doctransformer/test_job/test_integration.py +++ b/backend/app/tests/services/doctransformer/test_job/test_integration.py @@ -15,7 +15,6 @@ Document, Project, TransformationStatus, - UserProjectOrg, DocTransformJobCreate, ) from app.tests.services.doctransformer.test_job.utils import ( @@ -40,13 +39,6 @@ def test_execute_job_end_to_end_workflow( job_crud = DocTransformationJobCrud(session=db, project_id=project.id) job = job_crud.create(DocTransformJobCreate(source_document_id=document.id)) - current_user = UserProjectOrg( - id=1, - email="test@example.com", - project_id=project.id, - organization_id=project.organization_id, - ) - with patch( "app.services.doctransform.job.start_low_priority_job", return_value="fake-task-id", @@ -59,7 +51,7 @@ def test_execute_job_end_to_end_workflow( returned_job_id = start_job( db=db, - current_user=current_user, + project_id=project.id, job_id=job.id, transformer_name="test", target_format="markdown", diff --git a/backend/app/tests/services/doctransformer/test_job/test_start_job.py b/backend/app/tests/services/doctransformer/test_job/test_start_job.py index 60e3dadee..1922e6730 100644 --- a/backend/app/tests/services/doctransformer/test_job/test_start_job.py +++ b/backend/app/tests/services/doctransformer/test_job/test_start_job.py @@ -16,7 +16,7 @@ DocTransformationJob, Project, TransformationStatus, - UserProjectOrg, + AuthContext, DocTransformJobCreate, ) from app.tests.services.doctransformer.test_job.utils import ( @@ -36,13 +36,13 @@ def _create_job(self, db: Session, project_id: int, source_document_id): def test_start_job_success( self, db: Session, - current_user: UserProjectOrg, + current_user: AuthContext, 
test_document: tuple[Document, Project], ) -> None: """start_job should enqueue execute_job with correct kwargs and return the same job id.""" document, _project = test_document - job = self._create_job(db, current_user.project_id, document.id) + job = self._create_job(db, current_user.project.id, document.id) with patch( "app.services.doctransform.job.start_low_priority_job" @@ -51,7 +51,7 @@ def test_start_job_success( returned_job_id = start_job( db=db, - current_user=current_user, + project_id=current_user.project.id, job_id=job.id, transformer_name="test-transformer", target_format="markdown", @@ -70,7 +70,7 @@ def test_start_job_success( mock_schedule.assert_called_once() kwargs = mock_schedule.call_args.kwargs assert kwargs["function_path"] == "app.services.doctransform.job.execute_job" - assert kwargs["project_id"] == current_user.project_id + assert kwargs["project_id"] == current_user.project.id assert kwargs["job_id"] == str(job.id) assert kwargs["source_document_id"] == str(job.source_document_id) assert kwargs["transformer_name"] == "test-transformer" @@ -80,7 +80,7 @@ def test_start_job_success( def test_start_job_with_nonexistent_document( self, db: Session, - current_user: UserProjectOrg, + current_user: AuthContext, ) -> None: """ Previously: start_job validated document and raised 404. @@ -95,7 +95,7 @@ def test_start_job_with_nonexistent_document( mock_schedule.return_value = "fake-task-id" start_job( db=db, - current_user=current_user, + project_id=current_user.project.id, job_id=nonexistent_job_id, transformer_name="test-transformer", target_format="markdown", @@ -105,7 +105,7 @@ def test_start_job_with_nonexistent_document( def test_start_job_with_different_formats( self, db: Session, - current_user: UserProjectOrg, + current_user: AuthContext, test_document: tuple[Document, Project], monkeypatch, ) -> None: @@ -121,11 +121,11 @@ def test_start_job_with_different_formats( mock_schedule.return_value = "fake-task-id" for target_format in formats: - job = self._create_job(db, current_user.project_id, document.id) + job = self._create_job(db, current_user.project.id, document.id) returned_job_id = start_job( db=db, - current_user=current_user, + project_id=current_user.project.id, job_id=job.id, transformer_name="test", target_format=target_format, @@ -144,7 +144,7 @@ def test_start_job_with_different_formats( kwargs["function_path"] == "app.services.doctransform.job.execute_job" ) - assert kwargs["project_id"] == current_user.project_id + assert kwargs["project_id"] == current_user.project.id assert kwargs["job_id"] == str(job.id) assert kwargs["source_document_id"] == str(job.source_document_id) assert kwargs["transformer_name"] == "test" @@ -154,7 +154,7 @@ def test_start_job_with_different_formats( def test_start_job_with_different_transformers( self, db: Session, - current_user: UserProjectOrg, + current_user: AuthContext, test_document: tuple[Document, Project], transformer_name: str, monkeypatch, @@ -163,7 +163,7 @@ def test_start_job_with_different_transformers( monkeypatch.setitem(TRANSFORMERS, "test", MockTestTransformer) document, _ = test_document - job = self._create_job(db, current_user.project_id, document.id) + job = self._create_job(db, current_user.project.id, document.id) with patch( "app.services.doctransform.job.start_low_priority_job" @@ -172,7 +172,7 @@ def test_start_job_with_different_transformers( returned_job_id = start_job( db=db, - current_user=current_user, + project_id=current_user.project.id, job_id=job.id, 
transformer_name=transformer_name, target_format="markdown", @@ -187,7 +187,7 @@ def test_start_job_with_different_transformers( assert kwargs["transformer_name"] == transformer_name assert kwargs["target_format"] == "markdown" assert kwargs["function_path"] == "app.services.doctransform.job.execute_job" - assert kwargs["project_id"] == current_user.project_id + assert kwargs["project_id"] == current_user.project.id assert kwargs["job_id"] == str(job.id) assert kwargs["source_document_id"] == str(job.source_document_id) assert returned_job_id == job.id diff --git a/backend/app/tests/services/llm/providers/test_openai.py b/backend/app/tests/services/llm/providers/test_openai.py index c216ca540..745dd00b8 100644 --- a/backend/app/tests/services/llm/providers/test_openai.py +++ b/backend/app/tests/services/llm/providers/test_openai.py @@ -7,7 +7,7 @@ import openai from app.models.llm import ( - CompletionConfig, + NativeCompletionConfig, QueryParams, ) from app.models.llm.request import ConversationConfig @@ -31,8 +31,8 @@ def provider(self, mock_client): @pytest.fixture def completion_config(self): """Create a basic completion config.""" - return CompletionConfig( - provider="openai", + return NativeCompletionConfig( + provider="openai-native", params={"model": "gpt-4"}, ) @@ -59,7 +59,7 @@ def test_execute_success_without_conversation( assert result is not None assert result.response.output.text == mock_response.output_text assert result.response.model == mock_response.model - assert result.response.provider == "openai" + assert result.response.provider == "openai-native" assert result.response.conversation_id is None assert result.usage.input_tokens == mock_response.usage.input_tokens assert result.usage.output_tokens == mock_response.usage.output_tokens @@ -233,8 +233,8 @@ def test_execute_with_conversation_parameter_removed_when_no_config( ): """Test that conversation param is removed if it exists in config but no conversation config.""" # Create a config with conversation in params (should be removed) - completion_config = CompletionConfig( - provider="openai", + completion_config = NativeCompletionConfig( + provider="openai-native", params={"model": "gpt-4", "conversation": {"id": "old_conv"}}, ) diff --git a/backend/app/tests/services/llm/providers/test_registry.py b/backend/app/tests/services/llm/providers/test_registry.py index f9c595c0d..c05222747 100644 --- a/backend/app/tests/services/llm/providers/test_registry.py +++ b/backend/app/tests/services/llm/providers/test_registry.py @@ -21,8 +21,8 @@ class TestProviderRegistry: def test_registry_contains_openai(self): """Test that registry contains OpenAI provider.""" - assert "openai" in LLMProvider._registry - assert LLMProvider._registry["openai"] == OpenAIProvider + assert "openai-native" in LLMProvider._registry + assert LLMProvider._registry["openai-native"] == OpenAIProvider def test_registry_values_are_provider_classes(self): """Test that all registry values are BaseProvider subclasses.""" @@ -46,7 +46,7 @@ def test_get_llm_provider_with_openai(self, db: Session): provider = get_llm_provider( session=db, - provider_type="openai", + provider_type="openai-native", project_id=project.id, organization_id=project.organization_id, ) @@ -64,7 +64,7 @@ def test_get_llm_provider_with_openai(self, db: Session): with pytest.raises(ValueError) as exc_info: get_llm_provider( session=db, - provider_type="openai", + provider_type="openai-native", project_id=project.id, organization_id=project.organization_id, ) @@ -87,7 +87,7 @@ def 
test_get_llm_provider_with_invalid_provider(self, db: Session): error_message = str(exc_info.value) assert "invalid_provider" in error_message assert "is not supported" in error_message - assert "openai" in error_message + assert "openai-native" in error_message def test_get_llm_provider_with_missing_credentials(self, db: Session): """Test handling of errors when credentials are not found.""" @@ -101,7 +101,7 @@ def test_get_llm_provider_with_missing_credentials(self, db: Session): with pytest.raises(ValueError) as exc_info: get_llm_provider( session=db, - provider_type="openai", + provider_type="openai-native", project_id=project.id, organization_id=project.organization_id, ) diff --git a/backend/app/tests/services/llm/test_jobs.py b/backend/app/tests/services/llm/test_jobs.py index 2f08b40c0..c71179c30 100644 --- a/backend/app/tests/services/llm/test_jobs.py +++ b/backend/app/tests/services/llm/test_jobs.py @@ -13,12 +13,14 @@ from app.models import ConfigVersion, JobStatus, JobType from app.models.llm import ( LLMCallRequest, - CompletionConfig, + NativeCompletionConfig, QueryParams, LLMCallResponse, LLMResponse, LLMOutput, Usage, + KaapiLLMParams, + KaapiCompletionConfig, ) from app.models.llm.request import ConfigBlob, LLMCallConfig from app.services.llm.jobs import ( @@ -40,8 +42,8 @@ def llm_call_request(self): query=QueryParams(input="Test query"), config=LLMCallConfig( blob=ConfigBlob( - completion=CompletionConfig( - provider="openai", + completion=NativeCompletionConfig( + provider="openai-native", params={"model": "gpt-4"}, ) ) @@ -225,7 +227,10 @@ def request_data(self): "query": {"input": "Test query"}, "config": { "blob": { - "completion": {"provider": "openai", "params": {"model": "gpt-4"}} + "completion": { + "provider": "openai-native", + "params": {"model": "gpt-4"}, + } } }, "include_provider_raw_response": False, @@ -396,8 +401,8 @@ def test_stored_config_success(self, db, job_for_execution, mock_llm_response): # Create a real config in the database config_blob = ConfigBlob( - completion=CompletionConfig( - provider="openai", + completion=NativeCompletionConfig( + provider="openai-native", params={"model": "gpt-4", "temperature": 0.7}, ) ) @@ -445,8 +450,8 @@ def test_stored_config_with_callback( project = get_project(db) config_blob = ConfigBlob( - completion=CompletionConfig( - provider="openai", + completion=NativeCompletionConfig( + provider="openai-native", params={"model": "gpt-3.5-turbo", "temperature": 0.5}, ) ) @@ -493,8 +498,8 @@ def test_stored_config_version_not_found(self, db, job_for_execution): project = get_project(db) config_blob = ConfigBlob( - completion=CompletionConfig( - provider="openai", + completion=NativeCompletionConfig( + provider="openai-native", params={"model": "gpt-4"}, ) ) @@ -523,6 +528,200 @@ def test_stored_config_version_not_found(self, db, job_for_execution): db.refresh(job_for_execution) assert job_for_execution.status == JobStatus.FAILED + def test_kaapi_config_success(self, db, job_for_execution, mock_llm_response): + """Test successful execution with Kaapi abstracted config.""" + project = get_project(db) + + config_blob = ConfigBlob( + completion=KaapiCompletionConfig( + provider="openai", + params=KaapiLLMParams( + model="gpt-4", + temperature=0.7, + instructions="You are a helpful assistant", + ), + ) + ) + config = create_test_config(db, project_id=project.id, config_blob=config_blob) + db.commit() + + kaapi_request_data = { + "query": {"input": "Test query with Kaapi config"}, + "config": { + "id": str(config.id), + 
"version": 1, + }, + "include_provider_raw_response": False, + "callback_url": None, + } + + with ( + patch("app.services.llm.jobs.Session") as mock_session_class, + patch("app.services.llm.jobs.get_llm_provider") as mock_get_provider, + ): + mock_session_class.return_value.__enter__.return_value = db + mock_session_class.return_value.__exit__.return_value = None + + mock_provider = MagicMock() + mock_provider.execute.return_value = (mock_llm_response, None) + mock_get_provider.return_value = mock_provider + + result = self._execute_job(job_for_execution, db, kaapi_request_data) + + mock_get_provider.assert_called_once() + mock_provider.execute.assert_called_once() + assert result["success"] + db.refresh(job_for_execution) + assert job_for_execution.status == JobStatus.SUCCESS + + def test_kaapi_config_with_callback(self, db, job_for_execution, mock_llm_response): + """Test successful execution with Kaapi config and callback.""" + project = get_project(db) + + config_blob = ConfigBlob( + completion=KaapiCompletionConfig( + provider="openai", + params=KaapiLLMParams( + model="gpt-3.5-turbo", + temperature=0.5, + ), + ) + ) + config = create_test_config(db, project_id=project.id, config_blob=config_blob) + db.commit() + + kaapi_request_data = { + "query": {"input": "Test query with Kaapi config and callback"}, + "config": { + "id": str(config.id), + "version": 1, + }, + "include_provider_raw_response": False, + "callback_url": "https://example.com/callback", + } + + with ( + patch("app.services.llm.jobs.Session") as mock_session_class, + patch("app.services.llm.jobs.get_llm_provider") as mock_get_provider, + patch("app.services.llm.jobs.send_callback") as mock_send_callback, + ): + mock_session_class.return_value.__enter__.return_value = db + mock_session_class.return_value.__exit__.return_value = None + + mock_provider = MagicMock() + mock_provider.execute.return_value = (mock_llm_response, None) + mock_get_provider.return_value = mock_provider + + result = self._execute_job(job_for_execution, db, kaapi_request_data) + + mock_send_callback.assert_called_once() + callback_data = mock_send_callback.call_args[1]["data"] + assert callback_data["success"] + assert result["success"] + db.refresh(job_for_execution) + assert job_for_execution.status == JobStatus.SUCCESS + + def test_kaapi_config_warnings_passed_through_metadata( + self, db, job_for_execution, mock_llm_response + ): + """Test that warnings from Kaapi config transformation are passed through in metadata.""" + project = get_project(db) + + # Use a config that will generate warnings (temperature on reasoning model) + config_blob = ConfigBlob( + completion=KaapiCompletionConfig( + provider="openai", + params=KaapiLLMParams( + model="o1", # Reasoning model + temperature=0.7, # This will be suppressed with warning + ), + ) + ) + config = create_test_config(db, project_id=project.id, config_blob=config_blob) + db.commit() + + kaapi_request_data = { + "query": {"input": "Test query"}, + "config": { + "id": str(config.id), + "version": 1, + }, + "include_provider_raw_response": False, + "callback_url": None, + } + + with ( + patch("app.services.llm.jobs.Session") as mock_session_class, + patch("app.services.llm.jobs.get_llm_provider") as mock_get_provider, + ): + mock_session_class.return_value.__enter__.return_value = db + mock_session_class.return_value.__exit__.return_value = None + + mock_provider = MagicMock() + mock_provider.execute.return_value = (mock_llm_response, None) + mock_get_provider.return_value = mock_provider + + result = 
self._execute_job(job_for_execution, db, kaapi_request_data) + + # Verify the result includes warnings in metadata + assert result["success"] + assert "metadata" in result + assert "warnings" in result["metadata"] + assert len(result["metadata"]["warnings"]) == 1 + assert "temperature" in result["metadata"]["warnings"][0].lower() + assert "suppressed" in result["metadata"]["warnings"][0] + + def test_kaapi_config_warnings_merged_with_existing_metadata( + self, db, job_for_execution, mock_llm_response + ): + """Test that warnings are merged with existing request metadata.""" + project = get_project(db) + + config_blob = ConfigBlob( + completion=KaapiCompletionConfig( + provider="openai", + params=KaapiLLMParams( + model="gpt-4", # Non-reasoning model + reasoning="high", # This will be suppressed with warning + ), + ) + ) + config = create_test_config(db, project_id=project.id, config_blob=config_blob) + db.commit() + + kaapi_request_data = { + "query": {"input": "Test query"}, + "config": { + "id": str(config.id), + "version": 1, + }, + "include_provider_raw_response": False, + "callback_url": None, + "request_metadata": {"tracking_id": "test-123"}, + } + + with ( + patch("app.services.llm.jobs.Session") as mock_session_class, + patch("app.services.llm.jobs.get_llm_provider") as mock_get_provider, + ): + mock_session_class.return_value.__enter__.return_value = db + mock_session_class.return_value.__exit__.return_value = None + + mock_provider = MagicMock() + mock_provider.execute.return_value = (mock_llm_response, None) + mock_get_provider.return_value = mock_provider + + result = self._execute_job(job_for_execution, db, kaapi_request_data) + + # Verify warnings are added to existing metadata + assert result["success"] + assert "metadata" in result + assert result["metadata"]["tracking_id"] == "test-123" + assert "warnings" in result["metadata"] + assert len(result["metadata"]["warnings"]) == 1 + assert "reasoning" in result["metadata"]["warnings"][0].lower() + assert "does not support reasoning" in result["metadata"]["warnings"][0] + class TestResolveConfigBlob: """Test suite for resolve_config_blob function.""" @@ -532,8 +731,8 @@ def test_resolve_config_blob_success(self, db: Session): project = get_project(db) config_blob = ConfigBlob( - completion=CompletionConfig( - provider="openai", + completion=NativeCompletionConfig( + provider="openai-native", params={"model": "gpt-4", "temperature": 0.8}, ) ) @@ -549,7 +748,7 @@ def test_resolve_config_blob_success(self, db: Session): assert error is None assert resolved_blob is not None - assert resolved_blob.completion.provider == "openai" + assert resolved_blob.completion.provider == "openai-native" assert resolved_blob.completion.params["model"] == "gpt-4" assert resolved_blob.completion.params["temperature"] == 0.8 @@ -558,8 +757,8 @@ def test_resolve_config_blob_version_not_found(self, db: Session): project = get_project(db) config_blob = ConfigBlob( - completion=CompletionConfig( - provider="openai", + completion=NativeCompletionConfig( + provider="openai-native", params={"model": "gpt-4"}, ) ) @@ -583,8 +782,8 @@ def test_resolve_config_blob_invalid_blob_data(self, db: Session): project = get_project(db) config_blob = ConfigBlob( - completion=CompletionConfig( - provider="openai", + completion=NativeCompletionConfig( + provider="openai-native", params={"model": "gpt-4"}, ) ) @@ -620,8 +819,8 @@ def test_resolve_config_blob_with_multiple_versions(self, db: Session): # Create a config with version 1 config_blob_v1 = ConfigBlob( - 
completion=CompletionConfig( - provider="openai", + completion=NativeCompletionConfig( + provider="openai-native", params={"model": "gpt-3.5-turbo", "temperature": 0.5}, ) ) @@ -635,8 +834,8 @@ def test_resolve_config_blob_with_multiple_versions(self, db: Session): session=db, project_id=project.id, config_id=config.id ) config_blob_v2 = ConfigBlob( - completion=CompletionConfig( - provider="openai", + completion=NativeCompletionConfig( + provider="openai-native", params={"model": "gpt-4", "temperature": 0.9}, ) ) @@ -668,3 +867,92 @@ def test_resolve_config_blob_with_multiple_versions(self, db: Session): assert resolved_blob_v2 is not None assert resolved_blob_v2.completion.params["model"] == "gpt-4" assert resolved_blob_v2.completion.params["temperature"] == 0.9 + + def test_resolve_kaapi_config_blob_success(self, db: Session): + """Test successful resolution of stored Kaapi config blob.""" + project = get_project(db) + + config_blob = ConfigBlob( + completion=KaapiCompletionConfig( + provider="openai", + params=KaapiLLMParams( + model="gpt-4", + temperature=0.8, + instructions="You are a helpful assistant", + ), + ) + ) + config = create_test_config(db, project_id=project.id, config_blob=config_blob) + db.commit() + + config_crud = ConfigVersionCrud( + session=db, project_id=project.id, config_id=config.id + ) + llm_call_config = LLMCallConfig(id=str(config.id), version=1) + + resolved_blob, error = resolve_config_blob(config_crud, llm_call_config) + + assert error is None + assert resolved_blob is not None + assert isinstance(resolved_blob.completion, KaapiCompletionConfig) + assert resolved_blob.completion.provider == "openai" + assert resolved_blob.completion.params.model == "gpt-4" + assert resolved_blob.completion.params.temperature == 0.8 + assert ( + resolved_blob.completion.params.instructions + == "You are a helpful assistant" + ) + + def test_resolve_both_native_and_kaapi_configs(self, db: Session): + """Test that both native and Kaapi configs can be resolved correctly.""" + project = get_project(db) + + # Create native config + native_blob = ConfigBlob( + completion=NativeCompletionConfig( + provider="openai-native", + params={"model": "gpt-3.5-turbo", "temperature": 0.5}, + ) + ) + native_config = create_test_config( + db, project_id=project.id, config_blob=native_blob, use_kaapi_schema=False + ) + + # Create Kaapi config + kaapi_blob = ConfigBlob( + completion=KaapiCompletionConfig( + provider="openai", + params=KaapiLLMParams( + model="gpt-4", + temperature=0.7, + ), + ) + ) + kaapi_config = create_test_config( + db, project_id=project.id, config_blob=kaapi_blob, use_kaapi_schema=True + ) + db.commit() + + # Test native config resolution + native_crud = ConfigVersionCrud( + session=db, project_id=project.id, config_id=native_config.id + ) + native_call_config = LLMCallConfig(id=str(native_config.id), version=1) + resolved_native, error_native = resolve_config_blob( + native_crud, native_call_config + ) + + assert error_native is None + assert isinstance(resolved_native.completion, NativeCompletionConfig) + assert resolved_native.completion.provider == "openai-native" + + # Test Kaapi config resolution + kaapi_crud = ConfigVersionCrud( + session=db, project_id=project.id, config_id=kaapi_config.id + ) + kaapi_call_config = LLMCallConfig(id=str(kaapi_config.id), version=1) + resolved_kaapi, error_kaapi = resolve_config_blob(kaapi_crud, kaapi_call_config) + + assert error_kaapi is None + assert isinstance(resolved_kaapi.completion, KaapiCompletionConfig) + assert 
resolved_kaapi.completion.provider == "openai" diff --git a/backend/app/tests/services/llm/test_mappers.py b/backend/app/tests/services/llm/test_mappers.py new file mode 100644 index 000000000..c020753d2 --- /dev/null +++ b/backend/app/tests/services/llm/test_mappers.py @@ -0,0 +1,316 @@ +""" +Unit tests for LLM parameter mapping functions. + +Tests the transformation of Kaapi-abstracted parameters to provider-native formats. +""" +import pytest + +from app.models.llm import KaapiLLMParams, KaapiCompletionConfig, NativeCompletionConfig +from app.services.llm.mappers import ( + map_kaapi_to_openai_params, + transform_kaapi_config_to_native, +) + + +class TestMapKaapiToOpenAIParams: + """Test cases for map_kaapi_to_openai_params function.""" + + def test_basic_model_mapping(self): + """Test basic model parameter mapping.""" + kaapi_params = KaapiLLMParams(model="gpt-4o") + + result, warnings = map_kaapi_to_openai_params(kaapi_params) + + assert result == {"model": "gpt-4o"} + assert warnings == [] + + def test_instructions_mapping(self): + """Test instructions parameter mapping.""" + kaapi_params = KaapiLLMParams( + model="gpt-4", + instructions="You are a helpful assistant.", + ) + + result, warnings = map_kaapi_to_openai_params(kaapi_params) + + assert result["model"] == "gpt-4" + assert result["instructions"] == "You are a helpful assistant." + assert warnings == [] + + def test_temperature_mapping(self): + """Test temperature parameter mapping for non-reasoning models.""" + kaapi_params = KaapiLLMParams( + model="gpt-4", + temperature=0.7, + ) + + result, warnings = map_kaapi_to_openai_params(kaapi_params) + + assert result["model"] == "gpt-4" + assert result["temperature"] == 0.7 + assert warnings == [] + + def test_temperature_zero_mapping(self): + """Test that temperature=0 is correctly mapped (edge case).""" + kaapi_params = KaapiLLMParams( + model="gpt-4", + temperature=0.0, + ) + + result, warnings = map_kaapi_to_openai_params(kaapi_params) + + assert result["temperature"] == 0.0 + assert warnings == [] + + def test_reasoning_mapping_for_reasoning_models(self): + """Test reasoning parameter mapping to OpenAI format for reasoning-capable models.""" + kaapi_params = KaapiLLMParams( + model="o1", + reasoning="high", + ) + + result, warnings = map_kaapi_to_openai_params(kaapi_params) + + assert result["model"] == "o1" + assert result["reasoning"] == {"effort": "high"} + assert warnings == [] + + def test_knowledge_base_ids_mapping(self): + """Test knowledge_base_ids mapping to OpenAI tools format.""" + kaapi_params = KaapiLLMParams( + model="gpt-4", + knowledge_base_ids=["vs_abc123", "vs_def456"], + ) + + result, warnings = map_kaapi_to_openai_params(kaapi_params) + + assert result["model"] == "gpt-4" + assert "tools" in result + assert len(result["tools"]) == 1 + assert result["tools"][0]["type"] == "file_search" + assert result["tools"][0]["vector_store_ids"] == ["vs_abc123", "vs_def456"] + assert result["tools"][0]["max_num_results"] == 20 # default + assert warnings == [] + + def test_knowledge_base_with_max_num_results(self): + """Test knowledge_base_ids with custom max_num_results.""" + kaapi_params = KaapiLLMParams( + model="gpt-4", + knowledge_base_ids=["vs_abc123"], + max_num_results=50, + ) + + result, warnings = map_kaapi_to_openai_params(kaapi_params) + + assert result["tools"][0]["max_num_results"] == 50 + assert warnings == [] + + def test_complete_parameter_mapping(self): + """Test mapping all compatible parameters together.""" + kaapi_params = KaapiLLMParams( + 
model="gpt-4o", + instructions="You are an expert assistant.", + temperature=0.8, + knowledge_base_ids=["vs_123"], + max_num_results=30, + ) + + result, warnings = map_kaapi_to_openai_params(kaapi_params) + + assert result["model"] == "gpt-4o" + assert result["instructions"] == "You are an expert assistant." + assert result["temperature"] == 0.8 + assert result["tools"][0]["type"] == "file_search" + assert result["tools"][0]["vector_store_ids"] == ["vs_123"] + assert result["tools"][0]["max_num_results"] == 30 + assert warnings == [] + + def test_reasoning_suppressed_for_non_reasoning_models(self): + """Test that reasoning is suppressed with warning for non-reasoning models.""" + kaapi_params = KaapiLLMParams( + model="gpt-4", + reasoning="high", + ) + + result, warnings = map_kaapi_to_openai_params(kaapi_params) + + assert result["model"] == "gpt-4" + assert "reasoning" not in result + assert len(warnings) == 1 + assert "reasoning" in warnings[0].lower() + assert "does not support reasoning" in warnings[0] + + def test_temperature_suppressed_for_reasoning_models(self): + """Test that temperature is suppressed with warning for reasoning models when reasoning is set.""" + kaapi_params = KaapiLLMParams( + model="o1", + temperature=0.7, + reasoning="high", + ) + + result, warnings = map_kaapi_to_openai_params(kaapi_params) + + assert result["model"] == "o1" + assert result["reasoning"] == {"effort": "high"} + assert "temperature" not in result + assert len(warnings) == 1 + assert "temperature" in warnings[0].lower() + assert "suppressed" in warnings[0] + + def test_temperature_without_reasoning_for_reasoning_models(self): + """Test that temperature is suppressed for reasoning models even without explicit reasoning parameter.""" + kaapi_params = KaapiLLMParams( + model="o1", + temperature=0.7, + ) + + result, warnings = map_kaapi_to_openai_params(kaapi_params) + + assert result["model"] == "o1" + assert "temperature" not in result + assert "reasoning" not in result + assert len(warnings) == 1 + assert "temperature" in warnings[0].lower() + assert "suppressed" in warnings[0] + + def test_minimal_params(self): + """Test mapping with minimal parameters (only model).""" + kaapi_params = KaapiLLMParams(model="gpt-4") + + result, warnings = map_kaapi_to_openai_params(kaapi_params) + + assert result == {"model": "gpt-4"} + assert warnings == [] + + def test_only_knowledge_base_ids(self): + """Test mapping with only knowledge_base_ids and model.""" + kaapi_params = KaapiLLMParams( + model="gpt-4", + knowledge_base_ids=["vs_xyz"], + ) + + result, warnings = map_kaapi_to_openai_params(kaapi_params) + + assert result["model"] == "gpt-4" + assert "tools" in result + assert result["tools"][0]["vector_store_ids"] == ["vs_xyz"] + assert warnings == [] + + +class TestTransformKaapiConfigToNative: + """Test cases for transform_kaapi_config_to_native function.""" + + def test_transform_openai_config(self): + """Test transformation of Kaapi OpenAI config to native format.""" + kaapi_config = KaapiCompletionConfig( + provider="openai", + params=KaapiLLMParams( + model="gpt-4", + temperature=0.7, + ), + ) + + result, warnings = transform_kaapi_config_to_native(kaapi_config) + + assert isinstance(result, NativeCompletionConfig) + assert result.provider == "openai-native" + assert result.params["model"] == "gpt-4" + assert result.params["temperature"] == 0.7 + assert warnings == [] + + def test_transform_with_all_params(self): + """Test transformation with all Kaapi parameters.""" + kaapi_config = 
KaapiCompletionConfig( + provider="openai", + params=KaapiLLMParams( + model="gpt-4o", + instructions="System prompt here", + temperature=0.5, + knowledge_base_ids=["vs_abc"], + max_num_results=25, + ), + ) + + result, warnings = transform_kaapi_config_to_native(kaapi_config) + + assert result.provider == "openai-native" + assert result.params["model"] == "gpt-4o" + assert result.params["instructions"] == "System prompt here" + assert result.params["temperature"] == 0.5 + assert result.params["tools"][0]["type"] == "file_search" + assert result.params["tools"][0]["max_num_results"] == 25 + assert warnings == [] + + def test_transform_with_reasoning(self): + """Test transformation with reasoning parameter for reasoning-capable models.""" + kaapi_config = KaapiCompletionConfig( + provider="openai", + params=KaapiLLMParams( + model="o1", + reasoning="medium", + ), + ) + + result, warnings = transform_kaapi_config_to_native(kaapi_config) + + assert result.provider == "openai-native" + assert result.params["model"] == "o1" + assert result.params["reasoning"] == {"effort": "medium"} + assert warnings == [] + + def test_transform_with_both_temperature_and_reasoning(self): + """Test that transformation handles temperature + reasoning intelligently for reasoning models.""" + kaapi_config = KaapiCompletionConfig( + provider="openai", + params=KaapiLLMParams( + model="o1", + temperature=0.7, + reasoning="high", + ), + ) + + result, warnings = transform_kaapi_config_to_native(kaapi_config) + + assert result.provider == "openai-native" + assert result.params["model"] == "o1" + assert result.params["reasoning"] == {"effort": "high"} + assert "temperature" not in result.params + assert len(warnings) == 1 + assert "temperature" in warnings[0].lower() + assert "suppressed" in warnings[0] + + def test_unsupported_provider_raises_error(self): + """Test that unsupported providers raise ValueError.""" + # Note: This would require modifying KaapiCompletionConfig to accept other providers + # For now, this tests the error handling in the mapper + # We'll create a mock config that bypasses validation + from unittest.mock import MagicMock + + mock_config = MagicMock() + mock_config.provider = "unsupported-provider" + mock_config.params = KaapiLLMParams(model="some-model") + + with pytest.raises(ValueError) as exc_info: + transform_kaapi_config_to_native(mock_config) + + assert "Unsupported provider" in str(exc_info.value) + + def test_transform_preserves_param_structure(self): + """Test that transformation correctly structures nested parameters.""" + kaapi_config = KaapiCompletionConfig( + provider="openai", + params=KaapiLLMParams( + model="gpt-4", + knowledge_base_ids=["vs_1", "vs_2", "vs_3"], + max_num_results=15, + ), + ) + + result, warnings = transform_kaapi_config_to_native(kaapi_config) + + # Verify the nested structure is correct + assert isinstance(result.params["tools"], list) + assert isinstance(result.params["tools"][0], dict) + assert isinstance(result.params["tools"][0]["vector_store_ids"], list) + assert len(result.params["tools"][0]["vector_store_ids"]) == 3 + assert warnings == [] diff --git a/backend/app/tests/utils/test_data.py b/backend/app/tests/utils/test_data.py index b33656b22..bc871dbcc 100644 --- a/backend/app/tests/utils/test_data.py +++ b/backend/app/tests/utils/test_data.py @@ -9,7 +9,6 @@ OrganizationCreate, ProjectCreate, ConfigBlob, - CompletionConfig, CredsCreate, FineTuningJobCreate, Fine_Tuning, @@ -22,6 +21,7 @@ ConfigVersionCreate, ConfigBase, ) +from app.models.llm import 
KaapiLLMParams, KaapiCompletionConfig, NativeCompletionConfig from app.crud import ( create_organization, create_project, @@ -242,11 +242,20 @@ def create_test_config( name: str | None = None, description: str | None = None, config_blob: ConfigBlob | None = None, + use_kaapi_schema: bool = False, ) -> Config: """ Creates and returns a test configuration with an initial version. Persists the config and version to the database. + + Args: + db: Database session + project_id: Project ID (creates new project if None) + name: Config name (generates random if None) + description: Config description + config_blob: Config blob (creates default if None) + use_kaapi_schema: If True, creates Kaapi-format config; if False, creates native format """ if project_id is None: project = create_test_project(db) @@ -256,16 +265,29 @@ def create_test_config( name = f"test-config-{random_lower_string()}" if config_blob is None: - config_blob = ConfigBlob( - completion=CompletionConfig( - provider="openai", - params={ - "model": "gpt-4", - "temperature": 0.7, - "max_tokens": 1000, - }, + if use_kaapi_schema: + # Create Kaapi-format config + config_blob = ConfigBlob( + completion=KaapiCompletionConfig( + provider="openai", + params=KaapiLLMParams( + model="gpt-4", + temperature=0.7, + ), + ) + ) + else: + # Create native-format config + config_blob = ConfigBlob( + completion=NativeCompletionConfig( + provider="openai-native", + params={ + "model": "gpt-4", + "temperature": 0.7, + "max_tokens": 1000, + }, + ) ) - ) config_create = ConfigCreate( name=name, @@ -294,8 +316,8 @@ def create_test_version( """ if config_blob is None: config_blob = ConfigBlob( - completion=CompletionConfig( - provider="openai", + completion=NativeCompletionConfig( + provider="openai-native", params={ "model": "gpt-4", "temperature": 0.8, diff --git a/deployment.md b/deployment.md index b75a42137..cdf22fd90 100644 --- a/deployment.md +++ b/deployment.md @@ -1,309 +1,236 @@ -# AI Platform - Deployment +# Kaapi - Deployment -You can deploy the project using Docker Compose to a remote server. +Kaapi uses a modern cloud-native deployment architecture built on AWS services with automated CI/CD pipelines. -This project expects you to have a Traefik proxy handling communication to the outside world and HTTPS certificates. +## Deployment Architecture -You can use CI/CD (continuous integration and continuous deployment) systems to deploy automatically, there are already configurations to do it with GitHub Actions. +### Overview -But you have to configure a couple things first. πŸ€“ +The deployment follows a containerized approach where: +- Application code is packaged into Docker images +- Images are stored in AWS ECR (Elastic Container Registry) +- ECS (Elastic Container Service) runs and manages the containers +- GitHub Actions automates the build and deployment process -## Preparation +### CI/CD Pipeline -* Have a remote server ready and available. -* Configure the DNS records of your domain to point to the IP of the server you just created. -* Configure a wildcard subdomain for your domain, so that you can have multiple subdomains for different services, e.g. `*.fastapi-project.example.com`. This will be useful for accessing different components, like `dashboard.fastapi-project.example.com`, `api.fastapi-project.example.com`, `traefik.fastapi-project.example.com`, `adminer.fastapi-project.example.com`, etc. And also for `staging`, like `dashboard.staging.fastapi-project.example.com`, `adminer.staging..fastapi-project.example.com`, etc. 
-* Install and configure [Docker](https://docs.docker.com/engine/install/) on the remote server (Docker Engine, not Docker Desktop). +The deployment pipeline is triggered automatically: +1. **Code Push**: Developer pushes code to GitHub +2. **Build**: GitHub Actions builds Docker image +3. **Push**: Image is pushed to ECR +4. **Deploy**: ECS pulls new image and updates running tasks -## Public Traefik +### Environments -We need a Traefik proxy to handle incoming connections and HTTPS certificates. +Two deployment environments are configured: +- **Staging**: Deployed on every push to `main` branch for testing +- **Production**: Deployed only on version tags for stable releases -You need to do these next steps only once. +## Prerequisites -### Traefik Docker Compose +Before deploying, ensure the following AWS infrastructure exists: -* Create a remote directory to store your Traefik Docker Compose file: +### AWS Infrastructure -```bash -mkdir -p /root/code/traefik-public/ -``` +1. **ECS Clusters**: Separate clusters for staging and production environments +2. **ECR Repositories**: Container image repositories for each environment +3. **ECS Task Definitions**: Define container configurations, resource limits, and environment variables +4. **ECS Services**: Manage the desired number of running tasks +5. **IAM Role for GitHub**: Allows GitHub Actions to authenticate via OIDC (no long-lived credentials) +6. **RDS PostgreSQL**: Managed database service (recommended for production) +7. **ElastiCache Redis**: Managed Redis for caching (optional) +8. **Amazon MQ RabbitMQ**: Managed message broker for Celery (optional) -Copy the Traefik Docker Compose file to your server. You could do it by running the command `rsync` in your local terminal: +### GitHub Setup -```bash -rsync -a docker-compose.traefik.yml root@your-server.example.com:/root/code/traefik-public/ -``` +Configure GitHub to allow automated deployments: -### Traefik Public Network +1. **Environment**: Create `AWS_ENV_VARS` environment for deployment protection +2. **Variable**: Set `AWS_RESOURCE_PREFIX` to identify your AWS resources -This Traefik will expect a Docker "public network" named `traefik-public` to communicate with your stack(s). +## AWS Resource Naming Convention -This way, there will be a single public Traefik proxy that handles the communication (HTTP and HTTPS) with the outside world, and then behind that, you could have one or more stacks with different domains, even if they are on the same single server. +All AWS resources follow a consistent naming pattern using `AWS_RESOURCE_PREFIX` as the base identifier. -To create a Docker "public network" named `traefik-public` run the following command in your remote server: +This naming convention ensures clear separation between environments and easy identification of resources. -```bash -docker network create traefik-public -``` +## Deployment Workflows -### Traefik Environment Variables +### Staging Deployment -The Traefik Docker Compose file expects some environment variables to be set in your terminal before starting it. You can do it by running the following commands in your remote server. +**Purpose**: Automatically deploy changes to a testing environment for validation before production. -* Create the username for HTTP Basic Auth, e.g.: +**Trigger**: Push to `main` branch ```bash -export USERNAME=admin +git push origin main ``` -* Create an environment variable with the password for HTTP Basic Auth, e.g.: +**Workflow Steps** (`.github/workflows/cd-staging.yml`): +1. 
**Checkout**: Clone the repository code +2. **AWS Authentication**: Use OIDC to authenticate (no stored credentials) +3. **ECR Login**: Authenticate to container registry +4. **Build**: Create Docker image from `./backend` directory +5. **Push**: Upload image to staging ECR repository with `latest` tag +6. **Deploy**: Force ECS to pull and deploy the new image -```bash -export PASSWORD=changethis -``` +The deployment typically completes in 5-10 minutes depending on image size and ECS configuration. -* Use openssl to generate the "hashed" version of the password for HTTP Basic Auth and store it in an environment variable: +### Production Deployment -```bash -export HASHED_PASSWORD=$(openssl passwd -apr1 $PASSWORD) -``` - -To verify that the hashed password is correct, you can print it: - -```bash -echo $HASHED_PASSWORD -``` +**Purpose**: Deploy stable, tested versions to the production environment. -* Create an environment variable with the domain name for your server, e.g.: +**Trigger**: Create and push a version tag ```bash -export DOMAIN=fastapi-project.example.com -``` +# Create a version tag +git tag v1.0.0 -* Create an environment variable with the email for Let's Encrypt, e.g.: - -```bash -export EMAIL=admin@example.com +# Push the tag to trigger deployment +git push origin v1.0.0 ``` -**Note**: you need to set a different email, an email `@example.com` won't work. +**Workflow Steps** (`.github/workflows/cd-production.yml`): +1. **Checkout**: Clone the repository at the tagged version +2. **AWS Authentication**: Use OIDC to authenticate +3. **ECR Login**: Authenticate to container registry +4. **Build**: Create Docker image from `./backend` directory +5. **Push**: Upload image to production ECR repository with `latest` tag +6. **Deploy**: Force ECS to pull and deploy the new image -### Start the Traefik Docker Compose +**Best Practice**: Use semantic versioning (e.g., `v1.0.0`, `v1.2.3`) to clearly identify releases. -Go to the directory where you copied the Traefik Docker Compose file in your remote server: +## GitHub Configuration -```bash -cd /root/code/traefik-public/ -``` +### Step 1: Create Environment -Now with the environment variables set and the `docker-compose.traefik.yml` in place, you can start the Traefik Docker Compose running the following command: +Environments in GitHub provide deployment protection and organization. -```bash -docker compose -f docker-compose.traefik.yml up -d -``` +1. Go to repository **Settings β†’ Environments** +2. Click **New environment** +3. Name it: `AWS_ENV_VARS` +4. Optionally, add protection rules (e.g., required reviewers) -## Deploy the FastAPI Project +### Step 2: Set Repository Variable -Now that you have Traefik in place you can deploy your FastAPI project with Docker Compose. +Variables store non-sensitive configuration that workflows need. -**Note**: You might want to jump ahead to the section about Continuous Deployment with GitHub Actions. +1. Go to **Settings β†’ Secrets and variables β†’ Actions β†’ Variables tab** +2. Click **New repository variable** +3. Add: + - **Name**: `AWS_RESOURCE_PREFIX` + - **Value**: Your AWS resource prefix (e.g., `kaapi`) -## Environment Variables +### Step 3: AWS Authentication Setup -You need to set some environment variables first. 
+The workflows use **AWS OIDC authentication**, which is more secure than storing AWS access keys: +- No long-lived credentials stored in GitHub +- AWS IAM role assumes identity based on GitHub's OIDC token +- Permissions are scoped to specific actions -Set the `ENVIRONMENT`, by default `local` (for development), but when deploying to a server you would put something like `staging` or `production`: +The IAM role ARN is configured in workflow files: -```bash -export ENVIRONMENT=production -``` - -Set the `DOMAIN`, by default `localhost` (for development), but when deploying you would use your own domain, for example: - -```bash -export DOMAIN=fastapi-project.example.com +```yaml +role-to-assume: arn:aws:iam::{YOUR_AWS_ACCOUNT_ID}:role/github-action-role +aws-region: {YOUR_AWS_REGION} ``` -You can set several variables, like: +**Note**: Replace `{YOUR_AWS_ACCOUNT_ID}` with your AWS account ID and `{YOUR_AWS_REGION}` with your chosen region (e.g., `ap-south-1`, `us-east-1`). -* `PROJECT_NAME`: The name of the project, used in the API for the docs and emails. -* `STACK_NAME`: The name of the stack used for Docker Compose labels and project name, this should be different for `staging`, `production`, etc. You could use the same domain replacing dots with dashes, e.g. `fastapi-project-example-com` and `staging-fastapi-project-example-com`. -* `BACKEND_CORS_ORIGINS`: A list of allowed CORS origins separated by commas. -* `SECRET_KEY`: The secret key for the FastAPI project, used to sign tokens. -* `FIRST_SUPERUSER`: The email of the first superuser, this superuser will be the one that can create new users. -* `FIRST_SUPERUSER_PASSWORD`: The password of the first superuser. -* `SMTP_HOST`: The SMTP server host to send emails, this would come from your email provider (E.g. Mailgun, Sparkpost, Sendgrid, etc). -* `SMTP_USER`: The SMTP server user to send emails. -* `SMTP_PASSWORD`: The SMTP server password to send emails. -* `EMAILS_FROM_EMAIL`: The email account to send emails from. -* `POSTGRES_SERVER`: The hostname of the PostgreSQL server. You can leave the default of `db`, provided by the same Docker Compose. You normally wouldn't need to change this unless you are using a third-party provider. -* `POSTGRES_PORT`: The port of the PostgreSQL server. You can leave the default. You normally wouldn't need to change this unless you are using a third-party provider. -* `POSTGRES_PASSWORD`: The Postgres password. -* `POSTGRES_USER`: The Postgres user, you can leave the default. -* `POSTGRES_DB`: The database name to use for this application. You can leave the default of `app`. -* `SENTRY_DSN`: The DSN for Sentry, if you are using it. +## Environment Variables in AWS ECS -## GitHub Actions Environment Variables +Application configuration is managed through environment variables set in **ECS Task Definitions**. These are injected into containers at runtime. -There are some environment variables only used by GitHub Actions that you can configure: +### Configuring Environment Variables -* `LATEST_CHANGES`: Used by the GitHub Action [latest-changes](https://github.com/tiangolo/latest-changes) to automatically add release notes based on the PRs merged. It's a personal access token, read the docs for details. -* `SMOKESHOW_AUTH_KEY`: Used to handle and publish the code coverage using [Smokeshow](https://github.com/samuelcolvin/smokeshow), follow their instructions to create a (free) Smokeshow key. 
+Environment variables in ECS Task Definitions include database credentials, AWS credentials, API keys, and service endpoints. Refer to `.env.example` in the repository for a complete list of required and optional variables. -### Generate secret keys +Key categories: +- **Authentication & Security**: JWT keys, admin credentials +- **Database**: PostgreSQL connection details +- **AWS Services**: S3 access credentials +- **Background Tasks**: RabbitMQ and Redis endpoints +- **Optional**: OpenAI API key, Sentry DSN -Some environment variables in the `.env` file have a default value of `changethis`. +### Generate Secure Keys -You have to change them with a secret key, to generate secret keys you can run the following command: +Use Python to generate cryptographically secure keys: ```bash python -c "import secrets; print(secrets.token_urlsafe(32))" ``` -Copy the content and use that as password / secret key. And run that again to generate another secure key. - -### Deploy with Docker Compose - -With the environment variables in place, you can deploy with Docker Compose: - -```bash -docker compose -f docker-compose.yml up -d -``` - -For production you wouldn't want to have the overrides in `docker-compose.override.yml`, that's why we explicitly specify `docker-compose.yml` as the file to use. - -## Continuous Deployment (CD) - -You can use GitHub Actions to deploy your project automatically. 😎 +Run this multiple times to generate different keys for `SECRET_KEY`, passwords, etc. -You can have multiple environment deployments. +## Database Migrations -There are already two environments configured, `staging` and `production`. πŸš€ +Database schema changes must be applied before deploying new application versions. This ensures the database structure matches what the code expects. -### Install GitHub Actions Runner +### Using ECS Run Task (Recommended for Production) -* On your remote server, create a user for your GitHub Actions: +Run migrations as a one-time ECS task: ```bash -sudo adduser github +aws ecs run-task \ + --cluster {prefix}-cluster \ + --task-definition {migration-task-def} \ + --region {YOUR_AWS_REGION} ``` -* Add Docker permissions to the `github` user: +This runs the migration in the same environment as your application, ensuring consistency. -```bash -sudo usermod -aG docker github -``` +### Local Migration (Development/Testing) -* Temporarily switch to the `github` user: +For testing migrations locally: ```bash -sudo su - github +cd backend +uv run alembic upgrade head ``` -* Go to the `github` user's home directory: - -```bash -cd -``` +**Important**: Always test migrations in staging before applying to production. -* [Install a GitHub Action self-hosted runner following the official guide](https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/adding-self-hosted-runners#adding-a-self-hosted-runner-to-a-repository). +## Monitoring & Observability -* When asked about labels, add a label for the environment, e.g. `production`. You can also add labels later. - -After installing, the guide would tell you to run a command to start the runner. Nevertheless, it would stop once you terminate that process or if your local connection to your server is lost. - -To make sure it runs on startup and continues running, you can install it as a service. 
To do that, exit the `github` user and go back to the `root` user: +### AWS CloudWatch +**Logs**: View application logs from ECS tasks ```bash -exit +aws logs tail /ecs/{cluster}/{service} --follow ``` -After you do it, you will be on the previous user again. And you will be on the previous directory, belonging to that user. +**Metrics**: Monitor CPU, memory, request count, error rates -Before being able to go the `github` user directory, you need to become the `root` user (you might already be): +### ECS Console -```bash -sudo su -``` +- View running tasks and their health status +- Check deployment status and history +- Monitor service events and errors -* As the `root` user, go to the `actions-runner` directory inside of the `github` user's home directory: +### Health Checks -```bash -cd /home/github/actions-runner -``` +ECS performs health checks on the `/api/v1/utils/health/` endpoint. If this fails, tasks are replaced automatically. -* Install the self-hosted runner as a service with the user `github`: +## Rollback Procedures -```bash -./svc.sh install github -``` +If a deployment introduces issues, rollback to a previous stable version. -* Start the service: +### List Previous Versions ```bash -./svc.sh start +aws ecs list-task-definitions --family-prefix {prefix} ``` -* Check the status of the service: +### Rollback to Previous Version ```bash -./svc.sh status +aws ecs update-service \ + --cluster {prefix}-cluster \ + --service {prefix}-service \ + --task-definition {previous-task-def-arn} \ + --region {YOUR_AWS_REGION} ``` -You can read more about it in the official guide: [Configuring the self-hosted runner application as a service](https://docs.github.com/en/actions/hosting-your-own-runners/managing-self-hosted-runners/configuring-the-self-hosted-runner-application-as-a-service). - -### Set Secrets - -On your repository, configure secrets for the environment variables you need, the same ones described above, including `SECRET_KEY`, etc. Follow the [official GitHub guide for setting repository secrets](https://docs.github.com/en/actions/security-guides/using-secrets-in-github-actions#creating-secrets-for-a-repository). - -The current Github Actions workflows expect these secrets: - -* `DOMAIN_PRODUCTION` -* `DOMAIN_STAGING` -* `STACK_NAME_PRODUCTION` -* `STACK_NAME_STAGING` -* `EMAILS_FROM_EMAIL` -* `FIRST_SUPERUSER` -* `FIRST_SUPERUSER_PASSWORD` -* `POSTGRES_PASSWORD` -* `SECRET_KEY` -* `LATEST_CHANGES` -* `SMOKESHOW_AUTH_KEY` - -## GitHub Action Deployment Workflows - -There are GitHub Action workflows in the `.github/workflows` directory already configured for deploying to the environments (GitHub Actions runners with the labels): - -* `staging`: after pushing (or merging) to the branch `master`. -* `production`: after publishing a release. - -If you need to add extra environments you could use those as a starting point. - -## URLs - -Replace `fastapi-project.example.com` with your domain. 
-
-### Main Traefik Dashboard
-
-Traefik UI: `https://traefik.fastapi-project.example.com`
-
-### Production
-
-Frontend: `https://dashboard.fastapi-project.example.com`
-
-Backend API docs: `https://api.fastapi-project.example.com/docs`
-
-Backend API base URL: `https://api.fastapi-project.example.com`
-
-Adminer: `https://adminer.fastapi-project.example.com`
-
-### Staging
-
-Frontend: `https://dashboard.staging.fastapi-project.example.com`
-
-Backend API docs: `https://api.staging.fastapi-project.example.com/docs`
-
-Backend API base URL: `https://api.staging.fastapi-project.example.com`
+ECS will perform a rolling update back to the specified task definition.

-Adminer: `https://adminer.staging.fastapi-project.example.com`
+**Tip**: Keep track of stable task definition ARNs for quick rollbacks.
diff --git a/development.md b/development.md
index f927352a3..200508125 100644
--- a/development.md
+++ b/development.md
@@ -1,4 +1,4 @@
-# AI Platform - Development
+# Kaapi - Development

## Docker Compose

@@ -10,15 +10,17 @@ docker compose watch

* Now you can open your browser and interact with these URLs:

-Frontend, built with Docker, with routes handled based on the path: http://localhost:5173
+Backend, JSON-based web API based on OpenAPI: http://localhost:8000

-Backend, JSON based web API based on OpenAPI: http://localhost:8000
+Automatic interactive documentation with Swagger UI (from the OpenAPI backend): http://localhost:8000/docs

-Automatic interactive documentation with Swagger UI (from the OpenAPI backend): http://localhost:8000/docs
+Alternative interactive documentation (ReDoc): http://localhost:8000/redoc

-Adminer, database web administration: http://localhost:8080
+Adminer, database web administration: http://localhost:8080

-Traefik UI, to see how the routes are being handled by the proxy: http://localhost:8090
+RabbitMQ Management UI: http://localhost:15672
+
+Celery Flower (task monitoring): http://localhost:5555

**Note**: The first time you start your stack, it might take a minute for it to be ready. While the backend waits for the database to be ready and configures everything. You can check the logs to monitor it.

@@ -38,75 +40,102 @@ docker compose logs backend

The Docker Compose files are configured so that each of the services is available in a different port in `localhost`.

-For the backend and frontend, they use the same port that would be used by their local development server, so, the backend is at `http://localhost:8000` and the frontend at `http://localhost:5173`.
+You can stop the `backend` Docker Compose service and run the local development server instead:

-This way, you could turn off a Docker Compose service and start its local development service, and everything would keep working, because it all uses the same ports.
+```bash
+docker compose stop backend
+```

-For example, you can stop that `frontend` service in the Docker Compose, in another terminal, run:
+And then start the local development server for the backend:

```bash
-docker compose stop frontend
+cd backend
+fastapi run --reload app/main.py
```

-And then start the local frontend development server:
+This way the backend runs on the same port (8000) whether it's in Docker or running locally.
+
+### Running Completely Without Docker
+
+If you want to run everything locally without Docker, you'll need to set up RabbitMQ, Redis, Celery, and optionally Celery Flower manually.
+
+#### 1. Install and Start RabbitMQ & Redis
+
+**macOS (using Homebrew):**

```bash
-cd frontend
-npm run dev
+brew install rabbitmq redis
+
+brew services start rabbitmq
+brew services start redis
```

-Or you could stop the `backend` Docker Compose service:
+**Ubuntu / Debian:**

```bash
-docker compose stop backend
+sudo apt update
+sudo apt install rabbitmq-server redis-server
+
+sudo systemctl enable --now rabbitmq-server
+sudo systemctl enable --now redis-server
```

-And then you can run the local development server for the backend:
+**Verify services are running:**
+
+- RabbitMQ Management UI: http://localhost:15672 (default credentials: `guest`/`guest`)
+- Redis: `redis-cli ping` (should return `PONG`)
+
+#### 2. Start the Backend Server
+
+In your first terminal:

```bash
cd backend
-fastapi dev app/main.py
+fastapi run --reload app/main.py
```

-## Docker Compose in `localhost.tiangolo.com`
-
-When you start the Docker Compose stack, it uses `localhost` by default, with different ports for each service (backend, frontend, adminer, etc).
+The backend will be available at http://localhost:8000

-When you deploy it to production (or staging), it will deploy each service in a different subdomain, like `api.example.com` for the backend and `dashboard.example.com` for the frontend.
+#### 3. Start Celery Worker

-In the guide about [deployment](deployment.md) you can read about Traefik, the configured proxy. That's the component in charge of transmitting traffic to each service based on the subdomain.
+In a second terminal, start the Celery worker:

-If you want to test that it's all working locally, you can edit the local `.env` file, and change:
-
-```dotenv
-DOMAIN=localhost.tiangolo.com
+```bash
+cd backend
+uv run celery -A app.celery.celery_app worker --loglevel=info
```

-That will be used by the Docker Compose files to configure the base domain for the services.
+Leave this process running. This handles background tasks like document processing and LLM job execution.

-Traefik will use this to transmit traffic at `api.localhost.tiangolo.com` to the backend, and traffic at `dashboard.localhost.tiangolo.com` to the frontend.
+#### 4. (Optional) Start Celery Flower for Task Monitoring

-The domain `localhost.tiangolo.com` is a special domain that is configured (with all its subdomains) to point to `127.0.0.1`. This way you can use that for your local development.
+Flower provides a web UI to monitor Celery tasks and workers.

-After you update it, run again:
+**Start Flower in a third terminal:**

```bash
-docker compose watch
+cd backend
+uv run celery -A app.celery.celery_app flower --port=5555
```

-When deploying, for example in production, the main Traefik is configured outside of the Docker Compose files. For local development, there's an included Traefik in `docker-compose.override.yml`, just to let you test that the domains work as expected, for example with `api.localhost.tiangolo.com` and `dashboard.localhost.tiangolo.com`.
+Flower will be available at: http://localhost:5555

-## Docker Compose files and env vars
+> **Note:** If you start Flower before running any Celery workers, you may see warnings like:
+> ```text
+> WARNING - flower.inspector - Inspect method ... failed
+> ```
+> This just means there are no active workers yet. Once you start a Celery worker (step 3),
+> Flower will be able to inspect it and the warnings will stop.

-There is a main `docker-compose.yml` file with all the configurations that apply to the whole stack, it is used automatically by `docker compose`.
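+
+Once the backend, worker, and Flower are all running, a quick sanity check can confirm the local stack is healthy. This is a minimal sketch: it assumes the default ports used above and the `/api/v1/utils/health/` route referenced in the deployment guide.
+
+```bash
+# Backend health endpoint (same route ECS uses for health checks)
+curl http://localhost:8000/api/v1/utils/health/
+
+# Flower UI should respond on the port passed via --port
+curl -I http://localhost:5555
+
+# Redis and RabbitMQ
+redis-cli ping        # expected output: PONG
+rabbitmqctl status    # should report a running RabbitMQ node
+```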
+---
-And there's also a `docker-compose.override.yml` with overrides for development, for example to mount the source code as a volume. It is used automatically by `docker compose` to apply overrides on top of `docker-compose.yml`.
+## Docker Compose files and env vars

-These Docker Compose files use the `.env` file containing configurations to be injected as environment variables in the containers.
+The `docker-compose.yml` file contains all the configurations for the stack, including services like PostgreSQL, Redis, RabbitMQ, backend, Celery workers, and Adminer.

-They also use some additional configurations taken from environment variables set in the scripts before calling the `docker compose` command.
+The Docker Compose file uses the `.env` file containing configurations to be injected as environment variables in the containers.

-After changing variables, make sure you restart the stack:
+After changing environment variables, make sure you restart the stack:

```bash
docker compose watch
@@ -166,42 +195,18 @@ eslint...................................................................Passed
prettier.................................................................Passed
```

-## URLs
-
-The production or staging URLs would use these same paths, but with your own domain.
-
-### Development URLs
-
-Development URLs, for local development.
-
-Frontend: http://localhost:5173
-
-Backend: http://localhost:8000
-
-Automatic Interactive Docs (Swagger UI): http://localhost:8000/docs
-
-Automatic Alternative Docs (ReDoc): http://localhost:8000/redoc
-
-Adminer: http://localhost:8080
-
-Traefik UI: http://localhost:8090
-
-MailCatcher: http://localhost:1080
-
-### Development URLs with `localhost.tiangolo.com` Configured
-
-Development URLs, for local development.
+## Development URLs

-Frontend: http://dashboard.localhost.tiangolo.com
+All services are available on localhost with different ports:

-Backend: http://api.localhost.tiangolo.com
+**Backend**: http://localhost:8000

-Automatic Interactive Docs (Swagger UI): http://api.localhost.tiangolo.com/docs
+**Swagger UI** (Interactive API Docs): http://localhost:8000/docs

-Automatic Alternative Docs (ReDoc): http://api.localhost.tiangolo.com/redoc
+**ReDoc** (Alternative API Docs): http://localhost:8000/redoc

-Adminer: http://localhost.tiangolo.com:8080
+**Adminer** (Database Management): http://localhost:8080

-Traefik UI: http://localhost.tiangolo.com:8090
+**RabbitMQ Management**: http://localhost:15672 (username: guest, password: guest)

-MailCatcher: http://localhost.tiangolo.com:1080
+**Celery Flower** (Task Monitoring): http://localhost:5555
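+
+If one of these URLs doesn't respond, you can check which containers are running and which host ports they publish. A short sketch using standard Docker Compose commands; the service names come from your `docker-compose.yml`:
+
+```bash
+# List services, their state, and the published host ports
+docker compose ps
+
+# Follow the logs of a single service, e.g. the backend
+docker compose logs -f backend
+```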