[ENH] V1 → V2 API Migration - evaluations #1606
Open
EmanAbdelhaleem wants to merge 8 commits into openml:main from EmanAbdelhaleem:evaluations-mig
Commits
0159f47 set up folder structure and base code (geetu040)
58e9175 Merge branch 'main' into migration (fkiraly)
bdd65ff Merge branch 'main' into migration (geetu040)
52ef379 fix pre-commit (geetu040)
f1e7a3c [ENH] V1 → V2 API Migration - evaluations (EmanAbdelhaleem)
f0d6889 Merge branch 'main' into pr/1606 (fkiraly)
92bd292 update list function, remove circular imports (EmanAbdelhaleem)
e7fd1bc update list function, remove circular imports (EmanAbdelhaleem)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,8 @@ | ||
| from openml._api.runtime.core import APIContext | ||
|
|
||
|
|
||
| def set_api_version(version: str, *, strict: bool = False) -> None: | ||
| api_context.set_version(version=version, strict=strict) | ||
|
|
||
|
|
||
| api_context = APIContext() |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| from __future__ import annotations | ||
|
|
||
| API_V1_SERVER = "https://www.openml.org/api/v1/xml" | ||
| API_V2_SERVER = "http://127.0.0.1:8001" | ||
| API_KEY = "..." |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,3 @@ | ||
| from openml._api.http.client import HTTPClient | ||
|
|
||
| __all__ = ["HTTPClient"] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,39 @@ | ||
| from __future__ import annotations | ||
|
|
||
| from typing import Any, Mapping | ||
|
|
||
| import requests | ||
| from requests import Response | ||
|
|
||
| from openml.__version__ import __version__ | ||
|
|
||
|
|
||
| class HTTPClient: | ||
| def __init__(self, base_url: str) -> None: | ||
| self.base_url = base_url | ||
| self.headers: dict[str, str] = {"user-agent": f"openml-python/{__version__}"} | ||
|
|
||
| def get( | ||
| self, | ||
| path: str, | ||
| params: Mapping[str, Any] | None = None, | ||
| ) -> Response: | ||
| url = f"{self.base_url}/{path}" | ||
| return requests.get(url, params=params, headers=self.headers, timeout=10) | ||
|
|
||
| def post( | ||
| self, | ||
| path: str, | ||
| data: Mapping[str, Any] | None = None, | ||
| files: Any = None, | ||
| ) -> Response: | ||
| url = f"{self.base_url}/{path}" | ||
| return requests.post(url, data=data, files=files, headers=self.headers, timeout=10) | ||
|
|
||
| def delete( | ||
| self, | ||
| path: str, | ||
| params: Mapping[str, Any] | None = None, | ||
| ) -> Response: | ||
| url = f"{self.base_url}/{path}" | ||
| return requests.delete(url, params=params, headers=self.headers, timeout=10) |
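The `HTTPClient` above builds each URL as `f"{self.base_url}/{path}"`, so a trailing slash on `base_url` or a leading slash on `path` would produce a doubled slash. A minimal sketch of one way to normalize the join follows; `join_url` is a hypothetical helper for illustration and is not part of this PR.

```python
def join_url(base_url: str, path: str) -> str:
    """Join a base URL and a relative path with exactly one slash between them.

    Hypothetical helper, not part of the PR; it only trims slashes at the seam.
    """
    return f"{base_url.rstrip('/')}/{path.lstrip('/')}"


# The plain f-string join used in the diff keeps both slashes.
base = "https://www.openml.org/api/v1/xml/"
naive = f"{base}/user/list/user_id/1,2"

# The normalized join collapses them to a single separator.
normalized = join_url(base, "/user/list/user_id/1,2")

print(naive)
print(normalized)
```

Whether such normalization is wanted depends on how callers are expected to pass `base_url`; the diff as shown relies on callers supplying a base without a trailing slash.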
Empty file.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| from openml._api.resources.datasets import DatasetsV1, DatasetsV2 | ||
| from openml._api.resources.evaluations import EvaluationsV1, EvaluationsV2 | ||
| from openml._api.resources.tasks import TasksV1, TasksV2 | ||
|
|
||
| __all__ = ["DatasetsV1", "DatasetsV2", "TasksV1", "TasksV2", "EvaluationsV1", "EvaluationsV2"] |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,45 @@ | ||
| from __future__ import annotations | ||
|
|
||
| from abc import ABC, abstractmethod | ||
| from typing import TYPE_CHECKING, Any | ||
|
|
||
| if TYPE_CHECKING: | ||
| from requests import Response | ||
|
|
||
| from openml._api.http import HTTPClient | ||
| from openml.datasets.dataset import OpenMLDataset | ||
| from openml.tasks.task import OpenMLTask | ||
|
|
||
|
|
||
| class ResourceAPI: | ||
| def __init__(self, http: HTTPClient): | ||
| self._http = http | ||
|
|
||
|
|
||
| class DatasetsAPI(ResourceAPI, ABC): | ||
| @abstractmethod | ||
| def get(self, dataset_id: int) -> OpenMLDataset | tuple[OpenMLDataset, Response]: ... | ||
|
|
||
|
|
||
| class TasksAPI(ResourceAPI, ABC): | ||
| @abstractmethod | ||
| def get( | ||
| self, | ||
| task_id: int, | ||
| *, | ||
| return_response: bool = False, | ||
| ) -> OpenMLTask | tuple[OpenMLTask, Response]: ... | ||
|
|
||
|
|
||
| class EvaluationsAPI(ResourceAPI, ABC): | ||
| @abstractmethod | ||
| def list( | ||
| self, | ||
| limit: int, | ||
| offset: int, | ||
| function: str, | ||
| **kwargs: Any, | ||
| ) -> dict: ... | ||
|
|
||
| @abstractmethod | ||
| def get_users(self, uploader_ids: list[str]) -> dict: ... | ||
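Because `ResourceAPI` receives its HTTP client through the constructor, a resource class can be exercised without network access by passing in a stand-in object. The sketch below re-declares simplified versions of the base classes from the diff so that it runs on its own. `RecordingHTTP` and `EchoEvaluations` are hypothetical names used only for illustration, and the stub returns a plain string rather than a `requests.Response`.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any


class ResourceAPI:
    def __init__(self, http: Any) -> None:
        self._http = http


class EvaluationsAPI(ResourceAPI, ABC):
    @abstractmethod
    def list(self, limit: int, offset: int, function: str, **kwargs: Any) -> dict: ...

    @abstractmethod
    def get_users(self, uploader_ids: list[str]) -> dict: ...


class RecordingHTTP:
    """Hypothetical stand-in for HTTPClient that records requested paths."""

    def __init__(self) -> None:
        self.paths = []

    def get(self, path: str, params: Any = None) -> str:
        self.paths.append(path)
        return "<stub/>"


class EchoEvaluations(EvaluationsAPI):
    """Hypothetical concrete implementation used only to show the wiring."""

    def list(self, limit: int, offset: int, function: str, **kwargs: Any) -> dict:
        self._http.get(f"evaluation/list/function/{function}")
        return {"function": function, "limit": limit, "offset": offset}

    def get_users(self, uploader_ids: list[str]) -> dict:
        return {uid: f"user-{uid}" for uid in uploader_ids}


http = RecordingHTTP()
api = EchoEvaluations(http)
result = api.list(10, 0, "predictive_accuracy")

# The abstract base itself cannot be instantiated while methods are abstract.
try:
    EvaluationsAPI(http)
    abstract_instantiable = True
except TypeError:
    abstract_instantiable = False
```

The same injection point is what would let the V1 and V2 implementations share one client type while targeting different servers.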
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,20 @@ | ||
| from __future__ import annotations | ||
|
|
||
| from typing import TYPE_CHECKING | ||
|
|
||
| from openml._api.resources.base import DatasetsAPI | ||
|
|
||
| if TYPE_CHECKING: | ||
| from requests import Response | ||
|
|
||
| from openml.datasets.dataset import OpenMLDataset | ||
|
|
||
|
|
||
| class DatasetsV1(DatasetsAPI): | ||
| def get(self, dataset_id: int) -> OpenMLDataset | tuple[OpenMLDataset, Response]: | ||
| raise NotImplementedError | ||
|
|
||
|
|
||
| class DatasetsV2(DatasetsAPI): | ||
| def get(self, dataset_id: int) -> OpenMLDataset | tuple[OpenMLDataset, Response]: | ||
| raise NotImplementedError |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,206 @@ | ||
| from __future__ import annotations | ||
|
|
||
| from typing import Any | ||
|
|
||
| import xmltodict | ||
|
|
||
| from openml._api.resources.base import EvaluationsAPI | ||
|
|
||
|
|
||
| class EvaluationsV1(EvaluationsAPI): | ||
| """V1 API implementation for evaluations. | ||
| Fetches evaluations from the v1 XML API endpoint. | ||
| """ | ||
|
|
||
| def list( | ||
| self, | ||
| limit: int, | ||
| offset: int, | ||
| function: str, | ||
| **kwargs: Any, | ||
| ) -> dict: | ||
| """Retrieve evaluations from the OpenML v1 XML API. | ||
|
|
||
| This method builds an evaluation query URL based on the provided | ||
| filters, sends a request to the OpenML v1 endpoint, parses the XML | ||
| response into a dictionary, and enriches the result with uploader | ||
| usernames. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| limit : int | ||
| Maximum number of evaluations to return. | ||
| offset : int | ||
| Offset for pagination. | ||
| function : str | ||
| The evaluation function, e.g., predictive_accuracy. | ||
| **kwargs | ||
| Optional filters supported by the OpenML evaluation API, such as: | ||
| - tasks | ||
| - setups | ||
| - flows | ||
| - runs | ||
| - uploaders | ||
| - tag | ||
| - study | ||
| - sort_order | ||
|
|
||
| Returns | ||
| ------- | ||
| dict | ||
| A dictionary containing: | ||
| - Parsed evaluation data from the XML response | ||
| - A "users" key mapping uploader IDs to usernames | ||
|
|
||
| Raises | ||
| ------ | ||
| ValueError | ||
| If the XML response does not contain the expected structure. | ||
| AssertionError | ||
| If the evaluation data is not in list format as expected. | ||
|
|
||
| Notes | ||
| ----- | ||
| This method performs two API calls: | ||
| 1. Fetches evaluation data from the specified endpoint | ||
| 2. Fetches user information for all uploaders in the evaluation data | ||
|
|
||
| The user information is used to map uploader IDs to usernames. | ||
| """ | ||
| api_call = self._build_url(limit, offset, function, **kwargs) | ||
| eval_response = self._http.get(api_call) | ||
| xml_content = eval_response.text | ||
|
|
||
| evals_dict: dict[str, Any] = xmltodict.parse(xml_content, force_list=("oml:evaluation",)) | ||
| # Minimalistic check if the XML is useful | ||
| if "oml:evaluations" not in evals_dict: | ||
| raise ValueError( | ||
| "Error in return XML, does not contain " f'"oml:evaluations": {evals_dict!s}', | ||
| ) | ||
|
|
||
| assert isinstance(evals_dict["oml:evaluations"]["oml:evaluation"], list), ( | ||
| "Expected 'oml:evaluation' to be a list, but got " | ||
| f"{type(evals_dict['oml:evaluations']['oml:evaluation']).__name__}." | ||
| ) | ||
|
|
||
| uploader_ids = list( | ||
| {eval_["oml:uploader"] for eval_ in evals_dict["oml:evaluations"]["oml:evaluation"]}, | ||
| ) | ||
| user_dict = self.get_users(uploader_ids) | ||
| evals_dict["users"] = user_dict | ||
|
|
||
| return evals_dict | ||
|
|
||
| def get_users(self, uploader_ids: list[str]) -> dict: | ||
| """ | ||
| Retrieve usernames for a list of OpenML user IDs. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| uploader_ids : list[str] | ||
| List of OpenML user IDs. | ||
|
|
||
| Returns | ||
| ------- | ||
| dict | ||
| A mapping from user ID (str) to username (str). | ||
| """ | ||
| api_users = "user/list/user_id/" + ",".join(uploader_ids) | ||
| user_response = self._http.get(api_users) | ||
| xml_content_user = user_response.text | ||
|
|
||
| users = xmltodict.parse(xml_content_user, force_list=("oml:user",)) | ||
| return {user["oml:id"]: user["oml:username"] for user in users["oml:users"]["oml:user"]} | ||
|
|
||
| def _build_url( | ||
| self, | ||
| limit: int, | ||
| offset: int, | ||
| function: str, | ||
| **kwargs: Any, | ||
| ) -> str: | ||
| """ | ||
| Construct an OpenML evaluation API URL with filtering parameters. | ||
|
|
||
| Parameters | ||
| ---------- | ||
| limit : int | ||
| Maximum number of evaluations to return. | ||
| offset : int | ||
| Offset for pagination. | ||
| function : str | ||
| The evaluation function, e.g., predictive_accuracy. | ||
| **kwargs | ||
| Evaluation filters such as task IDs, flow IDs, | ||
| uploader IDs, study name, and sorting options. | ||
|
|
||
| Returns | ||
| ------- | ||
| str | ||
| A relative API path suitable for an OpenML HTTP request. | ||
| """ | ||
| api_call = f"evaluation/list/function/{function}" | ||
| if limit is not None: | ||
| api_call += f"/limit/{limit}" | ||
| if offset is not None: | ||
| api_call += f"/offset/{offset}" | ||
|
|
||
| # List-based filters | ||
| list_filters = { | ||
| "task": kwargs.get("tasks"), | ||
| "setup": kwargs.get("setups"), | ||
| "flow": kwargs.get("flows"), | ||
| "run": kwargs.get("runs"), | ||
| "uploader": kwargs.get("uploaders"), | ||
| } | ||
|
|
||
| for name, values in list_filters.items(): | ||
| if values is not None: | ||
| api_call += f"/{name}/" + ",".join(str(int(v)) for v in values) | ||
|
|
||
| # Single-value filters | ||
| if kwargs.get("study") is not None: | ||
| api_call += f"/study/{kwargs['study']}" | ||
|
|
||
| if kwargs.get("sort_order") is not None: | ||
| api_call += f"/sort_order/{kwargs['sort_order']}" | ||
|
|
||
| # Extra filters (tag, per_fold, future-proof) | ||
| for key in ("tag", "per_fold"): | ||
| value = kwargs.get(key) | ||
| if value is not None: | ||
| api_call += f"/{key}/{value}" | ||
|
|
||
| return api_call | ||
|
|
||
|
|
||
| class EvaluationsV2(EvaluationsAPI): | ||
| """V2 API implementation for evaluations. | ||
| Fetches evaluations from the v2 JSON API endpoint. | ||
| """ | ||
|
|
||
| def list( | ||
| self, | ||
| limit: int, | ||
| offset: int, | ||
| function: str, | ||
| **kwargs: Any, | ||
| ) -> dict: | ||
| """ | ||
| Retrieve evaluation results from the OpenML v2 JSON API. | ||
|
|
||
| Notes | ||
| ----- | ||
| This method is not yet implemented. | ||
| """ | ||
| raise NotImplementedError("V2 API implementation is not yet available") | ||
|
|
||
| def get_users(self, uploader_ids: list[str]) -> dict: | ||
| """ | ||
| Retrieve usernames for a list of OpenML user IDs using the v2 API. | ||
|
|
||
| Notes | ||
| ----- | ||
| This method is not yet implemented. | ||
| """ | ||
| raise NotImplementedError("V2 API implementation is not yet available") |
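To make the path produced by `_build_url` concrete, the following sketch mirrors its logic as a standalone function and shows the result for one set of filters. `build_evaluation_path` is a hypothetical name for this illustration. The `int | None` hints and the single loop over the scalar filters are simplifications assumed here; the filter order matches the diff.

```python
from __future__ import annotations

from typing import Any


def build_evaluation_path(
    limit: int | None,
    offset: int | None,
    function: str,
    **kwargs: Any,
) -> str:
    """Standalone mirror of the `_build_url` logic shown in the diff."""
    path = f"evaluation/list/function/{function}"
    if limit is not None:
        path += f"/limit/{limit}"
    if offset is not None:
        path += f"/offset/{offset}"

    # List-based filters: plural keyword in, singular path segment out.
    list_filters = {
        "task": kwargs.get("tasks"),
        "setup": kwargs.get("setups"),
        "flow": kwargs.get("flows"),
        "run": kwargs.get("runs"),
        "uploader": kwargs.get("uploaders"),
    }
    for name, values in list_filters.items():
        if values is not None:
            path += f"/{name}/" + ",".join(str(int(v)) for v in values)

    # Single-value filters, appended in the same order as in the diff.
    for key in ("study", "sort_order", "tag", "per_fold"):
        value = kwargs.get(key)
        if value is not None:
            path += f"/{key}/{value}"
    return path


example = build_evaluation_path(
    100,
    0,
    "predictive_accuracy",
    tasks=[31, 37],
    uploaders=["1"],
    sort_order="desc",
)
print(example)
# evaluation/list/function/predictive_accuracy/limit/100/offset/0/task/31,37/uploader/1/sort_order/desc
```

Note that the list filters pass every value through `int(...)`, so a non-numeric ID raises `ValueError` before any request is made.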
I am a bit skeptical about this
Evaluations api_context or rely on a utility function in users/functions? user used elsewhere in code?