Skip to content

Conversation

@po-nuvai
Copy link

What's in this PR

Hey! This PR tackles a bunch of open issues that have been sitting around.

Bug Fixes

#191 - Instance won't be None
The check was using is None but the function returns InvalidConfig(), not None. Changed it to use isinstance() instead.

#312 - Adjacent neighborhood returns invalid configs
The hill climber was getting neighbor values that could create invalid configs when only one param was changed. Fixed get_param_neighbors() to only return values from neighbors that differ at exactly one index.

#333 - Restrictions not backwards compatible
Old restriction functions expected keyword args but the new code was passing positional args. Added a wrapper to convert them back to kwargs so old code keeps working.

#339 - NVCUDA backend broken with cuda-python 13
cuda-python 13 moved the modules around (cuda.cudacuda.bindings.driver). Added fallback imports so both old and new versions work.

#348 - Lambda restrictions with captured variables fail
When a lambda like lambda p: p['x'] <= limit gets converted to a string, the limit variable is lost. Now we detect closures and keep those lambdas as-is.

New Stuff

#99 - Regression tests
Added a test setup that compares tuning results against known baseline data. Uses the existing vector_add cache file from RTX A4000 as the baseline.

Testing

  • All existing tests pass (213 passed, 51 skipped)
  • New regression tests pass (4 tests)
  • The one failing test (pep440 module) was already broken before these changes

po-nuvai and others added 2 commits January 15, 2026 13:07
Implements Issue KernelTuner#99: regression tests that store tuning data for a
well-known kernel (vector_add) on a well-known GPU (RTX A4000) to
verify Kernel Tuner still measures correctly over time.

Tests include:
- Simulation mode regression test comparing against baseline
- Baseline file integrity validation
- Configuration coverage check
- Timing value sanity checks

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…ernelTuner#339, KernelTuner#348

Bug fixes included:

KernelTuner#191 - Instance won't be None
  - Changed check from `is None` to `isinstance(InvalidConfig)`
  - Fixed in interface.py

KernelTuner#312 - Adjacent neighborhood returns invalid configs
  - Fixed get_param_neighbors() to only return values from neighbors
    that differ ONLY at the specified index
  - Fixed in searchspace.py

KernelTuner#333 - Restrictions not backwards compatible
  - Added wrapper to convert positional args to keyword args
  - Maintains backwards compatibility with old restriction functions
  - Fixed in searchspace.py

KernelTuner#339 - NVCUDA backend not compatible with cuda-python 13
  - Added support for both old (cuda.cuda) and new (cuda.bindings.driver)
    import structures
  - Fixed in nvcuda.py, util.py, observers/nvcuda.py, test_cuda_functions.py

KernelTuner#348 - Restriction lambda with captured variables don't work
  - Added has_closure_variables() check to detect closures
  - Lambdas with closures are kept as-is instead of being converted
    to strings (which loses the closure context)
  - Fixed in util.py

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@sonarqubecloud
Copy link

@benvanwerkhoven
Copy link
Collaborator

Thank your for contribution to Kernel Tuner. While we generally welcome pull requests from new contributors, there are a few issues or difficulties with your pull request.

Unfortunately, this pull request contains fixes for issues that already have been fixed or are currently being worked on by pull requests that are already open. The fact that your pull request ignores, and if merged invalidates, the work that others have or are currently putting into our project is not beneficial for fostering collaboration. Opening one large pull request for unrelated changes is also against the rules laid out in our contribution guide. Significant changes should also be discussed first.

We currently do not have a lot of bandwidth for code development and maintenance, which is also why certain issues have been open for a while. I'm worried that merging this pull request, or even parts of it, only adds more AI generated code for us to maintain. Another concern is that, we all have access to Gen AI coding tools ourselves. If we wanted to have this code in, we would have included it by now.

For example, the regression test code adds a lot of code on a subject that is particularly tricky. For such performance tests, we generally need a controlled environment setup with exclusive access to a GPU, sudo rights to fix the GPU clock frequency to a particular frequency and everything to be sure that we have a controlled environment. This is not something that we want to make part of the default suite of tests to run after every commit, and what exactly the regression test itself is testing for needs to be thought through.

Please first look at the open pull requests and first discuss the changes that you would like to make before created a pull request. This is also mentioned in our contribution guide.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants