
Sorting out issue with calling predict on matrix #219

Merged (13 commits) on Apr 24, 2023

Conversation

@pat-alt commented Mar 30, 2023

Closes #218

Review thread on src/core.jl (outdated, resolved).
@ablaom left a comment

@pat-alt Thanks for looking at this.

Yes, I think your proposal ought to work: matrices can be provided not only to predict, but also to fit, which we should support for consistency.

What's left to do is:

  • Add tests
  • Update the scitype declarations
  • Update the docstrings

For example, this code should be changed to:

```julia
MLJModelInterface.metadata_model(NeuralNetworkClassifier,
                                 input=Union{AbstractMatrix{Continuous}, Table(Continuous)},
                                 target=AbstractVector{<:Finite},
                                 path="MLJFlux.NeuralNetworkClassifier")
```

Similar changes can be made in regressor.jl.

@pat-alt commented Apr 5, 2023

I've updated the scitype declarations for regression (here and here) and for classification (here). For testing, I thought I'd just rerun basic tests with a matrix input (see here). Unfortunately, I can't immediately make sense of the error I'm getting.

@ablaom commented Apr 5, 2023

Thanks for persisting with this.

This error usually means we're trying to grab the schema of an object that doesn't have one. I believe the culprit is here. We could add a second method dispatching on matrices to address this. There are similar issues for `shape` in regressor.jl.
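To illustrate the dispatch pattern (a minimal sketch only; the helper name `ncols` is hypothetical, and the actual fix belongs in the `shape` methods referred to above):

```julia
import Tables

# A table exposes its width through its schema; a plain matrix has no
# schema, so a second method reads the size directly instead.
ncols(X) = length(Tables.schema(X).names)   # tabular input
ncols(X::AbstractMatrix) = size(X, 2)       # matrix input: one column per feature
```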

@pat-alt commented Apr 7, 2023

Thanks! I believe that's fixed now, though I did still get some warnings like this one:

```
Warning: The number and/or types of data arguments do not match what the specified model
│ supports. Suppress this type check by specifying `scitype_check_level=0`.
│ 
│ Run `@doc MLJFlux.NeuralNetworkRegressor` to learn more about your model's requirements.
│ 
│ Commonly, but non exclusively, supervised models are constructed using the syntax
│ `machine(model, X, y)` or `machine(model, X, y, w)` while most other models are
│ constructed with `machine(model, X)`.  Here `X` are features, `y` a target, and `w`
│ sample or class weights.
│ 
│ In general, data in `machine(model, data...)` is expected to satisfy
│ 
│     scitype(data) <: MLJ.fit_data_scitype(model)
│ 
│ In the present case:
│ 
│ scitype(data) = Tuple{Table{AbstractVector{Continuous}}, AbstractVector{Continuous}}
│ 
│ fit_data_scitype(model) = Tuple{Union{Table{<:AbstractVector{<:Continuous}}, AbstractMatrix{Continuous}}, AbstractVector{<:Finite}}
```

Edit: As for the docstrings, I've updated those for the classifier and the regressors as follows:

- `X` is either a `Matrix` or any table of input features (eg, a `DataFrame`) whose columns are of scitype
  `Continuous`; check column scitypes with `schema(X)`. If `X` is a `Matrix`, it is assumed to have columns corresponding to features and rows corresponding to observations.
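In practice this means a plain matrix can be passed straight to a machine (a quick sketch, assuming MLJ and MLJFlux are installed; the data here is made up):

```julia
using MLJ, MLJFlux

X = rand(Float32, 100, 4)   # 100 observations (rows), 4 features (columns)
y = rand(Float32, 100)      # Continuous target

model = MLJFlux.NeuralNetworkRegressor()
mach = machine(model, X, y)
fit!(mach)                  # matrix input accepted directly, no table wrapping
yhat = predict(mach, X)
```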

@ablaom commented Apr 10, 2023

> Thanks! I believe that's fixed now, though I did still get some warnings like this one:

According to the warning, you are providing the regressor with a `Continuous` target rather than a `Finite` one (a categorical vector).

@ablaom commented Apr 11, 2023

Sorry, rather: the scitype declaration for the regressor is wrong. You need `Continuous` in the `target_scitype`, where you appear to have `Finite`.
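Concretely, the fix should look something like this (a sketch modelled on the classifier example earlier; the exact call in regressor.jl may differ):

```julia
MLJModelInterface.metadata_model(NeuralNetworkRegressor,
                                 input=Union{AbstractMatrix{Continuous}, Table(Continuous)},
                                 target=AbstractVector{Continuous},  # was incorrectly AbstractVector{<:Finite}
                                 path="MLJFlux.NeuralNetworkRegressor")
```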

@codecov-commenter commented Apr 12, 2023

Codecov Report

Patch coverage: 100.00%; project coverage change: +0.14% 🎉

Comparison is base (452c09d) 92.73% compared to head (49d11af) 92.88%.


Additional details and impacted files:

```diff
@@            Coverage Diff             @@
##              dev     #219      +/-   ##
==========================================
+ Coverage   92.73%   92.88%   +0.14%     
==========================================
  Files          11       11              
  Lines         303      309       +6     
==========================================
+ Hits          281      287       +6     
  Misses         22       22              
```
| Impacted Files | Coverage Δ |
|---|---|
| src/classifier.jl | 100.00% <100.00%> (ø) |
| src/core.jl | 94.87% <100.00%> (+0.13%) ⬆️ |
| src/regressor.jl | 100.00% <100.00%> (ø) |
| src/types.jl | 100.00% <100.00%> (ø) |


@pat-alt commented Apr 12, 2023

> Sorry, rather: the scitype declaration for the regressor is wrong. You need `Continuous` in the `target_scitype`, where you appear to have `Finite`.

You're right, thanks. Fixed that now, but I still get some broken tests (e.g. here). Is this expected?

@ablaom commented Apr 16, 2023

> Fixed that now, but I still get some broken tests (e.g. here). Is this expected?

Yes. Some of the broken tests appear because you are looking at the CPU tests, where some tests are excluded and marked as broken.

The remaining broken tests are explained in #87.

So no further action is needed on your part here. I will finish my review shortly, thanks.

@ablaom left a comment

I'm happy with this as is. Many thanks.

However, perhaps you would consider one additional enhancement, for consistency: the `MultitargetNeuralNetworkRegressor` accepts the target as a table, but not as a matrix. It seems this would not be difficult to fix (but I haven't checked).

If you'd rather proceed as we are, then I'll just open a new issue and reference this comment.

@pat-alt commented Apr 18, 2023

Thanks, good idea! I've made the necessary adjustments and updated the tests accordingly. I had to make one additional small modification to the `collate` function. I've also updated the docstring.
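For illustration, the enhancement makes something like the following possible (a sketch; the data is made up):

```julia
using MLJ, MLJFlux

X = rand(Float32, 100, 4)   # features as a matrix
Y = rand(Float32, 100, 2)   # two targets per observation, also as a matrix

model = MLJFlux.MultitargetNeuralNetworkRegressor()
mach = machine(model, X, Y)
fit!(mach)
```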

src/core.jl (outdated diff):

```diff
@@ -224,6 +224,7 @@ by `model.batch_size`.)

 """
 function collate(model, X, y)
+    y = y isa Matrix ? Tables.table(y) : y
```
@ablaom left a comment

I'm trying to understand why this fix was needed. The `nrows` function is supposed to work for matrices as well as tables:

```julia
julia> using MLJBase

julia> y = rand(2, 6);

julia> nrows(y)
2

julia> nrows(y')
6
```

And line 230 below should already take care of the conversion of `y` to a matrix, no?

@pat-alt replied
This was giving me an error previously, because the `nrows` method defined in the same file expects a table (or else it throws an `ArgumentError`). I've adjusted that and the tests are passing.

Review thread on src/regressor.jl (outdated, resolved).
```diff
@@ -141,6 +141,7 @@ function nrows(X)
     return length(cols[1])
 end
 nrows(y::AbstractVector) = length(y)
+nrows(X::AbstractMatrix) = size(X, 1)
```
@ablaom replied

Oh, I see. We're not using `nrows` from MLJBase.jl; I forgot. Thanks for humouring me with the explanation!

@ablaom left a comment

Looks good to go.

@ablaom merged commit 548ea65 into FluxML:dev on Apr 24, 2023.
Successfully merging this pull request may close: "Calling predict on matrix throws error" (#218).

3 participants