Just a few questions. By conformity, do you mean adding methods of the getobs and nobs functions for a tokenized type? Since the tokenizer, as the reference describes it, essentially splits text on spaces or into individual characters, would this be used for preprocessing text-based datasets? Would the new module and tests go into /src/methods and /test/methods respectively?
No need to self-assign the issue. Just submit a PR when ready.
By "conform", I mean defining new getobs/nobs methods on the tokenizer type that call the underlying splitting methods. The functions should go in src/datasets/transformations.jl.
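For anyone picking this up, here is a minimal sketch of one way to read that description. It is not the maintainers' design: the `Tokenized` wrapper and the `chartokens`/`wordtokens` helpers are hypothetical names, and `getobs`/`nobs` are declared locally as stand-ins so the block runs without any package. In an actual PR they would presumably extend `LearnBase.getobs` and `LearnBase.nobs` instead.

```julia
# Local stand-ins for LearnBase.getobs / LearnBase.nobs (assumption:
# a real implementation would add methods to the LearnBase functions).
function getobs end
function nobs end

# Hypothetical wrapper: a collection of strings plus a splitting function.
struct Tokenized{F, D<:AbstractVector{<:AbstractString}}
    tokenize::F
    data::D
end

# Hypothetical splitting methods.
chartokens(s::AbstractString) = collect(s)  # one token per character
wordtokens(s::AbstractString) = split(s)    # split on whitespace

# One observation per underlying string.
nobs(t::Tokenized) = length(t.data)

# Tokenize lazily, when an observation is requested.
getobs(t::Tokenized, i::Integer) = t.tokenize(t.data[i])
getobs(t::Tokenized, idxs::AbstractVector{<:Integer}) = [getobs(t, i) for i in idxs]

docs = ["hello world", "hi"]
words = Tokenized(wordtokens, docs)
chars = Tokenized(chartokens, docs)
```

With this layout, `getobs(words, 1)` yields the word tokens of the first string and `getobs(chars, 2)` yields the characters of the second, while `nobs` reports the number of strings in either case. Splitting inside `getobs` keeps the raw text untouched until it is indexed. Whether tokenization should be lazy like this or done eagerly is a design choice the issue leaves open.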
Add simple character and word level tokenizers that conform to the LearnBase.jl getobs/nobs interfaces.