v0.20.0
What's Changed
- [docs] Update javadoc link by @xyang16 in #254
- Adds restart feature to management console by @c007456 in #255
- [docs] Update links by @xyang16 in #258
- [Docker][DeepSpeed] use accelerate to reduce memory cost by @lanking520 in #261
- [Docker] bump up torch version by @lanking520 in #264
- [docker] Fix pytorch-cu113 docker file by @frankfliu in #265
- [Docker] support integration test on release images by @lanking520 in #266
- [Docker] make aarch64 the default in pytorch by @lanking520 in #267
- Add management console document by @c007456 in #262
- [Docker] fix arg in code by @lanking520 in #268
- [aarch64] Remove unsupported engine by @frankfliu in #269
- [Docker] mkdirs for Inferentia to fix a bug by @lanking520 in #270
- [docker] Fixed pytorch-cu113 docker build by @frankfliu in #271
- [Docker] change copy script location by @lanking520 in #272
- [Docker] upgrade deepspeed and transformers by @lanking520 in #273
- Fix the problem that the image returned for input text cannot be parsed by @c007456 in #276
- update huggingface version by @lanking520 in #277
- [serving] clean cache directory on model unload by @frankfliu in #278
- [python] Fixes logging issue by @frankfliu in #279
- [Docker] switch base image to devel by @lanking520 in #280
- [Docker] Add INT8 support on Large model by @lanking520 in #281
- apply CVE patches into the docker by @lanking520 in #284
- [Docker] minor fix by @lanking520 in #285
- Clean up ModelServerTest with request helper function by @zachgk in #283
- [Docker] add label patching for DLC by @lanking520 in #286
- [serving] Fixes install XGBoost engine bug by @frankfliu in #287
- [serving] Improve logging message by @frankfliu in #289
- [serving] Loads model from root of model_store directory by @frankfliu in #288
- [python] Adds large model inference support with MPI mode by @frankfliu in #291
- [python] Adds built-in DeepSpeed handler by @frankfliu in #292
- [python] Remove DsEngineProvider alias by @frankfliu in #296
- [Docker] add tagging labels by @lanking520 in #297
- [doc] Add some DJL Serving docs by @xyang16 in #299
- Reorganize docs by @zachgk in #298
- [doc] Update DJL Serving doc by @xyang16 in #300
- [Docker] rename parallelformers engine name to transformers by @lanking520 in #301
- [Integration] make 6 min timeout per model by @lanking520 in #304
- [Docker] Fix Accelerate version by @lanking520 in #306
- [serving] Sets default ONNXRuntime OMP threads to 1 by @frankfliu in #303
- start supporting multi-gpu in python mode by @lanking520 in #302
- Create an ensemble workflow by @zachgk in #282
- [Docker][DLC] upgrade for next release by @lanking520 in #307
- [ci] Upgrade deprecated github actions by @frankfliu in #309
- [benchmark] Adds HuggingFace model zoo to djl-bench by @frankfliu in #308
- [doc] Add DJL Serving packaging doc by @xyang16 in #310
- [serving] Adds huggingface tokenizer as default dependency by @frankfliu in #313
- [djl-bench] Update README to upgrade Java version to 11 by @frankfliu in #314
- fix typo in Pymodel mpi log by @siddvenk in #315
- [Docker][DLC] add deepspeed 0.7.5 by @lanking520 in #316
- [DLC] update docker with s5cmd by @lanking520 in #317
- fix shell script by @lanking520 in #319
- fix tar file unzip location by @lanking520 in #320
- [doc] Update serving docs by @xyang16 in #312
- [central] workaround webpack-cli 5.0.0 build issue by @frankfliu in #324
- [benchmark] Make warmup iteration configurable by @frankfliu in #323
- [benchmark] Update benchmark README by @frankfliu in #325
- support download model from s3 by @lanking520 in #322
- [doc] Update configurations document by @frankfliu in #327
- [Docker] build dlc telemetry by @lanking520 in #326
- fix s5cmd by @lanking520 in #328
- Upgrade dependencies version by @frankfliu in #331
- [serving] Uses Engine.getDjlVersion() for consistency by @frankfliu in #330
- [serving] Refactor NeuronUtils by @frankfliu in #333
- add telemetry collection testing by @lanking520 in #332
- [Docker][G5] add test artifacts by @lanking520 in #329
- fix the telemetry inaccessibility issues by @lanking520 in #335
- fall back to imds v1 by @lanking520 in #338
- G5 test patch fixes by @lanking520 in #336
- [G5][Docker] add gptj model by @lanking520 in #339
- [Docker] upgrade cu117 by @lanking520 in #340
- [Docker][G5] add bloom 7b1 support by @lanking520 in #341
- update telemetry to follow DLC standard by @lanking520 in #343
- [Docker] add paddlepaddle docker build script by @lanking520 in #342
- fix on the tag by @lanking520 in #344
- [Docker] fix regex by @lanking520 in #345
- final regex fix by @lanking520 in #346
- [HF] add more information to HF Accelerate by @lanking520 in #318
- [ci] Upgrade dependencies version by @frankfliu in #347
- [docs] Update serving configurations document by @frankfliu in #348
- [Docker][G5] add huggingface tests by @lanking520 in #349
- [Handler] fix python grammar by @lanking520 in #350
- [Python] fix potential None on TP degree by @lanking520 in #351
- [serving] Fixes tensor_parallel_degree parsing bug by @frankfliu in #352
- fix telemetry by @lanking520 in #353
- fix mapping issues by @lanking520 in #354
- [G5] final fixes by @lanking520 in #355
- [serving] Support tensor_parallel_degree for commandline by @frankfliu in #356
- use default serve command by @lanking520 in #357
- [Docker] upgrade inferentia docker image by @lanking520 in #337
- [python] Fixes assert warning by @frankfliu in #360
- [Docker] unset min length by @lanking520 in #358
- feat: update deepspeed version to hosted deepspeed wheel by @tosterberg in #361
- [Inf] update model by @lanking520 in #362
- [serving] Avoid using special config.properties for DeepSpeed by @frankfliu in #363
- [serving] Remove unused setLoadOnDevice() by @frankfliu in #364
- fix on the test failure by @lanking520 in #365
- [WIP] Updating deepspeed handler for more models by @siddvenk in #359
- [DeepSpeed] Fix datatype and allow checkpoint loading by @lanking520 in #366
- [DeepSpeed] fix handler usage by @lanking520 in #367
- fix on pipeline by @lanking520 in #368
- [DeepSpeed] fix the get class error by @lanking520 in #369
- [CI] support faster failing by @lanking520 in #370
- Switch order of ds init_inference and pipeline construction to save m… by @siddvenk in #373
- [Docker] fix a few things by @lanking520 in #372
- [Docker] add opt 13b option by @lanking520 in #374
- Update SD handler to work with custom wheel by @siddvenk in #375
- [python] Add dynamic batching feature to python engine by @frankfliu in #371
- [python] Fixes python 3.8 error by @frankfliu in #376
- [python] use jitscript model if available by @frankfliu in #379
- [docker] Adds non-root user to docker by @frankfliu in #378
- Update sd handler to return image in request, fix some default values… by @siddvenk in #377
- [docker] Update PyTorch/TensorFlow version by @frankfliu in #380
- [docker] Fixes aarch64 docker file by @frankfliu in #381
- fix action test for stable-diffusion by @siddvenk in #382
- [python] Adds .npz input support for Python engine by @frankfliu in #383
- [serving] Adds a few more integration tests by @frankfliu in #384
- add status code field by @lanking520 in #385
- remove bracket by @lanking520 in #386
- fix issue in sd handler with data type by @siddvenk in #387
- [ci] Avoid run integration test as root by @frankfliu in #388
- [tests] Reformat python code by @frankfliu in #389
- Fix fp32 issues with DS fork wheel for stable diffusion, fix llm tests by @siddvenk in #390
- Import some dependencies only for stable diffusion test by @siddvenk in #391
- [serving] Make yaml support optional by @frankfliu in #393
- [ci] Rename resnet18 model resource url by @frankfliu in #395
- [Backport] use torch wheel that has patched fixes by @lanking520 in #396
New Contributors
- @tosterberg made their first contribution in #361
Full Changelog: v0.19.0...v0.20.0