Releases: JuliaGPU/Metal.jl

v1.5.0

08 Jan 10:38
dff150f

Metal v1.5.0

Diff since v1.4.2

Metal.jl 1.5 is a relatively minor release, with the most important change happening behind the scenes: GPUArrays.jl v11 has switched to KernelAbstractions.jl (#461).

There is also one (technically) breaking change: code_agx and @device_code_agx have been removed (#512) because of their heavy Python dependency and their conflicts with PythonCall.jl. This functionality did not support recent M-series GPUs anyway, so the removal is unlikely to affect many users.

Features

  • Improve performance of shared storage copies: #445
  • Add an is_m4 function: #498
  • #499

Bug fixes

Merged pull requests:

Closed issues:

  • KernelAbstractions: add Atomix back-end (#218)
  • @device_code_agx errors when Metal Shader Validation is enabled (#463)
  • fill broken after KA integration (#466)
  • Compilation to native code failed: NSError: Undefined symbols (#480)
  • ObjectiveC.Foundation.NSErrorInstance(ObjectiveC.id{ObjectiveC.Foundation.NSError}(0x000000014cb8bd90)) (#487)
  • phi-related IR downgrade issue (#488)
  • Circular dependency when precompiling (#495)
  • Bad interaction between PyCall and Metal (#500)
  • Add github actions CI for linux, windows and non-functional macOS to ensure that precompilation and loading works (#508)

v1.4.2

17 Oct 16:19

Metal v1.4.2

Diff since v1.4.1

Merged pull requests:

Closed issues:

  • Relax package requirements (#22)
  • [windows:] Metal does not precompile anymore when installation not functional (#457)
  • [MacOS:] Metal.functional() wrongly returns true despite no GPUs available (#458)

v1.4.1

12 Oct 07:18

Metal v1.4.1

Diff since v1.4.0

Merged pull requests:

Closed issues:

  • Don't run benchmarks on the master branch? (#449)
  • unsafe_wrap(Array, ...) of a view does not preserve offset information (#451)
  • Metal does not load any more without error when installation not functional (#453)

v1.4.0

02 Oct 12:31
ff7c7eb

Metal v1.4.0

Diff since v1.3.0

Merged pull requests:

Closed issues:

  • Port the opportunistic synchronization from CUDA.jl (#317)
  • Control flow-related miscompilation: (#401)
  • More sporadic 1.11 hangs (#412)
  • Support for LinearAlgebra.kron (#422)
  • Can't use gemm! methods with Metal (#423)
  • Error for thread/group size with different integer types (#424)
  • README example broken (#427)
  • Intermittent load_store_tg test failure (#428)

v1.3.0

23 Aug 11:55
28576b3

Metal v1.3.0

Diff since v1.2.0

Merged pull requests:

Closed issues:

  • Audit exports/public symbols (#359)
  • Compilation failure on 1.11 (#370)
  • MTLBinaryArchive (#387)
  • Metal.code_agx() failing in MacOS 15 Beta 3 (#388)
  • Test for min / max broadcasting issue (#389)
  • Type piracy (#396)
  • Potentially unused code in gpuarrays.jl (#397)
  • Shared vs SharedStorage in examples/unified_memory (#405)
  • Unsupported call to an unknown function when calling Distributions (#406)

v1.2.0

08 Jul 11:09
d8ee9d1

Metal v1.2.0

Diff since v1.1.0

Merged pull requests:

Closed issues:

  • Tests sporadically timing out on 1.11 (#329)
  • ReshapedArray indexing broken because of Int128 operation (#332)
  • KernelAbstractions copyto! typo (#336)
  • Segmentation Faults (#338)
  • Port accumulate! and findall from CUDA.jl (#348)
  • Tests failing with GPUCompiler v0.26.5 and LLVM v7.1 (#350)
  • downgrades LLVM (#355)
  • sqrt(::Complex) unsupported due to conversion exceptions (#364)

v1.1.0

10 Apr 14:31
1ebc4c9

Metal v1.1.0

Diff since v1.0.0

Merged pull requests:

Closed issues:

  • Validation-related back-end crash on macOS Ventura (#34)
  • slow broadcast copy in 2D (#41)
  • Poor performance of mapreduce (#46)
  • Multiplication with SubArrays (#47)
  • Add support to creating MtlArray using a memory allocated by Array (#62)
  • Improve use of unified memory (#86)
  • Use Autoreleasepools with Metal (#103)
  • Unknown RFLT tag generated by macOS 13 Metal compiler (#167)
  • mapreduce allocates a lot on the CPU (#211)
  • Legalization errors with vectorized code (#257)
  • Compilation Failure due to undefined symbols (#276)
  • resize!, append! not defined (#277)
  • tag new version (#278)
  • Panic during profiling tests on 14.4 beta (#281)
  • M3 backend cannot handle atomics with complicated pointer conversions (#282)
  • Int128 does not compile (#287)
  • Two suspicious mtl-related behaviours (#289)
  • LU factorization: add allowsingular keyword argument (#299)
  • Autorelease changes lead to use after free with errors (#301)
  • Reductions don't work on Shared Arrays (#312)

v1.0.0

30 Jan 15:16
f6df13d

Metal v1.0.0

Diff since v0.5.1

Merged pull requests:

  • Matrix batches (#158) (@tgymnich)
  • Add 1.10 CI. (#256) (@maleadt)
  • Update manifest (#258) (@github-actions[bot])
  • CompatHelper: bump compat for GPUCompiler to 0.25, (keep existing compat) (#259) (@github-actions[bot])
  • Bump actions/checkout from 3 to 4 (#260) (@dependabot[bot])
  • Update manifest (#261) (@github-actions[bot])
  • CompatHelper: bump compat for CEnum to 0.5, (keep existing compat) (#262) (@github-actions[bot])
  • Update manifest (#263) (@github-actions[bot])
  • CompatHelper: add new compat entry for Artifacts at version 1, (keep existing compat) (#264) (@github-actions[bot])
  • Reduce launch overhead by generating code to encode arguments. (#265) (@maleadt)
  • Remove unused function argument (#266) (@tgymnich)
  • Introduce application tracing profiler (#267) (@maleadt)
  • Remove content(::MTLBuffer), use convert instead. (#268) (@maleadt)
  • Allow more kwargs syntax with kernel launches (#269) (@maleadt)
  • Don't re-use the IO object when shelling out to Python. (#271) (@maleadt)
  • Preserve storage mode when broadcasting. (#273) (@maleadt)

Closed issues:

  • Support for macOS Sonoma (#201)
  • Error with Julia 1.10 (#274)

v0.5.1

13 Sep 14:34
335704e

Metal v0.5.1

Diff since v0.5.0

Merged pull requests:

  • MPSMatrix improvements (#157) (@tgymnich)
  • Update manifest (#221) (@github-actions[bot])
  • Update manifest (#222) (@github-actions[bot])
  • Update manifest (#224) (@github-actions[bot])
  • Update manifest (#227) (@github-actions[bot])
  • CompatHelper: bump compat for ObjectiveC to 1, (keep existing compat) (#228) (@github-actions[bot])
  • Update manifest (#230) (@github-actions[bot])
  • Fix argument types in sincos (#232) (@fjebaker)
  • Update manifest (#233) (@github-actions[bot])
  • Improve docs (#235) (@christiangnrd)
  • Remove linear algebra section of MPS docs (#237) (@christiangnrd)
  • CompatHelper: bump compat for GPUCompiler to 0.22, (keep existing compat) (#238) (@github-actions[bot])
  • Port openlibm log1pf as log1p (#239) (@sotlampr)
  • Port openlibm erf (#240) (@tgymnich)
  • Remove 1.6-era override mechanism. (#241) (@maleadt)
  • CompatHelper: add new compat entry for Requires at version 1, (keep existing compat) (#242) (@github-actions[bot])
  • Update manifest (#243) (@github-actions[bot])
  • enable dependabot for GitHub actions (#244) (@ranocha)
  • Bump actions/checkout from 2 to 3 (#245) (@dependabot[bot])
  • Bump peter-evans/create-pull-request from 3 to 5 (#246) (@dependabot[bot])
  • Show METAL_CAPTURE_ENABLED in Metal.versioninfo() when the environment variable is set (#248) (@christiangnrd)
  • Update manifest (#249) (@github-actions[bot])
  • Adapt to GPUCompiler.jl, and other small updates. (#250) (@maleadt)
  • Switch to GPUArrays buffer management. (#251) (@maleadt)
  • Update manifest (#252) (@github-actions[bot])
  • Update manifest (#253) (@github-actions[bot])
  • Bump GPUCompiler (#255) (@maleadt)

Closed issues:

  • Random access indexing into MtlArray views cause scalar indexing (#149)
  • Q: How to debug kernels - KA.@print? (#223)
  • Crash during MTLDispatchListApply (#225)
  • Unable to compile trig functions through ForwardDiff (#229)
  • symbol multiply defined! Bug/crash on Julia master, fine on 1.10 (#231)
  • log1p fails on MtlArray{Float32} (#234)
  • When precompiling, UndefVarError: CompilerConfig not defined (#247)

v0.5.0

01 Jul 15:06
9a72b9c

Metal v0.5.0

Diff since v0.4.1

Metal.jl 0.5 is a feature release, bringing initial support for atomic operations (#168).
Low-level atomics that mimic Metal C are supported (atomic_store_explicit,
atomic_load_explicit, etc), as well as a higher-level Metal.@atomic that can be used to
update array values similar to how CUDA.jl's @atomic works. This uses native atomics when
supported, and falls back to a compare-exchange loop otherwise.
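
To illustrate the compare-exchange fallback mentioned above, here is a minimal CPU-side sketch in plain Julia. It is not Metal.jl's implementation and does not run on the GPU: it uses Base's Threads.Atomic and Threads.atomic_cas!, and the helper name cas_update! is hypothetical. The idea is the same, though: read the current value, compute the new one, and retry until no other thread has changed the value in between.

```julia
# CPU-side sketch of a compare-exchange (CAS) loop. NOT Metal.jl's code;
# `cas_update!` is a hypothetical helper name used only for illustration.
function cas_update!(op, x::Threads.Atomic{T}, val::T) where {T}
    old = x[]                                    # atomic load of the current value
    while true
        new = convert(T, op(old, val))           # value we would like to store
        seen = Threads.atomic_cas!(x, old, new)  # stores `new` only if x still holds `old`
        seen === old && return new               # the swap succeeded
        old = seen                               # another thread won the race: retry
    end
end

# Addition through the CAS loop (Base also provides a native atomic_add!).
total = Threads.Atomic{Int}(0)
Threads.@threads for i in 1:1000
    cas_update!(+, total, 1)
end

# Multiplication has no native atomic in Base, so a CAS loop is the natural choice.
product = Threads.Atomic{Float64}(1.0)
Threads.@threads for i in 1:10
    cas_update!(*, product, 2.0)
end

println(total[])    # 1000
println(product[])  # 1024.0
```

The `===` comparison checks the bit pattern, so the loop also terminates for floating-point values such as NaN, where `==` would never report a match.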

Minor changes include an update for the @device_code_agx disassembler, the addition of a
type variable to MtlArray encoding the storage mode (#194), and support for MPSVector
(#199) which should accelerate matrix/vector multiplications.

Also note that Metal.jl now disallows the construction of Float64 arrays, as these are not
supported by the Metal libraries.

Closed issues:

  • Support for atomics (#79)
  • Make MtlArray storage mode a type parameter (#190)
  • Long stacktrace when trying to create Float64 rand arrays (#205)
  • allowscalar equivalent for Metal.jl (#206)
  • Define map! ? (#219)

Merged pull requests:

  • Implement atomics using compiler intrinsics (#168) (@maleadt)
  • Parameterize MtlArray storage mode (#194) (@christiangnrd)
  • Implement MPSVector (#199) (@tgymnich)
  • Update manifest (#200) (@github-actions[bot])
  • Add Metal 3.1 to MTLLanguageVersion (#202) (@christiangnrd)
  • Update manifest (#203) (@github-actions[bot])
  • CompatHelper: bump compat for GPUCompiler to 0.21, (keep existing compat) (#204) (@github-actions[bot])
  • Update manifest (#207) (@github-actions[bot])
  • Disallow Float64 arrays entirely. (#209) (@maleadt)
  • Adapt to LLVM.jl 6. (#213) (@maleadt)
  • Update manifest (#215) (@github-actions[bot])
  • Bump disassembler. (#216) (@maleadt)