Commit

Add release notes
pemistahl committed Oct 29, 2024
1 parent b8774f6 commit 6f4bc15
Showing 2 changed files with 30 additions and 2 deletions.
6 changes: 4 additions & 2 deletions README.md
@@ -62,14 +62,16 @@ Because of that, the language models were then stored in NumPy arrays instead of
dictionaries. Memory consumption reduced to approximately 800 MB but CPU
performance dropped significantly. Both approaches were not satisfying.

-Starting from version 2.0.0, the pure Python implementation was replaced with
+Starting from version 2.0.0, the pure Python implementation is complemented by
compiled Python bindings to the native
[Rust implementation](https://github.com/pemistahl/lingua-rs) of *Lingua*.
This decision has led to both quick performance and a small memory
footprint of less than 1 GB. The pure Python implementation is still available
in a [separate branch](https://github.com/pemistahl/lingua-py/tree/pure-python-impl)
in this repository and will be kept up-to-date in subsequent 1.* releases.
-Both 1.* and 2.* versions will remain available on the Python package index (PyPI).
+There are environments that do not support native Python extensions such as
+[Juno](https://juno.sh/), so a pure Python implementation is still useful.
+Both 1.* and 2.* versions are available on the Python package index (PyPI).

## 4. Which languages are supported?

26 changes: 26 additions & 0 deletions RELEASE_NOTES.md
@@ -1,3 +1,29 @@
## Lingua 1.4.0 (released on 29 Oct 2024)

### Features

- This release introduces an absolute confidence metric based on unique and most
common ngrams for each supported language. It allows you to build
a language detector from a single language only. Such a detector serves as
a binary classifier, telling you whether some text is written in your selected
language or not. (#235)
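
To make the idea of a single-language binary classifier concrete, here is a toy sketch in plain Python. It is not Lingua's actual algorithm or API: the trigram set, the function names, and the 0.2 threshold are all hypothetical and chosen only for illustration. The score is the share of a text's character trigrams that appear in a small set treated as typical of English.

```python
# Toy illustration only: NOT Lingua's implementation.
# ENGLISH_TRIGRAMS is a hypothetical, hand-picked set of common English trigrams.
ENGLISH_TRIGRAMS = frozenset(
    {"the", "ing", "and", "ion", "ent", "her", "tha", "hat", "thi", "for"}
)


def char_ngrams(text, n=3):
    """Return all character n-grams taken from each word of the text."""
    grams = []
    for word in text.lower().split():
        # Keep letters only so punctuation does not produce spurious n-grams.
        letters = "".join(ch for ch in word if ch.isalpha())
        grams.extend(letters[i:i + n] for i in range(len(letters) - n + 1))
    return grams


def absolute_confidence(text, typical_ngrams=ENGLISH_TRIGRAMS):
    """Share of the text's trigrams found in the typical set (0.0 to 1.0)."""
    grams = char_ngrams(text)
    if not grams:
        return 0.0
    hits = sum(1 for gram in grams if gram in typical_ngrams)
    return hits / len(grams)


def is_language(text, threshold=0.2):
    """Binary decision: does the text look like the selected language?"""
    # The threshold is an arbitrary assumption for this sketch.
    return absolute_confidence(text) >= threshold
```

Because the score depends only on the selected language's own ngrams, no second language is needed for comparison, which is what makes a one-language detector possible in principle.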

### Improvements

- The new absolute confidence metric helps to improve accuracy in low accuracy mode.
The mean of average detection accuracy (single words, word pairs and sentences combined)
increases from 77% to 80%.

### Bug Fixes

- The tokenization of texts written in the Devanagari alphabet was flawed.
This has been fixed, leading to better detection accuracy for Hindi and Marathi.

### Compatibility

- The newest Python 3.13 is now officially supported.
- Support for Python 3.8 and 3.9 has been dropped. The lowest supported Python version is 3.10 now.

## Lingua 1.3.5 (released on 03 Apr 2024)

### Improvements
