Releases · gretelai/gretel-synthetics

10 May 12:15

johntmyers

v0.8.0

b3a769e

RTFD!

📖 Module docs now available at https://gretel-synthetics.readthedocs.io

🚧 Minor updates to internals to support better documentation


30 Apr 19:36

zredlined

v0.7.1

8def903

Colaboratory support

📚 Tutorial and doc improvements

Uses the installed TensorFlow library by default (Colab ships a TensorFlow version optimized for TPU)
Optionally, install a pinned version of TensorFlow with pip install gretel-synthetics[tf]


30 Apr 17:31

zredlined

v0.7.0

76ac70a

Improvements and Fixes

👍 Improvements

Calculate model perplexity per training epoch (metric for synthetic data set quality)
Added progress bar for SentencePiece tokenizer (can take a while on large datasets)
Cleaned up logging
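
As a rough illustration of the per-epoch perplexity metric mentioned above, the sketch below assumes perplexity is derived from the mean per-token cross-entropy loss (in nats) as exp(loss), which is the standard definition. The function names and the loss values are hypothetical and are not taken from the gretel-synthetics code base.

```python
import math


def perplexity(mean_cross_entropy: float) -> float:
    """Standard definition: exp of the mean per-token cross-entropy (nats)."""
    return math.exp(mean_cross_entropy)


def perplexity_per_epoch(epoch_losses):
    """Map a list of per-epoch mean losses to per-epoch perplexities."""
    return [perplexity(loss) for loss in epoch_losses]


# Hypothetical training history: one mean loss value per epoch.
losses = [2.0, 1.5, 1.0]
history = perplexity_per_epoch(losses)

for epoch, ppl in enumerate(history, start=1):
    print(f"epoch {epoch}: perplexity={ppl:.3f}")
```

Lower perplexity means the model is less "surprised" by the training text, so a falling value across epochs is what one would hope to see.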

📚 Tutorial and doc improvements

Automatically save model parameters and training history to model directory
Use the save_all_checkpoints config option to save either only the best checkpoint or all checkpoints (saving only the best conserves disk space)


21 Apr 00:31

zredlined

v0.6.1

8ce0df4

Improvements and fixes

👍 Improvements

Support CRLF newlines in training datasets
Commas and newlines are treated by the tokenizer as user-defined symbols
Increased the default vocabulary size from 200 to 15000, which increases the number of successful record validations in most test sets
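
To make the CRLF item above concrete, here is a minimal sketch of one way a training file with Windows-style line endings could be normalized before tokenization. It is an assumption about the general technique, not the library's actual implementation, and normalize_newlines is a hypothetical helper name.

```python
def normalize_newlines(text: str) -> str:
    """Convert CRLF line endings to LF so each record ends with a single newline."""
    return text.replace("\r\n", "\n")


# Hypothetical CSV-like training data with CRLF line endings.
raw = "id,name\r\n1,alice\r\n2,bob\r\n"
clean = normalize_newlines(raw)

# Each record is now one line, with no stray carriage returns.
records = clean.splitlines()
print(records)
```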

📚 Tutorial and doc improvements

Specify max_lines in the configuration file instead of max_chars (more intuitive)


24 Mar 16:33

zredlined

v0.6.0

15da6d7

Sentencepiece Tokenization

Added SentencePiece tokenization (https://github.com/google/sentencepiece) to allow fixed vocabulary sizes and character- or token-based training.


02 Mar 16:26

johntmyers

v0.5.0

1ce0a47

Hello world!

Initial release of Gretel's synthetic data generation project.

