
Commit

address review comments
philip-essential committed Feb 1, 2025
1 parent 73f6acd commit 66339fa
Showing 6 changed files with 4 additions and 16 deletions.
MaxText/configs/base.yml (6 changes: 2 additions & 4 deletions)

@@ -120,6 +120,8 @@ logits_via_embedding: False
 normalize_embedding_logits: True # whether to normlize pre-softmax logits if logits_via_embedding is true
 logits_dot_in_fp32: False # whether to use fp32 in logits_dense or shared_embedding dot product for stability
 cast_logits_to_fp32: True # whether to cast the logits to fp32. The higher precision is generally beneficial, but it can vary slightly.
+float32_qk_product: False # in dot_product attention, whether to cast to fp32 the inputs to qk product
+float32_logits: False # in dot_product attention, whether to cast to fp32 the inputs to softmax

 # mixture of experts (moe)
 num_experts: 1

@@ -192,10 +194,6 @@ final_logits_soft_cap: 0.0
 use_post_attn_norm: False
 use_post_ffw_norm: False

-# In dot_product attention, whether to upcast the qk product and attention logits to fp32
-float32_qk_product: False
-float32_logits: False
-

 # Combine matmuls for QKV and MLP
 fused_qkv: False
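For reference, the two keys added near line 120 of base.yml control fp32 upcasting inside dot-product attention. The following is a minimal, illustrative JAX sketch of how such flags are typically wired in; it is not MaxText's actual attention implementation, and the function name and tensor shapes are assumptions.

import jax
import jax.numpy as jnp

def dot_product_attention(q, k, v, float32_qk_product=False, float32_logits=False):
    # q, k, v: hypothetical [batch, seq, heads, head_dim] activations, e.g. in bfloat16
    if float32_qk_product:
        # Cast the inputs to the qk product to fp32 for extra numerical headroom
        q = q.astype(jnp.float32)
        k = k.astype(jnp.float32)
    scale = q.shape[-1] ** -0.5
    logits = jnp.einsum("bqhd,bkhd->bhqk", q, k) * scale
    if float32_logits:
        # Cast the softmax inputs (the attention logits) to fp32 before normalizing
        logits = logits.astype(jnp.float32)
    weights = jax.nn.softmax(logits, axis=-1).astype(v.dtype)
    return jnp.einsum("bhqk,bkhd->bqhd", weights, v)

Both flags default to False in base.yml, trading a little numerical precision for lower memory and compute in the attention core.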
MaxText/configs/models/gemma-2b.yml (4 changes: 1 addition & 3 deletions)

@@ -24,6 +24,4 @@ mlp_activations: ["gelu","linear"]
 vocab_size: 256128
 decoder_block: "gemma"
 normalization_layer_epsilon: 1.e-06
-logits_via_embedding: True
-float32_qk_product: True
-float32_qk_logits: True
+logits_via_embedding: True
MaxText/configs/models/gemma-7b.yml (4 changes: 1 addition & 3 deletions)

@@ -24,6 +24,4 @@ mlp_activations: ["gelu","linear"]
 vocab_size: 256128
 decoder_block: "gemma"
 normalization_layer_epsilon: 1.e-06
-logits_via_embedding: True
-float32_qk_product: True
-float32_qk_logits: True
+logits_via_embedding: True
MaxText/configs/models/gemma2-27b.yml (2 changes: 0 additions & 2 deletions)

@@ -30,5 +30,3 @@ attn_logits_soft_cap: 50.0
 sliding_window_size: 4096
 use_post_attn_norm: True
 use_post_ffw_norm: True
-float32_qk_product: True
-float32_qk_logits: True
MaxText/configs/models/gemma2-2b.yml (2 changes: 0 additions & 2 deletions)

@@ -30,5 +30,3 @@ attn_logits_soft_cap: 50.0
 sliding_window_size: 4096
 use_post_attn_norm: True
 use_post_ffw_norm: True
-float32_qk_product: True
-float32_qk_logits: True
MaxText/configs/models/gemma2-9b.yml (2 changes: 0 additions & 2 deletions)

@@ -30,5 +30,3 @@ attn_logits_soft_cap: 50.0
 sliding_window_size: 4096
 use_post_attn_norm: True
 use_post_ffw_norm: True
-float32_qk_product: True
-float32_qk_logits: True
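With the per-model lines above removed, the gemma and gemma2 configs no longer set these keys themselves, so they pick up whatever base.yml defines (False for both, per the first hunk). Below is a rough sketch of that override behavior, assuming a simple "model config merged over base config" scheme rather than MaxText's actual config loader.

import yaml  # requires PyYAML

# Hypothetical merge: values from the model config override base.yml values.
with open("MaxText/configs/base.yml") as f:
    config = yaml.safe_load(f)
with open("MaxText/configs/models/gemma2-9b.yml") as f:
    config.update(yaml.safe_load(f))

# After this commit the gemma2-9b file no longer sets these keys,
# so both resolve to the base.yml defaults (False).
print(config["float32_qk_product"], config["float32_logits"])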
