deathknight0 committed · verified
Commit 3632016 · 1 Parent(s): a23ea25

Update config.json

This PR addresses a bug that prevented Flash Attention 2 (FA2) from running with granite-speech-8b under HF transformers. The same bug was not present in the 2b version.

Upon closer inspection, the line `"_attn_implementation_autoset": true` was missing from config.json (though it was present in the 2b version). After adding this line, FA2 appears to be functional again.
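For reference, a minimal sketch of the loading path this change fixes, i.e. requesting FA2 at load time. The repo id, dtype, and device settings below are illustrative assumptions, not taken from this PR:

```python
# Sketch: load the model with Flash Attention 2 requested explicitly.
# This is the code path that failed before the config.json fix.
import torch
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq

model_id = "ibm-granite/granite-speech-3.3-8b"  # assumed repo id for the 8b checkpoint

model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,                # FA2 requires fp16/bf16 weights
    attn_implementation="flash_attention_2",   # the implementation the missing flag broke
    device_map="auto",
)
processor = AutoProcessor.from_pretrained(model_id)
```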

Files changed (1)
  1. config.json +1 -0
config.json CHANGED
@@ -23,6 +23,7 @@
   "initializer_range": 0.02,
   "model_type": "granite_speech",
   "projector_config": {
+    "_attn_implementation_autoset": true,
     "attention_probs_dropout_prob": 0.1,
     "cross_attention_frequency": 1,
     "encoder_hidden_size": 1024,