xuan-luo committed f63473a (verified) · 1 parent: 9c0e85c

Update README.md

Files changed (1): README.md (+0 −2)
README.md CHANGED
@@ -15,8 +15,6 @@ library_name: transformers
 
 The implementation of the paper Differential Layer Skipping in Large Language Models.
 
-## Model Details
-
 ### Model Description
 
 DiffSkip-Llama-3-8B-Instruct is an enhanced version of the Llama-3-8B-Instruct model, incorporating the Differential Layer Skipping (DiffSkip) method to enable dynamic Feed-Forward Network (FFN) skipping during text generation. This approach leverages the self-attention input-output difference as a routing signal, allowing tokens to bypass FFN blocks based on computational needs.
 
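The model description above can be illustrated with a minimal PyTorch sketch of a decoder layer that skips its FFN per token. This is an assumption-laden toy, not the repository's actual `custom_code`: the sigmoid router, the hard threshold, and all names (`DiffSkipLayerSketch`, `router`, `threshold`) are hypothetical; only the core idea — using the self-attention input-output difference as the routing signal for FFN skipping — comes from the description.

```python
import torch
import torch.nn as nn

class DiffSkipLayerSketch(nn.Module):
    """Toy decoder layer sketching DiffSkip-style FFN skipping.

    The routing signal is the difference between the self-attention
    input and output (i.e. the attention residual). A small linear
    router scores it per token; tokens whose gate falls below a
    threshold bypass the FFN block entirely. Details are assumptions.
    """

    def __init__(self, d_model=64, n_heads=4, d_ff=256, threshold=0.5):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ffn = nn.Sequential(
            nn.Linear(d_model, d_ff), nn.GELU(), nn.Linear(d_ff, d_model)
        )
        self.router = nn.Linear(d_model, 1)  # scores the attention residual
        self.threshold = threshold           # hypothetical skip cutoff

    def forward(self, x):
        # Self-attention; attn_out is exactly the input-output difference
        # of the residual branch (h - x).
        attn_out, _ = self.attn(x, x, x)
        h = x + attn_out
        # Route on the attention input-output difference, per token.
        gate = torch.sigmoid(self.router(attn_out))   # (B, T, 1)
        keep = (gate > self.threshold).float()        # 1 -> run FFN, 0 -> skip
        # Skipped tokens pass through unchanged; kept tokens get the FFN.
        return h + keep * gate * self.ffn(h)
```

In a real batched implementation the skipped tokens would be gathered out before the FFN matmul to actually save compute; the masked multiply here only mimics the routing decision.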