sagorsarker committed (verified)
Commit e83a1c6 · 1 Parent(s): 00df691

Update README.md

Files changed (1):
  1. README.md +3 -2
README.md CHANGED
@@ -13,8 +13,8 @@ pipeline_tag: text-generation
  # TituLM-1B-ENBN-V1
  TituLM-1B-ENBN-V1 is a large language model specifically trained for generating and understanding English and Bangla text. Utilizing a decoder-style transformer architecture, this model has been extensively trained on a dataset comprising __(will disclose later)__ billion Bangla and English tokens. This model is part of Hishab's iterative effort to train and release bilingual LLMs.
 
- ## Training
- The training process was managed using the robust framework provided by MosaicML's llm-foundry repository. Throughout the training phase, titulm-1b-bn-v1 underwent a total of 42 iterations, allowing for iterative refinements and optimization. Notable training configs:
+ The training process was managed using the robust framework provided by MosaicML's [llm-foundry](https://github.com/mosaicml/llm-foundry) repository. Throughout the training phase, titulm-1b-bn-v1 underwent a total of 59 iterations, allowing for iterative refinements and optimization.
+ Notable training configs:
 
  - n_heads: 16
  - n_layers: 24
@@ -23,6 +23,7 @@ The training process was managed using the robust framework provided by MosaicML
  - attn_impl: flash
  - Trained on 8 H100 GPUs on GCP
 
+
  ## Datasets
 
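For reference, the model card's `pipeline_tag: text-generation` implies standard Hugging Face generation usage. The sketch below is illustrative only and not part of this commit: the repository id `hishab/titulm-1b-enbn-v1`, the dtype choice, and the `trust_remote_code=True` flag (typical for llm-foundry/MPT-style checkpoints) are assumptions, not details confirmed by this README.

```python
# Minimal usage sketch (assumptions noted above): load the checkpoint and
# generate a short continuation for an English or Bangla prompt.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hishab/titulm-1b-enbn-v1"  # hypothetical repository id

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,   # H100-friendly dtype; float32 also works on CPU
    trust_remote_code=True,       # likely needed if the repo ships custom MPT model code
)

prompt = "Dhaka is the capital of"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```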