sagorsarker committed
Commit c2604aa · verified · 1 Parent(s): c905252

Update README.md

Files changed (1): README.md +22 -0
README.md CHANGED
@@ -1,3 +1,25 @@
 ---
 license: apache-2.0
+datasets:
+- togethercomputer/RedPajama-Data-V2
+- uonlp/CulturaX
+- wikipedia
+language:
+- en
+- bn
+pipeline_tag: text-generation
 ---
+
+# TituLM-1B-ENBN-V1
+TituLM-1B-ENBN-V1 is a large language model trained to generate and understand English and Bangla text. Built on a decoder-style transformer architecture, it has been extensively trained on a dataset comprising __(will disclose later)__ billion Bangla tokens. This model is part of an iterative series of bilingual LLMs trained and released by Hishab.
+
+## Training
+The training process was managed using MosaicML's llm-foundry framework. Throughout the training phase, TituLM-1B-ENBN-V1 underwent a total of 42 iterations, allowing for iterative refinement and optimization. Notable training configs:
+
+- n_heads: 16
+- n_layers: 24
+- max_sequence_length: 2048
+- vocab_size: 72000
+- attn_impl: flash
+- Trained on 8 H100 GPUs on GCP
+
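The config values listed in the diff match the shape of an llm-foundry/MPT-style model config. A hypothetical sketch of how they might appear in an llm-foundry training YAML — the `mpt_causal_lm` name and the `d_model` value are assumptions for illustration, not stated in the source:

```yaml
# Hypothetical llm-foundry-style model config reflecting the values above.
model:
  name: mpt_causal_lm
  d_model: 2048        # not stated in the source; illustrative only
  n_heads: 16
  n_layers: 24
  max_seq_len: 2048
  vocab_size: 72000
  attn_config:
    attn_impl: flash
```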
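As a rough sanity check on the "1B" in the model name, a parameter count can be estimated from the configs listed above. Since `d_model` is not given in the source, 2048 is an assumed value chosen purely for illustration:

```python
# Rough parameter-count estimate for a decoder-only transformer,
# using the configs listed above. d_model is NOT stated in the
# source; 2048 is an assumed, illustrative value.
n_layers = 24
vocab_size = 72000
d_model = 2048  # assumption

# Per block: attention projections (~4*d^2) + 4x-expansion MLP (~8*d^2)
per_block = 12 * d_model ** 2
embedding = vocab_size * d_model  # token embedding table
total = n_layers * per_block + embedding
print(f"~{total / 1e9:.2f}B parameters")  # prints ~1.36B parameters
```

Under these assumptions the estimate lands in the ~1.3B range, broadly consistent with the 1B label; the exact figure depends on the real hidden size and whether input/output embeddings are tied.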