Update README.md
# TituLM-1B-ENBN-V1
TituLM-1B-ENBN-V1 is a large language model trained specifically for generating and understanding English and Bangla text. Built on a decoder-style transformer architecture, it has been trained on a dataset comprising __(will disclose later)__ billion Bangla and English tokens. The model is part of Hishab's iterative train-and-release series of bilingual LLMs.
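
To make the intended usage concrete, here is a minimal text-generation sketch using the Hugging Face `transformers` API. The repository id `hishab/titulm-1b-enbn-v1` is an assumption based on the model name, and `trust_remote_code=True` is included only in case the checkpoint ships a custom (llm-foundry/MPT-style) architecture; neither detail is confirmed by this card.

```python
# Minimal usage sketch -- the repo id below is an assumption, not confirmed by the card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "hishab/titulm-1b-enbn-v1"  # hypothetical repository id

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

# The model is bilingual, so Bangla or English prompts should both work.
prompt = "Dhaka is the capital of"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```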

The training process was managed using MosaicML's [llm-foundry](https://github.com/mosaicml/llm-foundry) framework. Throughout the training phase, TituLM-1B-ENBN-V1 went through a total of 59 training iterations, allowing for iterative refinement and optimization.

Notable training configs (a config sketch follows the list):

- n_heads: 16
- n_layers: 24
- attn_impl: flash
- Trained on 8 H100 GPUs on GCP
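
For readers who know llm-foundry, the sketch below shows how the settings listed above would typically sit in the `model` block of a training config, written as a Python dict because the actual YAML has not been published. The `mpt_causal_lm` model name and the exact key names are assumptions based on llm-foundry's MPT conventions.

```python
# Illustrative only: an llm-foundry-style model block carrying the settings
# listed above. The real training config has not been released, and the
# "mpt_causal_lm" model name is an assumption.
model_cfg = {
    "name": "mpt_causal_lm",                 # assumed llm-foundry model class
    "n_heads": 16,                           # attention heads, from this card
    "n_layers": 24,                          # transformer blocks, from this card
    "attn_config": {"attn_impl": "flash"},   # FlashAttention, as noted above
}
```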
## Datasets