---
license: apache-2.0
language:
- en
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
pipeline_tag: text-generation
---
|
|
|
# Mistral-7B-Instruct-v0.3-EOSC |
|
|
|
A federated fine-tuned version of Mistral-7B-Instruct-v0.3, trained on data from the EOSC (European Open Science Cloud) registry.
|
|
|
Federated training configuration (a code sketch of these settings follows the list):

- model.name = "mistralai/Mistral-7B-Instruct-v0.3"
- model.quantization = 4
- model.gradient-checkpointing = true
- model.lora.peft-lora-r = 32
- model.lora.peft-lora-alpha = 64
- train.save-every-round = 5
- train.learning-rate-max = 5e-5
- train.learning-rate-min = 1e-6
- train.seq-length = 512
- train.training-arguments.per-device-train-batch-size = 16
- train.training-arguments.gradient-accumulation-steps = 1
- train.training-arguments.logging-steps = 10
- train.training-arguments.num-train-epochs = 2
- train.training-arguments.max-steps = 10
- train.training-arguments.save-steps = 1000
- train.training-arguments.save-total-limit = 10
- train.training-arguments.gradient-checkpointing = true
- train.training-arguments.lr-scheduler-type = "constant"
- strategy.fraction-fit = 0.1
- strategy.fraction-evaluate = 0.0
- num-server-rounds = 10
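The key names above appear to follow Flower's (`flwr`) federated LLM fine-tuning configuration, so the sketch below is a hedged reconstruction under that assumption; in particular, the LoRA `target_modules` and the `FedAvg` strategy class are assumptions not stated in this card:

```python
from flwr.server.strategy import FedAvg
from peft import LoraConfig

# LoRA adapter settings from model.lora.* above. target_modules is an
# assumption (the attention projections typically adapted in Mistral-style
# models); the actual training code may target different modules.
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    task_type="CAUSAL_LM",
)

# Server-side aggregation from strategy.* above: 10% of available clients
# are sampled for training each round and federated evaluation is disabled.
# The run lasts num-server-rounds = 10 rounds.
strategy = FedAvg(fraction_fit=0.1, fraction_evaluate=0.0)
```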
|
|
|
The PEFT adapter presented in this model corresponds to 5 rounds of the federated learning (FL) training.
|
|
|
The following `bitsandbytes` quantization config was used during training (an equivalent `BitsAndBytesConfig` sketch follows the list):

- quant_method: QuantizationMethod.BITS_AND_BYTES
- _load_in_8bit: False
- _load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
- bnb_4bit_quant_storage: uint8
- load_in_4bit: True
- load_in_8bit: False
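For reference, a minimal sketch of an equivalent `BitsAndBytesConfig` (only the user-settable fields are passed; the underscore-prefixed entries above are internal state recorded by `transformers`):

```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit fp4 quantization without double quantization, computing in float32,
# matching the dump above.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float32,
)
```

When reproducing the training setup, this would be passed as `quantization_config=bnb_config` to `AutoModelForCausalLM.from_pretrained`.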
|
|
|
### Framework versions

- PEFT 0.6.2
|
|
|
|
|
### Try the model! |
|
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "mistralai/Mistral-7B-Instruct-v0.3"
adapter_model = "ifca-advanced-computing/Mistral-7B-Instruct-v0.3-EOSC"

# Load the base model and attach the LoRA adapter on top of it.
# Optionally pass device_map="auto" and torch_dtype=torch.bfloat16
# (requires `accelerate`) to fit the model on a single GPU.
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, adapter_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)

model.eval()

query = [
    {"role": "user", "content": "What is the EOSC?"},
]

# Build the prompt with the Mistral chat template; add_generation_prompt=True
# appends the assistant marker so the model answers instead of continuing
# the user turn.
input_ids = tokenizer.apply_chat_template(
    query,
    tokenize=True,
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        input_ids=input_ids,
        max_new_tokens=500,
        do_sample=True,
        temperature=0.7,
        top_p=0.9,
    )

question = query[0]["content"]
print(f"QUESTION: {question}\n")

# Decode only the newly generated tokens so the prompt is not echoed back.
print("ANSWER:\n")
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```
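On constrained hardware, the base model can also be loaded in 4-bit by passing the `quantization_config` sketched above to `from_pretrained`, mirroring the quantization used during training.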
|
|