---
license: apache-2.0
language:
- en
base_model:
- mistralai/Mistral-7B-Instruct-v0.3
pipeline_tag: question-answering
---

# Mistral-7B-Instruct-v0.3-EOSC

Federated fine-tuned version of Mistral-7B-Instruct-v0.3, trained on data from the EOSC registry.

Federated training configuration:

- model.name = "mistralai/Mistral-7B-Instruct-v0.3"
- model.quantization = 4
- model.gradient-checkpointing = true
- model.lora.peft-lora-r = 32
- model.lora.peft-lora-alpha = 64
- train.save-every-round = 5
- train.learning-rate-max = 5e-5
- train.learning-rate-min = 1e-6
- train.seq-length = 512
- train.training-arguments.per-device-train-batch-size = 16
- train.training-arguments.gradient-accumulation-steps = 1
- train.training-arguments.logging-steps = 10
- train.training-arguments.num-train-epochs = 2
- train.training-arguments.max-steps = 10
- train.training-arguments.save-steps = 1000
- train.training-arguments.save-total-limit = 10
- train.training-arguments.gradient-checkpointing = true
- train.training-arguments.lr-scheduler-type = "constant"
- strategy.fraction-fit = 0.1
- strategy.fraction-evaluate = 0.0
- num-server-rounds = 10

The PEFT adapter presented in this model corresponds to 5 rounds of the federated learning (FL) training.

The following `bitsandbytes` quantization config was used during training:

- quant_method: QuantizationMethod.BITS_AND_BYTES
- _load_in_8bit: False
- _load_in_4bit: True
- llm_int8_threshold: 6.0
- llm_int8_skip_modules: None
- llm_int8_enable_fp32_cpu_offload: False
- llm_int8_has_fp16_weight: False
- bnb_4bit_quant_type: fp4
- bnb_4bit_use_double_quant: False
- bnb_4bit_compute_dtype: float32
- bnb_4bit_quant_storage: uint8
- load_in_4bit: True
- load_in_8bit: False

A code sketch of these quantization and LoRA settings is given at the end of this card.

### Framework versions

- PEFT 0.6.2

### Try the model!

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

base_model = "mistralai/Mistral-7B-Instruct-v0.3"
adapter_model = "ifca-advanced-computing/Mistral-7B-Instruct-v0.3-EOSC"

# Load the base model and attach the federated LoRA adapter
model = AutoModelForCausalLM.from_pretrained(base_model)
model = PeftModel.from_pretrained(model, adapter_model)
tokenizer = AutoTokenizer.from_pretrained(base_model)
model.eval()

query = [
    {"role": "user", "content": "What is the EOSC?"},
]

# Build the chat-formatted prompt and move it to the model's device
input_ids = tokenizer.apply_chat_template(
    query,
    tokenize=True,
    return_tensors="pt"
).to(model.device)

with torch.no_grad():
    outputs = model.generate(
        input_ids=input_ids,
        max_new_tokens=500,
        do_sample=True,
        temperature=0.7,
        top_p=0.9
    )

question = query[0]["content"]
print(f"QUESTION: {question} \n")
print("ANSWER:\n")
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
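
For reference, the quantization and LoRA hyperparameters listed above roughly translate into the following `transformers`/`peft` objects. This is a minimal sketch, not the actual federated training script: the LoRA target modules, device placement, and the training/aggregation loop are not specified in this card and are left to defaults or assumptions here.

```python
# Sketch of the reported 4-bit quantization and LoRA setup (assumptions noted inline).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

# 4-bit bitsandbytes config matching the values reported in this card
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="fp4",
    bnb_4bit_use_double_quant=False,
    bnb_4bit_compute_dtype=torch.float32,
)

# LoRA hyperparameters from the federated training configuration
# (target modules are not listed in the card, so peft's defaults for Mistral are used)
lora_config = LoraConfig(
    r=32,
    lora_alpha=64,
    task_type="CAUSAL_LM",
)

# Load the quantized base model and wrap it with the LoRA adapter
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.3",
    quantization_config=bnb_config,
    device_map="auto",  # assumption: device placement is not specified in the card
)
model.gradient_checkpointing_enable()  # matches model.gradient-checkpointing = true
model = get_peft_model(model, lora_config)
```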