Upload README.md with huggingface_hub
README.md
CHANGED
@@ -59,53 +59,15 @@ This is a fine-tuned version of the Llama-3.1 model specifically optimized for D
|
|
59 |
- **Framework**: MLX-LM
|
60 |
- **Hardware**: Apple M1 Max (64GB RAM)
|
61 |
|
62 |
-
##
|
63 |
|
64 |
-
###
|
65 |
-
|
66 |
-
1. **Election Security**: Detect and classify disinformation campaigns targeting electoral processes
|
67 |
-
2. **Content Moderation**: Identify harmful content that undermines electoral integrity
|
68 |
-
3. **Research**: Academic research on disinformation patterns and meta-narratives
|
69 |
-
4. **Policy Analysis**: Support policy development for election security measures
|
70 |
-
|
71 |
-
### Target Applications
|
72 |
-
|
73 |
-
- Social media monitoring platforms
|
74 |
-
- Election security organizations
|
75 |
-
- Fact-checking organizations
|
76 |
-
- Academic research institutions
|
77 |
-
- Government agencies
|
78 |
-
- Civil society organizations
|
79 |
-
|
80 |
-
## Training Data
|
81 |
-
|
82 |
-
The model was trained on the [DISARM Election Watch Dataset](https://huggingface.co/datasets/ArapCheruiyot/disarm-election-watch-dataset), which contains:
|
83 |
-
|
84 |
-
### Data Sources
|
85 |
-
- **Telegram**: 3,632 examples (60.3%)
|
86 |
-
- **X/Twitter**: 2,038 examples (33.9%)
|
87 |
-
- **TikTok**: 248 examples (4.1%)
|
88 |
-
- **DISARM**: 101 examples (1.7%)
|
89 |
-
|
90 |
-
### Task Types
|
91 |
-
- **DISARM Classification**: 101 examples
|
92 |
-
- **Content Analysis**: 5,770 examples
|
93 |
-
- **Narrative Analysis**: 148 examples
|
94 |
-
|
95 |
-
### Data Split
|
96 |
-
- **Training**: 4,815 examples (80%)
|
97 |
-
- **Validation**: 601 examples (10%)
|
98 |
-
- **Test**: 603 examples (10%)
|
99 |
-
|
100 |
-
## Usage
|
101 |
-
|
102 |
-
### With MLX-LM (Fused Model)
|
103 |
|
104 |
```python
|
105 |
from mlx_lm import load, generate
|
106 |
|
107 |
# Load the complete fine-tuned model
|
108 |
-
model, tokenizer = load("
|
109 |
|
110 |
# Example prompt
|
111 |
prompt = """### Instruction:
|
@@ -121,34 +83,36 @@ response = generate(model, tokenizer, prompt, max_tokens=256, temp=0.1)
|
|
121 |
print(response)
|
122 |
```
|
123 |
|
124 |
-
###
|
125 |
|
126 |
-
```
|
127 |
-
|
128 |
|
129 |
-
#
|
130 |
-
|
131 |
-
|
132 |
|
133 |
-
|
134 |
-
|
135 |
-
|
136 |
```
|
137 |
|
138 |
-
###
|
139 |
|
140 |
```json
|
141 |
{
|
142 |
"meta_narrative": "Compromised Election Technology",
|
143 |
-
"primary_disarm_technique": "T0022.
|
144 |
-
"confidence_score": 0.
|
145 |
-
"key_indicators": [
|
146 |
-
"BVAS",
|
147 |
-
"INEC",
|
148 |
-
"pre-loaded",
|
149 |
-
"rigged",
|
150 |
-
"incumbent"
|
151 |
-
],
|
152 |
"platform": "WhatsApp",
|
153 |
"language": "en",
|
154 |
"category": "Undermining Electoral Institutions"
|
@@ -173,25 +137,6 @@ print(response)
|
|
173 |
- **Metal GPU**: Accelerated inference
|
174 |
- **Memory Management**: 16GB wired memory optimization
|
175 |
|
176 |
-
## Limitations and Biases
|
177 |
-
|
178 |
-
### Known Limitations
|
179 |
-
1. **Language**: Trained primarily on English content
|
180 |
-
2. **Geographic Focus**: Primarily Nigerian election context
|
181 |
-
3. **Platform Bias**: Limited to specific social media platforms
|
182 |
-
4. **Temporal Context**: Training data from specific election periods
|
183 |
-
|
184 |
-
### Potential Biases
|
185 |
-
1. **Cultural Context**: May not generalize to other cultural contexts
|
186 |
-
2. **Platform-Specific**: May not capture platform-specific nuances
|
187 |
-
3. **Evolving Tactics**: May not capture new disinformation techniques
|
188 |
-
|
189 |
-
### Ethical Considerations
|
190 |
-
1. **Privacy**: Ensure compliance with data protection regulations
|
191 |
-
2. **Transparency**: Use responsibly with clear disclosure of AI involvement
|
192 |
-
3. **Bias Mitigation**: Regular evaluation for unintended biases
|
193 |
-
4. **Human Oversight**: Always maintain human oversight in critical applications
|
194 |
-
|
195 |
## Model Files
|
196 |
|
197 |
### Fused Model (Complete)
|
@@ -204,43 +149,15 @@ print(response)
|
|
204 |
- **Format**: safetensors
|
205 |
- **Files**: Final adapters + training checkpoints
|
206 |
|
207 |
-
|
208 |
-
- **Frequency**: Every 100 iterations
|
209 |
-
- **Purpose**: Model evaluation and recovery
|
210 |
-
- **Format**: safetensors
|
211 |
-
|
212 |
-
## Citation
|
213 |
|
214 |
-
|
215 |
-
|
216 |
-
|
217 |
-
|
218 |
-
title={DISARM Election Watch: Fine-tuned Llama-3.1 for Election Disinformation Detection},
|
219 |
-
author={ArapCheruiyot},
|
220 |
-
year={2024},
|
221 |
-
url={https://huggingface.co/ArapCheruiyot/disarm-ew-llama3-finetuned}
|
222 |
-
}
|
223 |
-
```
|
224 |
-
|
225 |
-
## License
|
226 |
-
|
227 |
-
This model is licensed under the MIT License. See the [LICENSE](LICENSE) file for details.
|
228 |
-
|
229 |
-
## Acknowledgments
|
230 |
-
|
231 |
-
- **DISARM Framework**: For the classification methodology
|
232 |
-
- **MLX-LM**: For the fine-tuning framework
|
233 |
-
- **Apple**: For Apple Silicon optimization
|
234 |
-
- **Hugging Face**: For model hosting and distribution
|
235 |
|
236 |
## Contact
|
237 |
|
238 |
For questions, issues, or collaboration opportunities:
|
239 |
- **Model Repository**: [ArapCheruiyot/disarm-ew-llama3-finetuned](https://huggingface.co/ArapCheruiyot/disarm-ew-llama3-finetuned)
|
240 |
- **Dataset Repository**: [ArapCheruiyot/disarm-election-watch-dataset](https://huggingface.co/datasets/ArapCheruiyot/disarm-election-watch-dataset)
|
241 |
-
|
242 |
-
## Version History
|
243 |
-
|
244 |
-
- **v1.0.0**: Initial release with 600 training iterations
|
245 |
-
- **Training Data**: 6,019 examples from multiple platforms
|
246 |
-
- **Framework**: MLX-LM with Apple Silicon optimization
|
|
|
59 |
- **Framework**: MLX-LM
|
60 |
- **Hardware**: Apple M1 Max (64GB RAM)
|
61 |
|
62 |
+
## Quick Start
|
63 |
|
64 |
+
### Using with MLX-LM
|
65 |
|
66 |
```python
|
67 |
from mlx_lm import load, generate
|
68 |
|
69 |
# Load the complete fine-tuned model
|
70 |
+
model, tokenizer = load("models/disarm_ew_llama3_finetuned")
|
71 |
|
72 |
# Example prompt
|
73 |
prompt = """### Instruction:
|
|
|
83 |
print(response)
|
84 |
```
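The response printed by the snippet above is plain text. Assuming the model replies with a JSON object shaped like the one under Expected Output in this README, a minimal sketch for turning that text into a dictionary might look like this. The helper name `parse_classification` and the choice of required keys are illustrative assumptions, not part of the model repository.

```python
import json

# Field names copied from the README's expected-output example.
# Treating these three as required is an assumption made for this sketch.
REQUIRED_KEYS = ("meta_narrative", "primary_disarm_technique", "confidence_score")


def parse_classification(response: str) -> dict:
    """Decode the first JSON object found in the model's reply (hypothetical helper)."""
    start = response.find("{")
    if start == -1:
        raise ValueError("no JSON object found in response")
    # raw_decode stops at the end of the first complete JSON value,
    # so any text the model emits after the object is ignored.
    result, _ = json.JSONDecoder().raw_decode(response[start:])
    missing = [key for key in REQUIRED_KEYS if key not in result]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return result


# Illustrative reply, using values from the README's expected output.
sample_reply = (
    'Here is the classification: {"meta_narrative": "Compromised Election Technology", '
    '"primary_disarm_technique": "T0022.001: Develop False Conspiracy Theory Narratives '
    'about Electoral Manipulation and Compromise", "confidence_score": 0.98}'
)
parsed = parse_classification(sample_reply)
print(parsed["meta_narrative"], parsed["confidence_score"])
```

Raising on a missing field, rather than returning a partial dictionary, keeps malformed generations from silently flowing into downstream analysis.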
|
85 |
|
86 |
+
### Using with Ollama
|
87 |
|
88 |
+
```bash
|
89 |
+
# Create Ollama model
|
90 |
+
ollama create disarm-ew-llama3-finetuned -f Modelfile
|
91 |
|
92 |
+
# Run the model
|
93 |
+
ollama run disarm-ew-llama3-finetuned "Your prompt here"
|
94 |
+
```
|
95 |
|
96 |
+
### Example Usage
|
97 |
+
|
98 |
+
```bash
|
99 |
+
ollama run disarm-ew-llama3-finetuned "### Instruction:
|
100 |
+
Classify the following content according to DISARM Framework techniques and meta-narratives:
|
101 |
+
|
102 |
+
### Input:
|
103 |
+
A viral WhatsApp broadcast claims that the BVAS machines have been pre-loaded with votes by INEC in favour of the incumbent party.
|
104 |
+
|
105 |
+
### Response:"
|
106 |
```
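The prompt in the example above follows a fixed Instruction / Input / Response layout. A small sketch of a helper that fills that layout for arbitrary content is shown here. The function name `build_prompt` is hypothetical, and the template text is copied from the example rather than taken from the training code.

```python
# Instruction text copied from the README's Ollama example.
INSTRUCTION = (
    "Classify the following content according to DISARM Framework "
    "techniques and meta-narratives:"
)


def build_prompt(content: str) -> str:
    """Wrap raw content in the Instruction/Input/Response layout (hypothetical helper)."""
    return (
        "### Instruction:\n"
        f"{INSTRUCTION}\n"
        "\n"
        "### Input:\n"
        f"{content.strip()}\n"
        "\n"
        "### Response:"
    )


prompt = build_prompt(
    "A viral WhatsApp broadcast claims that the BVAS machines have been "
    "pre-loaded with votes by INEC in favour of the incumbent party."
)
print(prompt)
```

The resulting string can be passed as the prompt argument to `generate` in the MLX-LM snippet or as the quoted argument to `ollama run`.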
|
107 |
|
108 |
+
### Expected Output
|
109 |
|
110 |
```json
|
111 |
{
|
112 |
"meta_narrative": "Compromised Election Technology",
|
113 |
+
"primary_disarm_technique": "T0022.001: Develop False Conspiracy Theory Narratives about Electoral Manipulation and Compromise",
|
114 |
+
"confidence_score": 0.98,
|
115 |
+
"key_indicators": ["BVAS", "pre-loaded", "INEC"],
|
116 |
"platform": "WhatsApp",
|
117 |
"language": "en",
|
118 |
"category": "Undermining Electoral Institutions"
|
|
|
137 |
- **Metal GPU**: Accelerated inference
|
138 |
- **Memory Management**: 16GB wired memory optimization
|
139 |
|
140 |
## Model Files
|
141 |
|
142 |
### Fused Model (Complete)
|
|
|
149 |
- **Format**: safetensors
|
150 |
- **Files**: Final adapters + training checkpoints
|
151 |
|
152 |
+
## Local Deployment Benefits
|
153 |
|
154 |
+
- **Privacy**: Run locally without sending data to external servers
|
155 |
+
- **Speed**: Fast inference on local hardware
|
156 |
+
- **Customization**: Modify prompts and parameters as needed
|
157 |
+
- **Offline**: Works without internet connection
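Because inference stays on the local machine, the model can be called from a script without any network service. The following is a minimal sketch, assuming the Ollama model has already been created under the name `disarm-ew-llama3-finetuned` as in the Ollama section of this README. The helper names and the timeout value are illustrative assumptions.

```python
import shutil
import subprocess

# Model name as used in the README's `ollama create` step.
MODEL_NAME = "disarm-ew-llama3-finetuned"


def ollama_command(prompt: str, model: str = MODEL_NAME) -> list[str]:
    """Build the argument list for `ollama run MODEL PROMPT` (hypothetical helper)."""
    return ["ollama", "run", model, prompt]


def classify_locally(prompt: str, model: str = MODEL_NAME, timeout: float = 120.0) -> str:
    """Run the prompt through the local Ollama CLI and return its stdout."""
    if shutil.which("ollama") is None:
        raise RuntimeError("the ollama CLI was not found on PATH")
    completed = subprocess.run(
        ollama_command(prompt, model),
        capture_output=True,  # collect stdout/stderr instead of printing them
        text=True,            # decode bytes to str
        check=True,           # raise CalledProcessError on a non-zero exit code
        timeout=timeout,      # assumed upper bound; adjust for your hardware
    )
    return completed.stdout.strip()
```

Passing the prompt as a single list element avoids shell quoting problems with the multi-line Instruction / Input / Response text.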
|
158 |
|
159 |
## Contact
|
160 |
|
161 |
For questions, issues, or collaboration opportunities:
|
162 |
- **Model Repository**: [ArapCheruiyot/disarm-ew-llama3-finetuned](https://huggingface.co/ArapCheruiyot/disarm-ew-llama3-finetuned)
|
163 |
- **Dataset Repository**: [ArapCheruiyot/disarm-election-watch-dataset](https://huggingface.co/datasets/ArapCheruiyot/disarm-election-watch-dataset)
|