---
library_name: transformers
license: mit
datasets:
- hblim/customer-complaints
language:
- en
metrics:
- accuracy
base_model:
- google-bert/bert-base-uncased
tags:
- bert
- transformers
- customer-complaints
- text-classification
- multiclass
- huggingface
- fine-tuned
- wandb
---
# BERT Base (Uncased) Fine-Tuned for Customer Complaint Classification (3 Classes)
## 🧾 Model Description
This model is a fine-tuned version of [`bert-base-uncased`](https://huggingface.co/bert-base-uncased) using Hugging Face Transformers on a custom dataset of customer complaints. The task is **multi-class text classification**, where each complaint is categorized into one of **three classes**.
The model is intended to support downstream tasks like complaint triage, issue type prediction, or support ticket classification.
Training and evaluation were tracked with [Weights & Biases](https://wandb.ai/), and all hyperparameters are logged below for reproducibility.
---
## 🧠 Intended Use
- 🏷 Classify customer complaint text into 3 predefined categories
- 📊 Analyze complaint trends over time
- 💬 Serve as a backend model for customer service applications
---
## 📚 Dataset
- Dataset Name: [hblim/customer-complaints](https://huggingface.co/datasets/hblim/customer-complaints)
- Dataset Type: Multiclass text classification
- Classes: `billing`, `product`, `delivery`
- Preprocessing: Standard BERT tokenization
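As a minimal sketch of the preprocessing step (assuming the dataset exposes `text` and `label` columns, which this card does not specify):

```python
from datasets import load_dataset
from transformers import AutoTokenizer

# Load the complaint dataset and the uncased BERT tokenizer
dataset = load_dataset("hblim/customer-complaints")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Standard BERT tokenization; the uncased tokenizer lowercases internally.
    # "text" is an assumed column name -- adjust to the dataset's actual schema.
    return tokenizer(batch["text"], truncation=True, padding="max_length")

tokenized = dataset.map(tokenize, batched=True)
```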
---
## ⚙️ Training Details
- Base Model: `bert-base-uncased`
- Epochs: **10**
- Batch Size: **1**
- Learning Rate: **1e-5**
- Weight Decay: **0.05**
- Warmup Ratio: **0.20**
- LR Scheduler: `linear`
- Optimizer: `AdamW`
- Evaluation Strategy: every **100 steps**
- Logging: every **100 steps**
- Trainer: Hugging Face `Trainer`
- Hardware: Single NVIDIA GeForce RTX 3080 GPU
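A `TrainingArguments` configuration matching the settings above might look like the following. This is a sketch, not the exact training script; the output directory is a placeholder, and `adamw_torch` is assumed as the concrete `AdamW` variant.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="baseline-hf-hub",   # placeholder output directory
    num_train_epochs=10,
    per_device_train_batch_size=1,
    learning_rate=1e-5,
    weight_decay=0.05,
    warmup_ratio=0.20,
    lr_scheduler_type="linear",
    optim="adamw_torch",            # assumed AdamW implementation
    eval_strategy="steps",          # `evaluation_strategy` in older transformers releases
    eval_steps=100,
    logging_steps=100,
    report_to="wandb",              # stream metrics to Weights & Biases
)
```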
---
## 📈 Metrics
Evaluation tracked a single metric:
- **Accuracy**
To reproduce the metrics and training logs, see the corresponding W&B run:
[Weights & Biases Run - `baseline-hf-hub`](https://wandb.ai/notslahify/customer%20complaints%20fine%20tuning/runs/c75ddclr)
| Step | Training Loss | Validation Loss | Accuracy |
|------|---------------|-----------------|------------|
| 100 | 1.106100 | 1.040519 | 0.523810 |
| 200 | 0.944800 | 0.744273 | 0.738095 |
| 300 | 0.660000 | 0.385309 | 0.900000 |
| 400 | 0.412400 | 0.273423 | 0.904762 |
| 500 | 0.220800 | 0.185636 | 0.923810 |
| 600 | 0.163400 | 0.245850 | 0.919048 |
| 700 | 0.116100 | 0.180523 | 0.942857 |
| 800 | 0.097200 | 0.254475 | 0.928571 |
| 900 | 0.052200 | 0.233583 | 0.942857 |
| 1000 | 0.050700 | 0.223150 | 0.928571 |
| 1100 | 0.035100 | 0.271416 | 0.919048 |
| 1200 | 0.027700 | 0.226478 | 0.933333 |
| 1300 | 0.009000 | 0.218807 | 0.938095 |
| 1400 | 0.013600 | 0.246330 | 0.928571 |
| 1500 | 0.014500 | 0.226987 | 0.933333 |
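Accuracy like the above can be computed with a standard `compute_metrics` callback passed to the `Trainer`. The exact function used for this run is not shown in the card; a minimal sketch:

```python
import numpy as np
import evaluate

accuracy = evaluate.load("accuracy")

def compute_metrics(eval_pred):
    logits, labels = eval_pred
    predictions = np.argmax(logits, axis=-1)  # highest-scoring class per example
    return accuracy.compute(predictions=predictions, references=labels)
```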
---
## 🚀 How to Use
```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

# Replace with the actual Hub repository ID for this model
model = AutoModelForSequenceClassification.from_pretrained("your-username/baseline-hf-hub")
tokenizer = AutoTokenizer.from_pretrained("your-username/baseline-hf-hub")

inputs = tokenizer("I want to report an issue with my account", return_tensors="pt")
with torch.no_grad():  # inference only; no gradients needed
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=-1).item()  # index into the 3 classes
```
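Alternatively, the `pipeline` API handles tokenization and label mapping in one call (assuming the model's `id2label` mapping was saved with the checkpoint; the example output below is illustrative, not an actual prediction):

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="your-username/baseline-hf-hub")
print(classifier("My package never arrived"))
# e.g. [{'label': 'delivery', 'score': 0.98}] -- labels depend on the saved id2label mapping
```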