---
library_name: transformers
license: mit
datasets:
- hblim/customer-complaints
language:
- en
metrics:
- accuracy
base_model:
- google-bert/bert-base-uncased
tags:
- bert
- transformers
- customer-complaints
- text-classification
- multiclass
- huggingface
- fine-tuned
- wandb
---

# BERT Base (Uncased) Fine-Tuned on Customer Complaint Classification (3 Classes)

## 🧾 Model Description

This model is a fine-tuned version of [`bert-base-uncased`](https://huggingface.co/bert-base-uncased), trained with Hugging Face Transformers on a custom dataset of customer complaints. The task is **multi-class text classification**: each complaint is assigned to one of **three classes**. The model is intended to support downstream tasks such as complaint triage, issue-type prediction, and support-ticket classification.

Training and evaluation were tracked with [Weights & Biases](https://wandb.ai/), and all hyperparameters are logged below for reproducibility.

---

## 🧠 Intended Use

- 🏷 Classify customer complaint text into 3 predefined categories
- 📊 Analyze complaint trends over time
- 💬 Serve as a backend model for customer service applications

---

## 📚 Dataset

- Dataset Name: [hblim/customer-complaints](https://huggingface.co/datasets/hblim/customer-complaints)
- Dataset Type: Multiclass text classification
- Classes: billing, product, delivery
- Preprocessing: Standard BERT tokenization

---

## ⚙️ Training Details

- Base Model: `bert-base-uncased`
- Epochs: **10**
- Batch Size: **1**
- Learning Rate: **1e-5**
- Weight Decay: **0.05**
- Warmup Ratio: **0.20**
- LR Scheduler: `linear`
- Optimizer: `AdamW`
- Evaluation Strategy: every **100 steps**
- Logging: every **100 steps**
- Trainer: Hugging Face `Trainer`
- Hardware: Single NVIDIA GeForce RTX 3080 GPU

A sketch of how these hyperparameters map onto a `Trainer` run is included at the end of this card.

---

## 📈 Metrics

Evaluation was tracked using:

- **Accuracy**

To reproduce the metrics and training logs, refer to the corresponding W&B run: [Weights & Biases Run - `baseline-hf-hub`](https://wandb.ai/notslahify/customer%20complaints%20fine%20tuning/runs/c75ddclr)

| Step | Training Loss | Validation Loss | Accuracy |
|------|---------------|-----------------|----------|
| 100  | 1.106100      | 1.040519        | 0.523810 |
| 200  | 0.944800      | 0.744273        | 0.738095 |
| 300  | 0.660000      | 0.385309        | 0.900000 |
| 400  | 0.412400      | 0.273423        | 0.904762 |
| 500  | 0.220800      | 0.185636        | 0.923810 |
| 600  | 0.163400      | 0.245850        | 0.919048 |
| 700  | 0.116100      | 0.180523        | 0.942857 |
| 800  | 0.097200      | 0.254475        | 0.928571 |
| 900  | 0.052200      | 0.233583        | 0.942857 |
| 1000 | 0.050700      | 0.223150        | 0.928571 |
| 1100 | 0.035100      | 0.271416        | 0.919048 |
| 1200 | 0.027700      | 0.226478        | 0.933333 |
| 1300 | 0.009000      | 0.218807        | 0.938095 |
| 1400 | 0.013600      | 0.246330        | 0.928571 |
| 1500 | 0.014500      | 0.226987        | 0.933333 |

---

## 🚀 How to Use

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("your-username/baseline-hf-hub")
tokenizer = AutoTokenizer.from_pretrained("your-username/baseline-hf-hub")

inputs = tokenizer("I want to report an issue with my account", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

predicted_class = outputs.logits.argmax(dim=-1).item()
print(model.config.id2label[predicted_class])  # maps the class index to its label name
```
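
For quick experimentation, the same checkpoint also works with the `pipeline` API, which bundles tokenization, inference, and label mapping into a single call. The repository id below is the same placeholder as above, and the example output is illustrative only:

```python
from transformers import pipeline

classifier = pipeline("text-classification", model="your-username/baseline-hf-hub")
print(classifier("My package never arrived and nobody responds to my emails"))
# e.g. [{'label': 'delivery', 'score': 0.98}] -- the label names come from the
# checkpoint's saved id2label mapping
```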
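
---

## 🧪 Reproducing the Training Setup

The hyperparameters listed under Training Details translate into a standard `Trainer` run along the following lines. This is a minimal sketch, not the original training script: the `text`/`label` column names, the split names, and the label-to-index order are assumptions, so adjust them to match the actual dataset schema.

```python
import numpy as np
from datasets import load_dataset
from transformers import (
    AutoModelForSequenceClassification,
    AutoTokenizer,
    Trainer,
    TrainingArguments,
)

dataset = load_dataset("hblim/customer-complaints")
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # "Standard BERT tokenization", as noted under Dataset
    return tokenizer(batch["text"], truncation=True)

tokenized = dataset.map(tokenize, batched=True)

id2label = {0: "billing", 1: "product", 2: "delivery"}  # assumed index order
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased",
    num_labels=3,
    id2label=id2label,
    label2id={v: k for k, v in id2label.items()},
)

def compute_metrics(eval_pred):
    # Accuracy, the metric tracked on the W&B run
    logits, labels = eval_pred
    return {"accuracy": (np.argmax(logits, axis=-1) == labels).mean()}

args = TrainingArguments(
    output_dir="baseline-hf-hub",
    num_train_epochs=10,
    per_device_train_batch_size=1,
    learning_rate=1e-5,
    weight_decay=0.05,
    warmup_ratio=0.20,
    lr_scheduler_type="linear",      # AdamW is the Trainer default optimizer
    eval_strategy="steps",           # `evaluation_strategy` on older transformers releases
    eval_steps=100,
    logging_steps=100,
    report_to="wandb",
)

trainer = Trainer(
    model=model,
    args=args,
    train_dataset=tokenized["train"],
    eval_dataset=tokenized["validation"],  # assumes a validation split exists
    compute_metrics=compute_metrics,
    tokenizer=tokenizer,  # enables dynamic padding via the default data collator
)
trainer.train()
```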