---
library_name: transformers
license: mit
datasets:
- hblim/customer-complaints
language:
- en
metrics:
- accuracy
base_model:
- google-bert/bert-base-uncased
tags:
- bert
- transformers
- customer-complaints
- text-classification
- multiclass
- huggingface
- fine-tuned
- wandb
---

# BERT Base (Uncased) Fine-Tuned on Customer Complaint Classification (3 Classes)

## 🧾 Model Description

This model is a fine-tuned version of [`bert-base-uncased`](https://huggingface.co/bert-base-uncased), trained with Hugging Face Transformers on a custom dataset of customer complaints. The task is **multi-class text classification**: each complaint is assigned to one of **three classes**.

The model is intended to support downstream tasks such as complaint triage, issue-type prediction, and support-ticket classification.

Training and evaluation were tracked with [Weights & Biases](https://wandb.ai/), and the hyperparameters logged below make the run reproducible.

---

## 🧠 Intended Use

- 🏷 Classify customer complaint text into 3 predefined categories
- 📊 Analyze complaint trends over time
- 💬 Serve as a backend model for customer service applications

---

## 📚 Dataset

- Dataset Name: [hblim/customer-complaints](https://huggingface.co/datasets/hblim/customer-complaints)
- Dataset Type: Multiclass text classification
- Classes: billing, product, delivery
- Preprocessing: Standard BERT tokenization
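
"Standard BERT tokenization" here means the usual WordPiece tokenizer shipped with `bert-base-uncased`. A minimal sketch — note that `max_length=128` is an illustrative assumption, not a logged preprocessing setting:

```python
from transformers import AutoTokenizer

# WordPiece tokenization as used by bert-base-uncased.
# max_length=128 is illustrative, not a logged hyperparameter.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoded = tokenizer(
    "My delivery arrived two weeks late",
    truncation=True,
    padding="max_length",
    max_length=128,
    return_tensors="pt",
)
print(encoded["input_ids"].shape)  # torch.Size([1, 128])
```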

---

## ⚙️ Training Details

- Base Model: `bert-base-uncased`
- Epochs: **10**
- Batch Size: **1**
- Learning Rate: **1e-5**
- Weight Decay: **0.05**
- Warmup Ratio: **0.20**
- LR Scheduler: `linear`
- Optimizer: `AdamW`
- Evaluation Strategy: every **100 steps**
- Logging: every **100 steps**
- Trainer: Hugging Face `Trainer`
- Hardware: Single NVIDIA GeForce RTX 3080 GPU

---

## 📈 Metrics

Evaluation was tracked using:

- **Accuracy**
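
Accuracy can be reported to the `Trainer` through a small `compute_metrics` callback; a minimal NumPy sketch:

```python
import numpy as np

def compute_metrics(eval_pred):
    """Accuracy over the three complaint classes (Trainer-compatible sketch)."""
    logits, labels = eval_pred
    preds = np.argmax(logits, axis=-1)  # predicted class per example
    return {"accuracy": float((preds == labels).mean())}
```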

To reproduce the metrics and training logs, see the corresponding W&B run:

[Weights & Biases Run - `baseline-hf-hub`](https://wandb.ai/notslahify/customer%20complaints%20fine%20tuning/runs/c75ddclr)

| Step | Training Loss | Validation Loss | Accuracy |
|------|---------------|-----------------|----------|
| 100  | 1.106100 | 1.040519 | 0.523810 |
| 200  | 0.944800 | 0.744273 | 0.738095 |
| 300  | 0.660000 | 0.385309 | 0.900000 |
| 400  | 0.412400 | 0.273423 | 0.904762 |
| 500  | 0.220800 | 0.185636 | 0.923810 |
| 600  | 0.163400 | 0.245850 | 0.919048 |
| 700  | 0.116100 | 0.180523 | 0.942857 |
| 800  | 0.097200 | 0.254475 | 0.928571 |
| 900  | 0.052200 | 0.233583 | 0.942857 |
| 1000 | 0.050700 | 0.223150 | 0.928571 |
| 1100 | 0.035100 | 0.271416 | 0.919048 |
| 1200 | 0.027700 | 0.226478 | 0.933333 |
| 1300 | 0.009000 | 0.218807 | 0.938095 |
| 1400 | 0.013600 | 0.246330 | 0.928571 |
| 1500 | 0.014500 | 0.226987 | 0.933333 |

---

## 🚀 How to Use

```python
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

model = AutoModelForSequenceClassification.from_pretrained("your-username/baseline-hf-hub")
tokenizer = AutoTokenizer.from_pretrained("your-username/baseline-hf-hub")

inputs = tokenizer("I want to report an issue with my account", return_tensors="pt")
with torch.no_grad():  # inference only, no gradients needed
    outputs = model(**inputs)
predicted_class = outputs.logits.argmax(dim=-1).item()
label = model.config.id2label[predicted_class]  # human-readable class name, if set during training
```