t0m-R committed on
Commit a20e54a · 1 Parent(s): ac2dc19

Upload ViT-B/8 SEM scale classification model

Files changed (3)
  1. README.md +81 -0
  2. config.json +47 -0
  3. model.safetensors +3 -0
README.md ADDED
@@ -0,0 +1,81 @@
+ ---
+ license: apache-2.0
+ language: en
+ tags:
+ - image-classification
+ - vision-transformer
+ - pytorch
+ - sem
+ - materials-science
+ - nffa-di
+ base_model: timm/vit_base_patch8_224.augreg2_in21k_ft_in1k
+ pipeline_tag: image-classification
+ ---
+
+ # Vision Transformer for SEM Image Scale Classification
+
+ This is a fine-tuned **Vision Transformer (ViT-B/8)** model for classifying the magnification scale of Scanning Electron Microscopy (SEM) images (**pico, nano, or micro**) directly from pixel data.
+
+ The model addresses the challenge of unreliable scale information in large SEM archives, where metadata is often locked in proprietary file formats or must be recovered through error-prone Optical Character Recognition (OCR).
+
+ This model was developed as part of the **NFFA-DI (Nano Foundries and Fine Analysis Digital Infrastructure)** project, funded by the European Union's NextGenerationEU program.
+ ## Model Description
+
+ The model is based on the `timm/vit_base_patch8_224.augreg2_in21k_ft_in1k` checkpoint and has been fine-tuned for a 3-class image classification task on SEM images. The three scale categories are:
+
+ 1. **Pico**: images whose pixel size is in the atomic or sub-nanometer range (less than 1 nm).
+ 2. **Nano**: images whose pixel size is in the nanometer range (1 nm to 1,000 nm, i.e. up to 1 µm).
+ 3. **Micro**: images whose pixel size is in the micrometer range (greater than 1 µm).
+
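These class boundaries amount to a simple threshold rule on pixel size. The helper below is a hypothetical illustration of the category definitions only (the model itself predicts the class from pixels alone, without access to the physical pixel size):

```python
def scale_class(pixel_size_nm: float) -> str:
    """Map a physical pixel size in nanometers to the scale class
    used by this model: pico (< 1 nm), nano (1 nm to 1,000 nm),
    micro (> 1,000 nm, i.e. > 1 µm)."""
    if pixel_size_nm < 1.0:
        return "pico"
    if pixel_size_nm <= 1000.0:
        return "nano"
    return "micro"
```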
+ ## Model Performance
+
+ The model achieves **91.7% accuracy** on a held-out test set. Notably, most misclassifications occur at the transitional nano–micro boundary, which suggests the model is learning physically meaningful feature representations tied to the magnification level.
+
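Because most errors sit on the nano–micro boundary, a per-class confusion count is the most informative view of test results. A minimal sketch, assuming `y_true` and `y_pred` are lists of string labels collected during evaluation (the names are illustrative, not from the project's code):

```python
from collections import Counter

def confusion_counts(y_true, y_pred, labels=("pico", "nano", "micro")):
    """Return a labels x labels matrix of (true, predicted) counts;
    off-diagonal nano/micro cells expose boundary confusions."""
    pair_counts = Counter(zip(y_true, y_pred))
    return [[pair_counts[(t, p)] for p in labels] for t in labels]
```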
+ ## How to Use
+
+ The following Python code shows how to load the model and its processor from the Hub and use it to classify a local SEM image.
+
+ ```python
+ from transformers import AutoImageProcessor, AutoModelForImageClassification
+ from PIL import Image
+ import torch
+
+ # Load the model and image processor from the Hub
+ model_name = "t0m-R/vit-sem-scale-classifier"
+ image_processor = AutoImageProcessor.from_pretrained(model_name)
+ model = AutoModelForImageClassification.from_pretrained(model_name)
+
+ # Load and preprocess the image
+ image_path = "path/to/your/sem_image.png"
+ try:
+     image = Image.open(image_path).convert("RGB")
+
+     # Prepare the image for the model
+     inputs = image_processor(images=image, return_tensors="pt")
+
+     # Run inference
+     with torch.no_grad():
+         logits = model(**inputs).logits
+     predicted_label_id = logits.argmax(-1).item()
+     predicted_label = model.config.id2label[predicted_label_id]
+
+     print(f"Predicted Scale: {predicted_label}")
+
+ except FileNotFoundError:
+     print(f"Error: The file at {image_path} was not found.")
+ ```
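To inspect how confident the model is rather than only taking the arg-max label, the logits can be turned into per-class probabilities with a softmax. A small sketch that would reuse the `logits` and `model` objects from the snippet above (the helper name is illustrative):

```python
import torch

def label_probabilities(logits: torch.Tensor, id2label: dict) -> dict:
    """Convert logits of shape [1, num_classes] into a
    {label: probability} mapping via softmax."""
    probs = torch.softmax(logits, dim=-1).squeeze(0)
    return {id2label[i]: float(p) for i, p in enumerate(probs)}

# Usage: label_probabilities(logits, model.config.id2label)
```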
+ ## Training Data
+
+ This model was fine-tuned on a custom dataset of 17,700 Scanning Electron Microscopy (SEM) images, curated specifically for this project.
+ The images were selected to create a balanced dataset for the scale-classification task: an equal one-third split across the pico, nano, and micro scales (5,900 images per class).
+
+ The 17,700 images were then divided into:
+
+ - Training set: 12,000 images
+ - Validation set: 3,000 images
+ - Test set: 2,700 images
+
+ **Note on Availability**: This dataset is not publicly available at the moment, but publication is planned for a later stage. Please check this model card for future updates on data access.
config.json ADDED
@@ -0,0 +1,47 @@
+ {
+   "architecture": "vit_base_patch8_224",
+   "architectures": [
+     "TimmWrapperForImageClassification"
+   ],
+   "do_pooling": true,
+   "dtype": "float32",
+   "global_pool": "token",
+   "initializer_range": 0.02,
+   "label_names": [
+     "pico",
+     "nano",
+     "micro"
+   ],
+   "model_args": null,
+   "model_type": "timm_wrapper",
+   "num_classes": 3,
+   "num_features": 768,
+   "pretrained_cfg": {
+     "classifier": "head",
+     "crop_mode": "center",
+     "crop_pct": 0.9,
+     "custom_load": false,
+     "first_conv": "patch_embed.proj",
+     "fixed_input_size": true,
+     "input_size": [
+       3,
+       224,
+       224
+     ],
+     "interpolation": "bicubic",
+     "mean": [
+       0.5,
+       0.5,
+       0.5
+     ],
+     "pool_size": null,
+     "std": [
+       0.5,
+       0.5,
+       0.5
+     ],
+     "tag": "augreg2_in21k_ft_in1k"
+   },
+   "problem_type": "single_label_classification",
+   "transformers_version": "4.56.0"
+ }
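The order of `label_names` in this config fixes the index-to-class mapping used at inference time (index 0 → pico, 1 → nano, 2 → micro). A small sketch of recovering that mapping from the config contents (the parsing helper is illustrative, not part of the transformers API):

```python
import json

def label_map_from_config(config_text: str) -> dict:
    """Parse config.json text and return the index -> label mapping
    implied by the order of label_names."""
    cfg = json.loads(config_text)
    return dict(enumerate(cfg["label_names"]))
```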
model.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:cae76191450cf0c7b6c4f177e44433046a2dc4a69fd1eecefb28d70a3dd77826
+ size 343254828