---
license: apache-2.0
language:
- en
base_model:
- microsoft/deberta-v3-large
- HuggingFaceTB/SmolLM2-135M-Instruct
pipeline_tag: token-classification
tags:
- NER
- encoder
- decoder
- GLiNER
- information-extraction
---

**GLiNER** is a Named Entity Recognition (NER) model capable of identifying *any* entity type in a **zero-shot** manner.

This architecture combines:

* An **encoder** for representing entity spans
* A **decoder** for generating label names

This hybrid approach enables new use cases, such as **entity linking**, and expands GLiNER’s capabilities. By integrating large modern decoders, trained on vast datasets, GLiNER can leverage their **richer knowledge capacity** while maintaining competitive inference speed.

---
## Key Features
* **Open ontology**: Works when the label set is unknown
* **Multi-label entity recognition**: Assign multiple labels to a single entity
* **Entity linking**: Handle large label sets via constrained generation
* **Knowledge expansion**: Gain from large decoder models
* **Efficient**: Minimal speed reduction on GPU compared to single-encoder GLiNER
---
## Installation
Update to the latest version of GLiNER:
```bash
pip install -U gliner
```
---
## Usage
```python
from gliner import GLiNER

model = GLiNER.from_pretrained("gliner-decoder-large-v1.0")

text = (
    "Apple was founded as Apple Computer Company on April 1, 1976, "
    "by Steve Wozniak, Steve Jobs (1955–2011) and Ronald Wayne to "
    "develop and sell Wozniak's Apple I personal computer."
)
labels = ["person", "other"]

entities = model.run(text, labels, threshold=0.3, num_gen_sequences=1)
print(entities)
```
---
### Example Output
```json
[
  [
    {
      "start": 21,
      "end": 26,
      "text": "Apple",
      "label": "other",
      "score": 0.6795641779899597,
      "generated labels": ["Organization"]
    },
    {
      "start": 47,
      "end": 60,
      "text": "April 1, 1976",
      "label": "other",
      "score": 0.44296327233314514,
      "generated labels": ["Date"]
    },
    {
      "start": 65,
      "end": 78,
      "text": "Steve Wozniak",
      "label": "person",
      "score": 0.9934439659118652,
      "generated labels": ["Person"]
    },
    {
      "start": 80,
      "end": 90,
      "text": "Steve Jobs",
      "label": "person",
      "score": 0.9725918769836426,
      "generated labels": ["Person"]
    },
    {
      "start": 107,
      "end": 119,
      "text": "Ronald Wayne",
      "label": "person",
      "score": 0.9964536428451538,
      "generated labels": ["Person"]
    }
  ]
]
```
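The returned entities are plain dictionaries, so post-processing is straightforward. Below is a minimal sketch, assuming the field names shown in the example output above (`"text"`, `"label"`, `"generated labels"`), that groups predictions by their generated label:

```python
from collections import defaultdict

# A small sample mirroring the structure of the example output above.
entities = [
    {"text": "Apple", "label": "other", "score": 0.68,
     "generated labels": ["Organization"]},
    {"text": "Steve Wozniak", "label": "person", "score": 0.99,
     "generated labels": ["Person"]},
    {"text": "Steve Jobs", "label": "person", "score": 0.97,
     "generated labels": ["Person"]},
]

# Group entity texts by each generated label they received.
by_generated = defaultdict(list)
for ent in entities:
    for gen_label in ent["generated labels"]:
        by_generated[gen_label].append(ent["text"])

print(dict(by_generated))
# → {'Organization': ['Apple'], 'Person': ['Steve Wozniak', 'Steve Jobs']}
```

With `num_gen_sequences` above 1, each entity may carry several generated labels; the same grouping loop handles that case unchanged.
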
---
### Restricting the Decoder
You can limit the decoder to generate labels only from a predefined set:
```python
model.run(
    text, labels,
    threshold=0.3,
    num_gen_sequences=1,
    gen_constraints=[
        "organization", "organization type", "city",
        "technology", "date", "person"
    ]
)
```
---
## Performance Tips
Two label-trie implementations are available.
For the **faster, more memory-efficient C++ version**, install **Cython**:
```bash
pip install cython
```
This can significantly improve performance and reduce memory usage, especially with millions of labels.
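To illustrate why a trie helps with constrained generation, here is a minimal pure-Python label-trie sketch (an assumption for clarity; the actual gliner implementation may differ). At each generation step, only the continuations that keep the partial output inside the allowed label set are permitted:

```python
def build_trie(labels):
    """Build a nested-dict trie over labels, using whitespace as a toy tokenizer."""
    root = {}
    for label in labels:
        node = root
        for token in label.split():
            node = node.setdefault(token, {})
        node["<end>"] = {}  # marks a complete label
    return root

def allowed_next(trie, prefix):
    """Return the tokens (or '<end>') that may follow the given token prefix."""
    node = trie
    for token in prefix:
        node = node.get(token, {})
    return sorted(node)

trie = build_trie(["organization", "organization type", "person"])
print(allowed_next(trie, []))                # → ['organization', 'person']
print(allowed_next(trie, ["organization"]))  # → ['<end>', 'type']
```

Because each step only examines the children of the current node, lookup cost depends on the prefix length rather than the total number of labels, which is what keeps constrained generation tractable with millions of labels.
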