Upload README.md with huggingface_hub
README.md
CHANGED
@@ -14,13 +14,15 @@ pipeline_tag: feature-extraction

 # NILC Portuguese Word Embeddings — Wang2Vec CBOW 600d

-
-

-The embeddings were trained on a large Portuguese corpus (Brazilian + European), composed of 17 corpora (~1.39B tokens).
-Training was carried out with the following algorithms: **Word2Vec** [1], **FastText** [2], **Wang2Vec** [3], and **GloVe** [4].

-This repository contains the **Wang2Vec CBOW 600d** model in **safetensors** format.

 ---

@@ -95,12 +97,18 @@ Hartmann, N. et al. (2017), STIL 2017.

 ### BibTeX
 ```bibtex
-@inproceedings{
-title={
-author={Hartmann, Nathan
-
-
-}
 ```

 ---

 # NILC Portuguese Word Embeddings — Wang2Vec CBOW 600d

+This repository contains the **Wang2Vec CBOW 600d** model in **safetensors** format.
+
+## About
+
+NILC-Embeddings is a repository for storing and sharing **word embeddings** for the Portuguese language. The goal is to provide ready-to-use vector resources for **Natural Language Processing (NLP)** and **Machine Learning** tasks.
+
+The embeddings were trained on a large Portuguese corpus (Brazilian + European), composed of 17 corpora (~1.39B tokens). Training was carried out with the following algorithms: **Word2Vec**, **FastText**, **Wang2Vec**, and **GloVe**.



 ---


 ### BibTeX
 ```bibtex
+@inproceedings{hartmann-etal-2017-portuguese,
+title = {{P}ortuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks},
+author = {Hartmann, Nathan and Fonseca, Erick and Shulby, Christopher and Treviso, Marcos and Silva, J{\'e}ssica and Alu{\'i}sio, Sandra},
+year = 2017,
+month = oct,
+booktitle = {Proceedings of the 11th {B}razilian Symposium in Information and Human Language Technology},
+publisher = {Sociedade Brasileira de Computa{\c{c}}{\~a}o},
+address = {Uberl{\^a}ndia, Brazil},
+pages = {122--131},
+url = {https://aclanthology.org/W17-6615/},
+editor = {Paetzold, Gustavo Henrique and Pinheiro, Vl{\'a}dia}
+}
 ```

 ---