mtreviso committed on
Commit 876aae1 · verified · 1 Parent(s): 5c6b4e6

Upload README.md with huggingface_hub

Files changed (1)
  1. README.md +19 -11
README.md CHANGED
@@ -14,13 +14,15 @@ pipeline_tag: feature-extraction
 
  # NILC Portuguese Word Embeddings — Wang2Vec CBOW 600d
 
- NILC-Embeddings is a repository for storing and sharing **word embeddings** for the Portuguese language.
- The goal is to provide ready-to-use vector resources for **Natural Language Processing (NLP)** and **Machine Learning** tasks.
+ This repository contains the **Wang2Vec CBOW 600d** model in **safetensors** format.
+
+ ## About
+
+ NILC-Embeddings is a repository for storing and sharing **word embeddings** for the Portuguese language. The goal is to provide ready-to-use vector resources for **Natural Language Processing (NLP)** and **Machine Learning** tasks.
+
+ The embeddings were trained on a large Portuguese corpus (Brazilian + European), composed of 17 corpora (~1.39B tokens). Training was carried out with the following algorithms: **Word2Vec**, **FastText**, **Wang2Vec**, and **GloVe**.
 
- The embeddings were trained on a large Portuguese corpus (Brazilian + European), composed of 17 corpora (~1.39B tokens).
- Training was carried out with the following algorithms: **Word2Vec** [1], **FastText** [2], **Wang2Vec** [3], and **GloVe** [4].
 
- This repository contains the **Wang2Vec CBOW 600d** model in **safetensors** format.
 
  ---
 
@@ -95,12 +97,18 @@ Hartmann, N. et al. (2017), STIL 2017.
 
  ### BibTeX
  ```bibtex
- @inproceedings{hartmann2017nilc,
- title={Portuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks},
- author={Hartmann, Nathan and Fonseca, Erick and Shulby, Christopher and Treviso, Marcos and Rodrigues, Juliano and Aluísio, Sandra},
- booktitle={Proceedings of the Symposium in Information and Human Language Technology (STIL)},
- year={2017}
- }
+ @inproceedings{hartmann-etal-2017-portuguese,
+ title = {{P}ortuguese Word Embeddings: Evaluating on Word Analogies and Natural Language Tasks},
+ author = {Hartmann, Nathan and Fonseca, Erick and Shulby, Christopher and Treviso, Marcos and Silva, J{\'e}ssica and Alu{\'i}sio, Sandra},
+ year = 2017,
+ month = oct,
+ booktitle = {Proceedings of the 11th {B}razilian Symposium in Information and Human Language Technology},
+ publisher = {Sociedade Brasileira de Computa{\c{c}}{\~a}o},
+ address = {Uberl{\^a}ndia, Brazil},
+ pages = {122--131},
+ url = {https://aclanthology.org/W17-6615/},
+ editor = {Paetzold, Gustavo Henrique and Pinheiro, Vl{\'a}dia}
+ }
  ```
 
  ---