---
license: mit
language:
- en
library_name: fasttext
tags:
- schema
- word-embeddings
- embeddings
- fasttext
- unsupervised-learning
- tables
- web-table
- schema-data
---
					
						
# Pre-trained Web Table Embeddings

The models here represent schema terms and instance data terms in a semantic vector space, which makes them especially useful for representing schema and class information and for ML tasks on tabular text data.

The code for executing and evaluating the models is located in the [table-embeddings GitHub repository](https://github.com/guenthermi/table-embeddings).

					
						
## Quick Start

To encode text from tables, install the `table_embeddings` package by running the following commands:

```bash
pip install cython
pip install git+https://github.com/guenthermi/table-embeddings.git
```

After that, you can encode text with the following Python snippet:

```python
from table_embeddings import TableEmbeddingModel

# Load the 150-dimensional W-combo model.
model = TableEmbeddingModel.load_model('ddrg/web_table_embeddings_combo150')
# Look up the embedding of the header term 'headline'.
embedding = model.get_header_vector('headline')
```

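A typical next step is to compare two embeddings. The following is a minimal sketch, assuming that `get_header_vector` returns a one-dimensional sequence of floats (for example a NumPy array). The helper `cosine_similarity` is hypothetical and not part of the `table_embeddings` package, and the toy vectors stand in for real embeddings so that the example runs without downloading a model.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity of two equal-length vectors (hypothetical helper)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Convention chosen for this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_a * norm_b)


# Toy vectors standing in for real header embeddings (assumption).
title_vec = [1.0, 0.0, 1.0]
headline_vec = [1.0, 0.0, 1.0]
price_vec = [0.0, 1.0, 0.0]

same = cosine_similarity(title_vec, headline_vec)
different = cosine_similarity(title_vec, price_vec)
```

With a loaded model, the toy vectors would be replaced by calls such as `model.get_header_vector('headline')`.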
					
						
## Model Types

| Model Type | Description | Download Links |
| ---------- | ----------- | -------------- |
| W-tax | Model of relations between table header and table body | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_tax64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_tax150)) |
| W-row | Model of row-wise relations in tables | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_row64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_row150)) |
| W-combo | Model of row-wise relations and relations between table header and table body | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_combo64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_combo150)) |
| W-plain | Model of row-wise relations in tables without pre-processing | ([64dim](https://huggingface.co/ddrg/web_table_embeddings_plain64), [150dim](https://huggingface.co/ddrg/web_table_embeddings_plain150)) |

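The download links listed for the four model types follow one naming pattern: `ddrg/web_table_embeddings_<type><dim>`. As a small sketch, the hypothetical helper `model_repo_id` below builds such an identifier from a model type and a dimension. It is an illustration derived only from those links, not a function of the `table_embeddings` package.

```python
# Model types and dimensions as they appear in the download links.
MODEL_TYPES = ("tax", "row", "combo", "plain")
DIMENSIONS = (64, 150)


def model_repo_id(model_type: str, dim: int) -> str:
    """Build a repository id such as 'ddrg/web_table_embeddings_combo150'.

    Hypothetical helper: accepts 'combo' as well as 'W-combo'.
    """
    name = model_type.lower()
    if name.startswith("w-"):
        name = name[2:]
    if name not in MODEL_TYPES:
        raise ValueError(f"unknown model type: {model_type!r}")
    if dim not in DIMENSIONS:
        raise ValueError(f"unsupported dimension: {dim!r}")
    return f"ddrg/web_table_embeddings_{name}{dim}"


# Enumerate all eight identifiers covered by the links.
all_repo_ids = [model_repo_id(t, d) for t in MODEL_TYPES for d in DIMENSIONS]
```

The resulting string has the same form as the identifier passed to `TableEmbeddingModel.load_model` in the Quick Start snippet.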
					
						
## More Information

For examples of how to use the models, see the [GitHub repository](https://github.com/guenthermi/table-embeddings).

More information can be found in the paper [Pre-Trained Web Table Embeddings for Table Discovery](https://dl.acm.org/doi/10.1145/3464509.3464892):

```
@inproceedings{gunther2021pre,
  title={Pre-Trained Web Table Embeddings for Table Discovery},
  author={G{\"u}nther, Michael and Thiele, Maik and Gonsior, Julius and Lehner, Wolfgang},
  booktitle={Fourth Workshop in Exploiting AI Techniques for Data Management},
  pages={24--31},
  year={2021}
}
```