VinaR - Repository of the Vinča Nuclear Institute
Domain-specific Word2vec-based trained models - Chem300, Phys300, MatSci200, MatSci300, and Mixed300

Authors
Radaković, Jana
Batalović, Katarina
Model (Published version)
Abstract
This repository contains unsupervised word-embedding models specialized for materials science, chemistry, and physics. The embeddings were generated with Word2vec, a natural language processing technique comprising language model architectures for fast and efficient learning of distributed word representations. The models were trained with the continuous Skip-gram architecture and a negative sampling strategy, as implemented in the Gensim library. Embeddings with 200 and 300 dimensions are provided for materials science, and embeddings with 300 dimensions for the chemistry, physics, and mixed domains.
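As a rough illustration of the training setup named in the abstract, the sketch below shows how a Skip-gram model with negative sampling might be configured in Gensim 4.x. Only the architecture (Skip-gram), the sampling strategy (negative sampling), and the vector sizes (200 and 300) come from the abstract. The window size, minimum word count, number of negative samples, epoch count, and the file path in the loader are assumptions chosen for illustration, not the authors' actual settings. The loader also assumes the files were written with Gensim's native `save()`, which the record does not state.

```python
from typing import Any, Dict, Iterable, List


def skipgram_ns_params(vector_size: int) -> Dict[str, Any]:
    """Keyword arguments for gensim.models.Word2Vec (Gensim 4.x names).

    sg=1 selects the continuous Skip-gram architecture; hs=0 together with
    negative > 0 selects negative sampling instead of hierarchical softmax.
    """
    return {
        "vector_size": vector_size,  # 200 or 300 in the published models
        "sg": 1,                     # Skip-gram (0 would be CBOW)
        "hs": 0,                     # no hierarchical softmax
        "negative": 5,               # ASSUMPTION: negatives per positive pair
        "window": 5,                 # ASSUMPTION: context window size
        "min_count": 5,              # ASSUMPTION: drop rarer words
        "epochs": 5,                 # ASSUMPTION: passes over the corpus
    }


def train_embeddings(sentences: Iterable[List[str]], vector_size: int = 300):
    """Train a model on pre-tokenized sentences (requires gensim)."""
    from gensim.models import Word2Vec  # imported lazily; optional dependency

    return Word2Vec(sentences=list(sentences), **skipgram_ns_params(vector_size))


def nearest_words(model_path: str, word: str, topn: int = 5):
    """Load a saved model and list the words closest to `word`.

    ASSUMPTION: `model_path` points to a file written by Word2Vec.save();
    the actual file names and formats on Figshare may differ.
    """
    from gensim.models import Word2Vec

    model = Word2Vec.load(model_path)
    return model.wv.most_similar(word, topn=topn)
```

Calling `train_embeddings(tokenized_corpus, vector_size=200)` on a list of token lists would produce a 200-dimensional model comparable in shape to MatSci200, although the resulting vectors depend entirely on the corpus and on the assumed hyperparameters above.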
Keywords:
word2vec model / In-silico materials design / Word embeddings as autonomous predictors / Static word embeddings / Word embeddings variability / Materials stability / Materials informatics / Digital design / Cheminformatics / Natural language processing
Source:
figshare, 2025
Note:
  • This digital object is hosted on the Figshare server due to its size and is available under the Creative Commons Attribution 4.0 International License.

DOI: 10.6084/m9.figshare.28740122.v1

URI
https://vinar.vin.bg.ac.rs/handle/123456789/15178
Collections
  • Research Data
Institution/Community
Vinča
Radaković, J., & Batalović, K. (2025). Domain-specific Word2vec-based trained models - Chem300, Phys300, MatSci200, MatSci300, and Mixed300. figshare.
https://doi.org/10.6084/m9.figshare.28740122.v1
Radaković J, Batalović K. Domain-specific Word2vec-based trained models - Chem300, Phys300, MatSci200, MatSci300, and Mixed300. figshare. 2025.
doi:10.6084/m9.figshare.28740122.v1
Radaković, Jana, and Katarina Batalović. "Domain-specific Word2vec-based trained models - Chem300, Phys300, MatSci200, MatSci300, and Mixed300." figshare (2025).
https://doi.org/10.6084/m9.figshare.28740122.v1

DSpace software copyright © 2002-2015  DuraSpace
About the VinaR Repository | Send Feedback
