Changes in v0.5.1
- Add normalize to textmodel_doc2vec() and
pass it to as.matrix().
- Add weights to textmodel_doc2vec() to
adjust the salience of words in the document vectors.
- Add include_data to textmodel_word2vec() to save the original tokens object.
Changes in v0.5.0
- Add the model argument to textmodel_word2vec() to update existing models.
- The normalize argument is moved from textmodel_word2vec() to as.matrix(). The
original argument is deprecated and set to FALSE by
default.
- Remove weights().
- Improve the structure of C++ code.
Changes in v0.4.0
- Add the tolower argument and set it to TRUE to lower-case tokens.
- Allow x to be quanteda’s tokens_xptr object to enhance
efficiency.
Changes in v0.3.0
- Save docvars in the textmodel_doc2vec objects.
- Set zero for empty documents in the textmodel_doc2vec objects.
- Add probability() to compute probability of words.
Changes in v0.2.0
- Rename word2vec(), doc2vec() and lsa() to textmodel_word2vec(), textmodel_doc2vec() and textmodel_lsa() respectively.
- Simplify the C++ code to make maintenance easier.
- Add normalize to word2vec to disable or
enable word vector normalization.
- Add weights() to extract back-propagation weights.
- Make analogy() convert a formula to a named character
vector.
- Improve the stability of word2vec() when verbose = TRUE.
Changes in v0.1.0
- Fork https://github.com/bnosac/word2vec and change the package name
to wordvector.
- Replace a list of character vectors with quanteda’s tokens
object as an input object.
- Recreate word2vec() with new argument names and object
structures.
- Create lsa() to train word vectors using Latent
Semantic Analysis.
- Add similarity() and analogy() functions
using proxyC.
- Add data_corpus_news2014 that contains 20,000 news
summaries as package data.