sentometrics 1.0.1
- Fixed NOTEs about Rd files with link targets missing package anchors
as per CRAN’s request.
sentometrics 1.0.0
- Version bump associated with publication of vignette in the Journal
of Statistical Software.
sentometrics 0.8.4
- Alignment with released quanteda v3.0.
sentometrics 0.8.3
- Features (or docvars) with names "id", "sentence_id", "date",
"word_count" or "texts" will not be accepted even when numeric, to avoid
duplicate column names down the line. A clear error message is issued to
alert users.
- Replacement of order() calls on data.frames where needed to avoid CRAN
complaints.
- Some small documentation fixes.
sentometrics 0.8.2
- Some documentation fixes.
- Release of a pkgdown website.
- Fixed bug in sento_corpus() function that did not
always order input correctly by date.
- Fixed two minor bugs in summary.sento_measures(); the
first one prevented printing of document-level weighting schemes, the
second one did not remove NAs when averaging over
correlations.
- Small bug fix in yearly aggregation (it did not account for the fact
that 1970-01-01 is considered day zero).
- Dropped horizontal 0-line automatically added in the
plot.sento_measures() function as it distorts graphs of
time series with values far away from zero.
- Stopped exporting all defunct functions to clean up namespace.
- The function print.sento_corpus() now shows when corpus
is multi-lingual.
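
The yearly aggregation fix above hinges on how R represents dates. As a
minimal base R illustration (not the package's internal code, which this
changelog does not show), a Date is stored as the number of days since
1970-01-01, so that date itself maps to zero:

```r
# Base R stores a Date as the number of days elapsed since 1970-01-01.
epoch <- as.Date("1970-01-01")

# Day numbers around the epoch: the values are -1, 0 and 1.
day_numbers <- as.numeric(as.Date(c("1969-12-31", "1970-01-01", "1970-01-02")))
print(day_numbers)

# Converting day numbers back to dates requires the same origin.
restored <- as.Date(day_numbers, origin = epoch)
print(restored)
```

Any date arithmetic that treats the first observation as day one instead
of day zero is shifted by one day, which is the kind of edge case an
aggregation by year has to account for.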
sentometrics 0.8.1
- Alignment with released quanteda v2.0.
sentometrics 0.8.0
- New function: print.sento_corpus().
- Package update followed by release of a substantial update of the
vignette (see https://doi.org/10.2139/ssrn.3067734).
- Changed some warning() calls to message() calls to be kinder to the user.
- Altered internal code to comply with the corpus object
from quanteda >= v2.0.
- Dropped all "TF"-inspired weights for within-document
aggregation except for "TFIDF", and made this option return
the same sentiment scores as when using the quanteda package (see the example on
https://sentometrics-research.com/sentometrics/articles/examples/sentiment.html).
sentometrics 0.7.6
- Fixed memory allocation issue in the
compute_sentiment() function.
sentometrics 0.7.5
- New functions: as.data.table.sento_corpus(), as.data.frame.sento_corpus(), and as.data.frame.sento_measures().
- Embedded a small workaround in plot.attributions() to
guarantee the same plotting behaviour after an update of the ggplot2 package that gave buggy output for
the geom_area() layer.
- Integrated, for overall consistency, measures_global() into the aggregate.sento_measures() function, adding a do.global argument to enact it.
- Slightly changed the clusters-based sentence-level sentiment
computation (different weighting of adversative conjunctions).
- Clarified the documentation for the peakdates() and peakdocs() functions.
- Put the Shiny application made available in previous package update
(i.e., the sento_app() function) in a separate sole-purpose
package sentometrics.app (see
https://github.com/sborms/sentometrics.app).
- Moved the data.table package from
Depends to Imports (see
https://github.com/Rdatatable/data.table/issues/3076).
- No change by reference of input sentiment objects in the
merge.sentiment() function anymore, and modified the
merging to give, for instance, a simple column binding of sentiment methods
when all else is equal.
- Correct pass-through of default how argument in the compute_sentiment() function.
- Added a few adversative conjunctions to all word lists in
list_valence_shifters.
- Added a do.normalize option to the weights_beta() and weights_exponential() functions.
- Added a do.inverse option to the weights_exponential() function and associated do.inverseExp argument in the ctr_agg() function.
- Modified some names of options for within-document or
within-sentence aggregation (i.e., across tokens):
"squareRootCounts" into "proportionalSquareRoot", "invertedExponential" into "inverseExponential", and "invertedUShaped" into "inverseUShaped".
- Corrected the numerator (number of documents or sentences instead of
token frequency) in all weighting schemes involving the inverse document
frequency (IDF).
- Aligned all formulas concerning the exponential weighting
curves.
- The compute_sentiment() function now also can do a
sentence-level calculation using the bigrams valence shifting
approach.
- Fixed a small bug that prevented using different valence
shifter lists for a multi-language sentiment calculation.
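
The IDF correction listed above can be illustrated with the textbook
definition, in which the numerator is the number of documents rather than
a token frequency. The sketch below is plain base R on a made-up toy
corpus; the natural logarithm and the absence of smoothing are
assumptions, and the exact formula used inside the package may differ:

```r
# Hypothetical toy corpus: each element is one tokenized document.
docs <- list(
  c("good", "news", "good"),
  c("bad", "news"),
  c("good", "outlook")
)

# Numerator of the IDF: the number of documents (not a token count).
n_docs <- length(docs)

# Document frequency: number of documents containing each term at least once.
terms <- unique(unlist(docs))
doc_freq <- sapply(terms, function(term) {
  sum(sapply(docs, function(d) term %in% d))
})

# Textbook IDF, assuming natural log and no smoothing.
idf <- log(n_docs / doc_freq)
print(idf)
```

Note that "good" occurs three times as a token but only in two documents,
so its document frequency is 2 and its IDF is log(3 / 2).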
sentometrics 0.7.0
- New functions: measures_update(), subset.sento_measures(), as.sentiment(), as.sento_measures(), as.data.table.sentiment(), corpus_summarize(), sento_app(), and aggregate.sento_measures().
- Made defunct all deprecated functions as well as the functions replaced
by the new functions (wiping the slate clean…).
- Handled reverse dependency issue raised by
quanteda developers regarding their new
corpus object.
- Renamed the class objects coming from any sento_xyz() function into the name of the function (e.g., the sento_measures() function now gives a sento_measures object instead of a sentomeasures object).
- Fixed a small bug in the aggregate.sento_measures() (previously measures_merge()) function to take the mean
instead of the sum in a particular case.
- Added many more within- and across-document weighting schemes (see
the get_hows() function for an overview).
- Added the flexibility to do an explicit sentence-by-sentence
sentiment computation (see do.sentence argument in the compute_sentiment() function).
- Added the flexibility to create a multi-language
sento_corpus object to do a multi-language sentiment
computation (applying different lexicons to texts written in different
languages).
- Expanded the compute_sentiment() function to also take tm SimpleCorpus and VCorpus objects.
- Added the tm and NLP packages to Suggests.
sentometrics 0.5.6
- New function: peakdates().
- Modified the purpose of the peakdocs() function and
added a peakdates() function to properly handle the entire
functionality of extracting peaks.
- A series of documentation fixes.
sentometrics 0.5.5
- New functions: sentiment_bind() and to_sentiment().
- Defined replacement (of lexicons and names) for a
sentolexicons object.
- Properly handled lag = 1 in the ctr_agg() function, and set weights to 1 by default for n = 1 in the weights_beta() function.
- Solved single failing test for older R version (3.4.4).
- Removed the abind package from
Imports.
- Removed the zoo package from Imports,
by replacing the single occurrence of the zoo::na.locf() function by the fill_NAs() helper function (written in Rcpp).
- Extended the quanteda::docvars() replacement method to
a sentocorpus object.
- Modified information criterion estimators for edge cases to avoid
them turning negative.
- Dropped the "x" output element from a sentomodel object (for large samples, this became too
memory consuming).
- Dropped the "howWithin" output element from a sentomeasures object, and simplified a sentiment object into a data.table directly
instead of a list.
- Expanded the do.shrinkage.x argument in the ctr_model() function to a vector argument.
- Added a do.lags argument to the attributions() function, to be able to circumvent the most
time-consuming part of the computation.
- Imposed a check in the sento_measures() function on the
uniqueness of the names within and across the lexicons, features and
time weighting schemes.
- Solved a bug in the measures_merge() function that made
full merging not possible.
- The n argument in the peakdocs() function
can now also be specified as a quantile.
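
For readers unfamiliar with the operation behind the zoo::na.locf()
replacement mentioned above, here is a minimal base R sketch of "last
observation carried forward". The helper name locf_sketch() is
hypothetical, and this is not the package's fill_NAs() helper (which is
written in Rcpp and whose exact behaviour, for instance for leading NAs,
is not described in this changelog):

```r
# Hypothetical helper: carry the last non-missing value forward.
# Leading NAs are kept as NA because there is nothing to carry forward yet.
locf_sketch <- function(x) {
  # For each position, count how many non-NA values were seen so far.
  idx <- cumsum(!is.na(x))
  # Prepend NA so that a count of zero maps to NA.
  c(NA, x[!is.na(x)])[idx + 1]
}

filled <- locf_sketch(c(NA, 1, NA, NA, 3, NA))
print(filled)  # NA 1 1 1 3 3
```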
sentometrics 0.5.1
- Minor modifications to resolve a few CRAN issues.
- Set default value of nCore argument in the compute_sentiment() and ctr_agg() functions to
1.
- Classed the output of the
compute_sentiment.sentocorpus() function as a sentiment object, and modified the aggregate() function to aggregate.sentiment().
sentometrics 0.5.0
- New functions: weights_beta(), get_dates(), get_dimensions(), get_measures(), and get_loss_data().
- Renamed the following functions: to_global() to measures_global(), perform_agg() to aggregate(), almons() to weights_almon(), exponentials() to weights_exponential(), setup_lexicons() to sento_lexicons(), retrieve_attributions() to attributions(), plot_attributions() to plot.attributions().
- Made the ctr_merge() function defunct, so that all merge
parameters have to be passed on directly to the measures_merge() function.
- Expanded the use of the center and scale arguments in the scale() function.
- Added the dateBefore and dateAfter arguments to the measures_fill() function, and dropped NA option of its fill argument.
- Added a "beta" time aggregation option (see associated weights_beta() function).
- Corrected update of "attribWeights" element of output sentomeasures object in required measures_xyz() functions.
- Added a new attribution dimension ("lags") to the attributions() function, and corrected some edge
cases.
- Made a slight correction to the information criterion
estimators.
- Added a lambdas argument to the ctr_model() function, directly passed on to the glmnet::glmnet() function if used.
- Omitted do.combine argument in measures_delete() and measures_select() functions to simplify.
- Expanded set of unit tests, included a coverage badge, and added
covr to Suggests.
- Reimplementation (and improved documentation) of the sentiment
calculation in the compute_sentiment() function, by writing
part of the code in Rcpp relying on RcppParallel (added to Imports); there are
now three approaches to computing sentiment (unigrams, bigrams and
clusters).
- Replaced the dfm argument in the compute_sentiment() and ctr_agg() functions by
a tokens argument, and altered the input and behaviour of
the nCore argument in these same two functions.
- Switched from the quanteda package to
the stringi package for more direct
tokenization.
- Trimmed the list_lexicons and list_valence_shifters built-in word lists by keeping only
unigrams, and included same trimming procedure in the sento_lexicons() function.
- Added a column type "t" to the list_valence_shifters built-in word list, and reset values
of the "y" column from 2 to 1.8 and from 0.5 to 0.2.
- Updated the epu built-in dataset with the newest
available series, up to July 2018.
- Corrected the word ‘sparesly’ to ‘sparsely’ in
list_valence_shifters[["en"]].
- Further shortened project page to the bare essence.
- Omitted statement printed (‘Compute sentiment… Done.’) in the
compute_sentiment() function.
- Slightly modified print() generic for a sentomeasures object.
- Dropped the "tf-idf" option for within-document
aggregation in the ctr_agg() function.
- The sento_lexicons() function outputs a sentolexicons object, which the compute_sentiment() function specifically requires as an
input; a sentolexicons object also includes a "[" class-preserving extractor function.
- The attributions() function outputs an attributions object; the plot_attributions() function is therefore replaced by the plot() generic.
- Made the perform_MCS() function defunct, but the output of
the get_loss_data() function can easily be used as an input
to the MCSprocedure() function from the MCS package (discarded from Imports).
- Moved the parallel and doParallel packages to Suggests, as only
needed (if enacted) in the sento_model() function.
- Slightly modified appearance of plotting functions, to drop
ggthemes from Imports.
sentometrics 0.4.0
- New functions: measures_delete(), nmeasures(), nobs(), and to_sentocorpus().
- Renamed the following functions: any xyz_measures() to measures_xyz(), extract_peakdocs() to peakdocs().
- Dropped do.normalizeAlm argument in the ctr_agg() function (but kept in the almons() function).
- Inverted order of rows in output of the almons() function to be consistent with Ardia et al. (IJF, 2019) paper.
- Renamed lexicons to list_lexicons, and valence to list_valence_shifters.
- The stats element of a sentomeasures object is now also updated in measures_fill().
- Changed "_eng" to "_en" in list_lexicons and list_valence_shifters objects, to be in accordance with two-letter ISO language naming.
- Changed "valence_language" naming to "language" in list_valence_shifters object.
- The compute_sentiment() function now also accepts a quanteda corpus object and a character vector.
- The add_features() function now also accepts a quanteda corpus object.
- Added an nCore argument to the compute_sentiment(), ctr_agg(), and ctr_model() functions to allow for (more straightforward)
parallelized computations, and omitted the do.parallel argument in the ctr_model() function.
- Added a do.difference argument to the ctr_model() function and expanded the use of the already
existing oos argument.
- Brought ggplot2 and foreach to Imports.
sentometrics 0.3.5
- Faster to_global().
- Set tolower = FALSE of quanteda::dfm() constructor in compute_sentiment().
- Changed intercept argument in ctr_model() to do.intercept for consistency.
- Proper checks on values of feature columns in
sento_corpus() and add_features().
sentometrics 0.3.0
- New functions: diff(), extract_peakdocs(),
and subset_measures().
- Modified R Depends from 3.4.2 to 3.3.0, and omitted import of
sentimentr.
- Word count per document now determined based on a separate
tokenization.
- Improved valence shifters search (modified
incluce_valence() helper function).
- New option added for within-document aggregation
("proportionalPol").
- Now correct pass-through of dfm argument in ctr_agg().
- Simplified select_measures(), but toSelect argument expanded.
- Calculation in to_global() changed (see vignette).
- Improved add_features(): regex and non-binary (between
0 and 1) allowed.
- All texts and lexicons now automatically converted to lowercase for sentiment
calculation.
- (Re)translation of built-in lexicons and valence word lists.
- Small documentation clarifications and fixes.
- New vignette and run_vignette.R script.
- Shortened project page (no code example anymore).
sentometrics 0.2.0
sentometrics 0.1.0
- Google Summer of Code 2017 “release” (unstable).