Posts

Showing posts from January, 2021

Building a frequency word cloud in R, or what the French government recommends in the time of Corona

A word cloud is a text mining method for visualizing textual data. The result shows the most frequently used words in the text being analysed. The packages we are going to use are the following: tm, a text mining package; SnowballC, a text stemming package; wordcloud, which generates the cloud image; and RColorBrewer, for choosing the colour palettes. You can install them first with the command install.packages(c("tm", "SnowballC", "wordcloud", "RColorBrewer")). This step is not necessary if you have used them before and they are already installed on your computer. To build a frequency word cloud I will use a text I found on the French government's site about COVID-19, containing the official recommendations the government is giving. The URL of this site is https://www.gouvernement.fr/en/coronavirus-covid-19 Because news about the coronavirus changes rapidly and I don't know how long this information will be a...
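The workflow described above can be sketched roughly as follows. This is only a minimal sketch: the three sentences in `text` are placeholder input I made up for illustration, not the wording of the government site, and the cleaning steps shown are one common choice rather than the only possible pipeline. Stemming with SnowballC is left as an optional, commented-out step.

```r
library(tm)            # text mining: corpus handling and cleaning
library(SnowballC)     # stemming backend used by tm's stemDocument()
library(wordcloud)     # draws the cloud
library(RColorBrewer)  # colour palettes

# Placeholder input (an assumption for illustration, not the real site text)
text <- c("Wash your hands regularly.",
          "Cough or sneeze into your elbow.",
          "Wash hands before eating.")

# Build a corpus and clean it step by step
corpus <- VCorpus(VectorSource(text))
corpus <- tm_map(corpus, content_transformer(tolower))       # lower-case everything
corpus <- tm_map(corpus, removePunctuation)                  # drop punctuation
corpus <- tm_map(corpus, removeNumbers)                      # drop digits
corpus <- tm_map(corpus, removeWords, stopwords("english"))  # drop common words
corpus <- tm_map(corpus, stripWhitespace)                    # collapse extra spaces
# corpus <- tm_map(corpus, stemDocument)                     # optional: stem words

# Count how often each remaining word occurs
tdm  <- TermDocumentMatrix(corpus)
freq <- sort(rowSums(as.matrix(tdm)), decreasing = TRUE)

# Draw the cloud; set.seed() makes the layout reproducible
set.seed(1234)
wordcloud(words = names(freq), freq = freq, min.freq = 1,
          random.order = FALSE, colors = brewer.pal(8, "Dark2"))
```

With a real text you would probably raise min.freq so that only words appearing several times end up in the picture.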

Data.table package

data.table is an extremely fast and memory-efficient package for transforming data. Many people use it when working with a big dataset to save time and memory. A data.table also inherits from the data.frame class, which is good news because it means that functions that work with a data.frame also work with a data.table. data.table has SQL-like query syntax. It looks like this: dt[i, j, by], where i = the subset of rows to be extracted based on a condition, j = the calculation to be performed on that subset, and by = the grouping parameter that serves as the basis for aggregation; very often it is a column or a vector. Querying data with data.table: I will again use the mtcars dataset, which ships with base R, to present some queries using data.table. mtcars has 32 observations on 11 (numeric) variables. library(data.table) dt=data.table(mtcars) By using the commands below we will check the class of dt and of its contents: class(dt)...
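The dt[i, j, by] form described above can be sketched with a few queries on mtcars. This is a minimal illustration, not necessarily the queries used in the full post; the result names (six_cyl, avg_mpg, by_cyl, combined) and the chosen columns are my own picks for the example.

```r
library(data.table)

dt <- data.table(mtcars)

# A data.table is also a data.frame
class(dt)

# i: keep only the rows that meet a condition (cars with 6 cylinders)
six_cyl <- dt[cyl == 6]

# j: compute something over the rows (mean miles per gallon of all cars)
avg_mpg <- dt[, mean(mpg)]

# by: repeat the calculation in j for each group (mean mpg per cylinder count)
by_cyl <- dt[, .(mean_mpg = mean(mpg)), by = cyl]

# i, j and by together: among cars with more than 100 hp,
# mean mpg and number of cars for each number of gears
combined <- dt[hp > 100, .(mean_mpg = mean(mpg), n = .N), by = gear]

print(by_cyl)
print(combined)
```

Here .() is data.table's shorthand for list(), and .N is a built-in symbol holding the number of rows in the current group.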