Page 279 - Data Science class 11

Recap


               •  Data visualisation is an essential aspect of data science. Data can be visualised as scatter plots, box plots, bar
                  charts, histograms, pie charts, etc. These can also be plotted with the ggplot2 package, a plotting package that
                  simplifies the creation of complex plots from data in a data frame.
               •  R provides several built-in data sets that can be used to study the statistical data required for various plot types.
               •  A pie chart is a circular statistical graphic that is divided into slices to illustrate numerical proportions.
               •  A bar chart is a pictorial representation of data that presents categorical data with rectangular bars whose heights or
                  lengths are proportional to the values that they represent.

               •  The stacked bar chart (also called a stacked bar graph) extends the standard bar chart from looking at numeric values
                  across one categorical variable to two.
               •  Line charts are usually used to identify trends in data. A line chart displays information as a series of data points
                  connected by straight line segments.
               •  A histogram is a graphical representation that organises a group of data points into user-specified ranges. The
                  histogram, which resembles a bar graph in appearance, condenses a data series into an easily interpreted visual by
                  grouping many data points into logical ranges or bins.
               •  Scatter plots are dispersion graphs that represent the data points of variables (generally two, but sometimes three).
                  The main use of a scatter plot in R is to check visually whether there is a relationship between numeric variables.
               •  A boxplot shows how the values in a data set are distributed. It marks three quartiles (the first quartile, the
                  median, and the third quartile), which divide the data set into four parts. The graph represents the minimum,
                  maximum, median, first quartile, and third quartile of the data set. Boxplots are also useful for comparing the
                  distribution of data across data sets by drawing one boxplot for each of them.
               •  The most common basic statistical terms are the mean, median, and mode. These are known as "measures of central
                  tendency". The most common distribution in statistical research is the normal distribution, sometimes called a
                  bell curve.
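
The points in this recap can be tried out with a short R script. The following is only a minimal sketch, not the chapter's own program. It assumes base R graphics rather than ggplot2, uses the built-in data sets mtcars and pressure, and sends the plots to a null device so that it also runs outside RStudio. The helper stat_mode is a hypothetical name introduced here, because base R has no function for the statistical mode.

```r
# Minimal sketch using only base R and built-in data sets (an assumption;
# the same charts can also be drawn with ggplot2).
data(mtcars)
data(pressure)

# Send plots to a null device so the script also runs non-interactively.
pdf(NULL)

mpg <- mtcars$mpg                 # miles per gallon (numeric)
cyl_counts <- table(mtcars$cyl)   # number of cars per cylinder count

pie(cyl_counts, main = "Cars by cylinders")        # pie chart
barplot(cyl_counts, main = "Cars by cylinders")    # bar chart

# Stacked bar chart: counts across two categorical variables (gear and cyl).
barplot(table(mtcars$gear, mtcars$cyl), legend.text = TRUE,
        main = "Gears within cylinders")

# Line chart: points joined by line segments to show a trend.
plot(pressure$temperature, pressure$pressure, type = "l",
     main = "Vapour pressure vs temperature")

hist_info <- hist(mpg, breaks = 5, main = "mpg")   # histogram (values grouped into bins)
plot(mtcars$wt, mpg, main = "Weight vs mpg")       # scatter plot of two numeric variables
box_info <- boxplot(mpg ~ cyl, data = mtcars)      # one boxplot per cylinder group

invisible(dev.off())

# Measures of central tendency.
# stat_mode is a hypothetical helper: base R's mode() reports the storage
# mode of an object, not the most frequent value.
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[counts == max(counts)])
}

mpg_mean   <- mean(mpg)
mpg_median <- median(mpg)
cyl_mode   <- stat_mode(mtcars$cyl)

# Five-number summary used by a boxplot:
# minimum, lower hinge (about Q1), median, upper hinge (about Q3), maximum.
five <- fivenum(mpg)
print(five)
```

In RStudio, removing the pdf(NULL) and dev.off() lines makes each chart appear in the Plots pane instead.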






                                                     Solved Exercise



                 Objective Type Questions (Section A)


            A.  Tick (✓) the correct option.
               1.  Select the most suitable answer from the following:
                  a.  ggplot2 is an R package used for statistical computing

                  b.  ggplot2 is dedicated to data visualisation
                  c.  ggplot2 is a plotting package that simplifies the creation of complex plots from data in a data frame

                  d.  All of these
               2.  Which of the following is not true?
                  a.  A pie chart is a circular statistical graphic, which is divided into slices.

                  b.  Pie charts are not recommended in the R documentation, and their features are somewhat limited.
                  c.  R uses the function circular() to create pie charts.
                  d.  All of these




                                                           Coding for Data Science Visualisation using R-Studio  277