Page 310 - AI Ver 3.0 Class 11

•  Statistical analysis is the process of collecting, exploring, and presenting huge volumes of data in order to
                     identify patterns and trends.
                    •  The term “central tendency” refers to a single number that summarises the complete distribution of a data
                     domain (or data set).
                    •  The mean, the most commonly used measure of central tendency, is the average value of a collection of data
                     points.

                    •  The median is the middle value of a set of data, obtained by ranking all of the data points and selecting the
                     one in the centre (or the average of the two central values when the number of data points is even).
                    •  The mode is the most frequently occurring value; it marks the peak of the distribution, and a data set can
                     have more than one mode.

                    •  Data points with a low standard deviation lie close to the mean, whereas those with a high standard deviation
                     are spread over a wide range of values.
                    •  Data representation is defined as a technique for presenting an enormous amount of data in a way that allows
                     the user to quickly and easily interpret the most relevant information.
                    •  There are two broad categories of data representation techniques - Non-Graphical Technique and Graphical
                     Technique.
                    •  Data visualisation in Python can be accomplished using the Matplotlib library.
                    •  The ‘pyplot’ submodule of Matplotlib offers a MATLAB-like interface and includes numerous convenience
                     functions that simplify the process of creating basic plots.
                    •  Data preprocessing is an essential phase in the machine learning process that prepares datasets for effective
                     machine learning applications. It includes multiple processes to clean, transform, reduce, integrate, and
                     normalise data.
                    •  Once data preprocessing is complete, the dataset is divided into two sets: the Training dataset and the
                     Testing dataset.
                    •  The Training dataset is utilised to teach machine learning models, while the Testing dataset assesses how well
                     the trained models perform.
                    •  In today's era, having proficiency in handling data is crucial.
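
The measures of central tendency and spread summarised above can be tried out with Python's built-in statistics module. This is a minimal sketch: the marks list is made-up data used only for illustration, and the population standard deviation (pstdev) is an assumed choice, since the summary does not say whether the population or the sample formula is meant.

```python
import statistics

# Hypothetical marks of nine students (illustrative data, not from the text)
marks = [45, 50, 50, 60, 65, 70, 70, 85, 90]

mean_value = statistics.mean(marks)       # sum of the values divided by their count
median_value = statistics.median(marks)   # middle value once the data is ranked
modes = statistics.multimode(marks)       # every value that occurs most often
std_dev = statistics.pstdev(marks)        # population standard deviation (assumed choice)

print("Mean:", mean_value)
print("Median:", median_value)
print("Mode(s):", modes)
print("Standard deviation:", round(std_dev, 2))
```

Here two values (50 and 70) occur equally often, so multimode returns both. This matches the point that a data set can have more than one mode.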

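As an illustration of graphical data representation with the ‘pyplot’ submodule, the sketch below draws a simple line plot. It assumes Matplotlib is installed. The data, the labels, and the output file name marks_plot.png are hypothetical. The non-interactive "Agg" backend is chosen only so that the script also runs without a display; in an interactive session, plt.show() could be used instead of plt.savefig().

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; lets the script run without a display
import matplotlib.pyplot as plt

# Hypothetical data: marks scored in five tests (illustrative only)
tests = [1, 2, 3, 4, 5]
marks = [62, 70, 68, 75, 81]

plt.plot(tests, marks, marker="o")    # line plot with a marker at each data point
plt.xlabel("Test number")             # label for the horizontal axis
plt.ylabel("Marks")                   # label for the vertical axis
plt.title("Marks across five tests")  # title shown above the plot
plt.savefig("marks_plot.png")         # assumed file name for the saved image
```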

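The preprocessing and splitting steps described in the summary can be sketched with the standard library alone. Everything specific here is an assumption made for illustration: the records, the helper name min_max_normalise, the use of min-max scaling as the normalisation method, the random seed, and the 80/20 split ratio (the summary does not fix any of these).

```python
import random

# Hypothetical dataset of (hours studied, marks) pairs, for illustration only
records = [(1, 35), (2, 40), (3, 50), (4, 55), (5, 62),
           (6, 68), (7, 74), (8, 80), (9, 88), (10, 95)]

def min_max_normalise(values):
    """Rescale a list of numbers to the range 0 to 1 (one common way to normalise)."""
    low, high = min(values), max(values)
    return [(v - low) / (high - low) for v in values]

# Preprocessing step: normalise each column separately
hours = min_max_normalise([h for h, _ in records])
marks = min_max_normalise([m for _, m in records])
dataset = list(zip(hours, marks))

# Shuffle so that both sets contain a mix of records (seed fixed for repeatability)
random.seed(42)
random.shuffle(dataset)

# Assumed 80/20 split: the first part trains the model, the rest tests it
split_index = len(dataset) * 80 // 100
training_set = dataset[:split_index]
testing_set = dataset[split_index:]

print("Training records:", len(training_set))
print("Testing records:", len(testing_set))
```

Libraries such as scikit-learn provide ready-made helpers for the same steps; the hand-written version is used here only to keep each step visible.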


                                                           Exercise




                                                       Solved Questions


                                                SECTION A (Objective Type Questions)
              A.   Tick (✓) the correct option.
                   1.   Data can be described as a representation of ________.
                        a.  Random information                         b.  Facts or instructions about entities
                        c.  Irrelevant details                         d.  Only numerical values

                   2.   Which of the following is NOT a primary data source?
                        a.  Surveys                                    b.  Interviews
                        c.  Observations                               d.  Published reports




                    308     Touchpad Artificial Intelligence (Ver. 3.0)-XI