Page 19 - Ai Robogenius

Some disadvantages of primary data are as follows:

                 Often time-consuming and expensive.
                 Requires careful planning and design to avoid biases.

                 Secondary source of data

                 Secondary data is data that has already been collected, processed and published by other
                 individuals or organisations. Researchers use this data for new analysis or to supplement primary
                 data.
                 Examples of secondary data sources include:

                 Books and academic journals: Published research studies and scholarly articles.

                 Government reports: Census data, economic statistics, health records and official publications.
                 Websites and online databases: Public datasets, industry reports, open data portals like Kaggle
                 or WHO.

                 Newspapers and magazines: Articles and reports on current events and trends.

                 Some online sources of open datasets:

                 •   Kaggle: an online community of data scientists where you can access different
                     types of data.
                 •   Government open data portals (.gov datasets): countries and regions such as
                     Australia, the EU, India, New Zealand and Singapore openly share datasets on
                     various portals.
                 •   Google Dataset Search: a tool by Google that can search for datasets by name.
                 •   UCI Machine Learning Repository: a collection of databases, domain theories and
                     data generators, in collaboration with the University of Massachusetts.



                 Some advantages of secondary source data are as follows:

                 Easily accessible and often free or low-cost.

                 Saves time as data collection is already done.
                 Useful for historical or trend analysis.

                 Some disadvantages of secondary source data are as follows:

                 May not perfectly fit the current research question.
                 Possible issues with accuracy or reliability, or the data may be out of date.

                 Lack of control over data quality and collection methods.
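
                 The disadvantages above suggest checking a secondary dataset before relying on it. The
                 following is a minimal sketch, not a definitive method. The sample table, its column
                 names, the function name quality_report and the cut-off year are all hypothetical and
                 made up for illustration. It checks whether the needed columns exist (fit to the
                 research question), counts rows with blank values (reliability) and counts rows older
                 than a chosen year (outdatedness).

```python
import csv
import io

# Hypothetical sample standing in for a downloaded secondary dataset.
# All values are made up for illustration only.
SAMPLE_CSV = """state,year,literacy_rate
A,2011,74.0
B,2011,
C,2001,61.5
D,2011,82.3
"""


def quality_report(csv_text, required_columns, min_year):
    """Return basic fitness checks for a secondary dataset given as CSV text."""
    reader = csv.DictReader(io.StringIO(csv_text))
    rows = list(reader)
    header = reader.fieldnames or []

    # Fit: columns the research question needs but the dataset lacks.
    missing_columns = [c for c in required_columns if c not in header]

    # Reliability: rows in which at least one field is blank.
    incomplete_rows = sum(
        1 for r in rows if any((v or "").strip() == "" for v in r.values())
    )

    # Outdatedness: rows whose 'year' value is earlier than min_year.
    outdated_rows = sum(
        1
        for r in rows
        if (r.get("year") or "").strip().isdigit() and int(r["year"]) < min_year
    )

    return {
        "rows": len(rows),
        "missing_columns": missing_columns,
        "incomplete_rows": incomplete_rows,
        "outdated_rows": outdated_rows,
    }


report = quality_report(
    SAMPLE_CSV,
    required_columns=["state", "year", "literacy_rate", "district"],
    min_year=2010,
)
print(report)
```

                 In this made-up sample the report flags one missing column, one incomplete row and one
                 outdated row. With real secondary data, the thresholds and required columns would depend
                 on the research question.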

                 A list of government websites for data collection:








                                                                                 Stages of AI Project Cycle