Page 247 - AI Ver 3.0 class 10_Flipbook

\[
\text{Recall} = \frac{\text{True Positive}}{\text{True Positive} + \text{False Negative}}
             = \frac{50}{50 + 12}
             = \frac{50}{62}
             \approx 0.806 \text{ or } 80.6\%
\]
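The same arithmetic can be checked with a few lines of Python. This is only a minimal sketch: the function name `recall` is an illustrative assumption, and the counts 50 (True Positives) and 12 (False Negatives) are the ones used in the worked answer above.

```python
def recall(true_positive: int, false_negative: int) -> float:
    """Recall = True Positive / (True Positive + False Negative)."""
    actual_positives = true_positive + false_negative
    if actual_positives == 0:
        # With no actual positive cases, recall has no meaningful value.
        raise ValueError("recall is undefined when there are no actual positive cases")
    return true_positive / actual_positives


# Counts taken from the worked answer: 50 True Positives, 12 False Negatives.
recall_value = recall(true_positive=50, false_negative=12)
print(f"Recall = {recall_value:.3f} or {recall_value:.1%}")  # Recall = 0.806 or 80.6%
```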
                    2.  A company is developing a model to predict whether a customer will default on a loan. If the dataset is not split
                       properly into training and testing sets, what issues might arise? How would you ensure fair evaluation?
                   Ans. If the dataset is not split properly (for example, if the model is tested on the same records it was trained on), the
                      evaluation becomes over-optimistic. An overfitted model that has memorised the training data appears accurate but fails to
                      generalise to new customers, which leads to poor performance on real-world predictions. To ensure fair evaluation, the
                      dataset should be split into a training set (for learning patterns) and a separate testing set (for evaluating
                      generalisation) that the model never sees during training. Sometimes, a validation set is also used for hyperparameter tuning.
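The split described in this answer could be sketched as follows. This is a minimal illustration using only the Python standard library, not a prescribed method: the function name `train_test_split`, the 80/20 ratio, the seed, and the made-up customer records are all assumptions chosen for the example.

```python
import random


def train_test_split(records, test_fraction=0.2, seed=42):
    """Shuffle a copy of the records and split it into disjoint train and test lists."""
    if not 0 < test_fraction < 1:
        raise ValueError("test_fraction must be between 0 and 1")
    shuffled = list(records)               # copy, so the caller's data is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split repeatable
    test_size = max(1, round(len(shuffled) * test_fraction))
    return shuffled[test_size:], shuffled[:test_size]


# Hypothetical loan records: (customer_id, defaulted)
customers = [(customer_id, customer_id % 5 == 0) for customer_id in range(100)]
train_set, test_set = train_test_split(customers)

print(len(train_set), len(test_set))   # 80 20
print(set(train_set) & set(test_set))  # set(), no record appears in both sets
```

Because the two lists never share a record, the score measured on `test_set` reflects how the model handles customers it has not seen before.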
                    3.  Identify which metric (Precision or Recall) is to be used in the following cases and why?
                       a.  Email Spam Detection
                       b.  Cancer Diagnosis
                       c.  Legal Cases (Innocent until proven guilty)
                       d.  Fraud Detection
                       e.  Safe Content Filtering (like Kids YouTube)
                   Ans. a.   Precision: In spam detection, you generally want to minimise the number of legitimate emails incorrectly marked
                         as spam (False Positives). Precision is important here because it focuses on the accuracy of the positive predictions,
                         ensuring that when the model predicts an email is spam, it truly is spam.
                      b.   Recall: In cancer diagnosis, it's crucial to identify all the true cancer cases, even if it means some healthy people are
                         wrongly identified as having cancer (False Positives). Recall is more important here because it measures how many of
                         the actual positive cases (patients with cancer) the model finds, and a missed case (False Negative) is far more costly
                         than a false alarm.
                      c.   Precision: In a legal setting, particularly with the principle of "innocent until proven guilty," you want to minimise the
                         number of innocent people wrongly convicted (False Positives). Precision is important here because it ensures that
                         when a defendant is predicted to be guilty, the defendant is actually guilty.
                      d.   Recall: In fraud detection, it's important to catch all instances of fraud, even at the risk of flagging some legitimate
                         transactions as fraudulent. Recall is prioritized because missing out on fraudulent transactions (False Negatives)
                         can have significant consequences, whereas false alarms (False Positives) can usually be dealt with through further
                         verification.
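                      e.   Precision: For safe content filtering, treat "safe" as the positive class. You want to ensure that everything the filter
                         approves as safe really is safe, because unsafe content wrongly approved (a False Positive) is the most costly error.
                         Precision is key here because it focuses on making sure that when content is marked as safe, it truly is safe, even if
                         some harmless content is blocked as a result.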
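The trade-off behind these answers can be seen by computing both metrics for two different sets of predictions. The sketch below is illustrative only: the helper `precision_recall` and the small label lists are assumptions made up for this example, with 1 standing for the positive class and 0 for the negative class.

```python
def precision_recall(actual, predicted):
    """Return (precision, recall) for two equal-length lists of 0/1 labels."""
    pairs = list(zip(actual, predicted))
    true_positive = sum(1 for a, p in pairs if a == 1 and p == 1)
    false_positive = sum(1 for a, p in pairs if a == 0 and p == 1)
    false_negative = sum(1 for a, p in pairs if a == 1 and p == 0)
    predicted_positive = true_positive + false_positive
    actual_positive = true_positive + false_negative
    precision = true_positive / predicted_positive if predicted_positive else 0.0
    recall = true_positive / actual_positive if actual_positive else 0.0
    return precision, recall


# Hypothetical data: 4 actual positives followed by 6 actual negatives.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]

# A cautious model flags only the cases it is sure about (suits spam or legal cases).
cautious = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
# An eager model flags anything suspicious (suits cancer or fraud detection).
eager = [1, 1, 1, 1, 1, 1, 0, 0, 0, 0]

for name, predicted in (("cautious", cautious), ("eager", eager)):
    precision, recall = precision_recall(actual, predicted)
    print(f"{name}: precision = {precision:.2f}, recall = {recall:.2f}")
# cautious: precision = 1.00, recall = 0.50
# eager: precision = 0.67, recall = 1.00
```

The cautious model never raises a false alarm but misses half of the positive cases, while the eager model catches every positive case at the cost of two false alarms.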
                 Assertion and Reasoning questions.
                 Direction: Questions 4-5 consist of two statements – Assertion (A) and Reasoning (R). Answer these questions by selecting
                 the appropriate option given below:
                 a.  Both A and R are correct and R is the correct explanation of A.
                 b.  Both A and R are correct but R is NOT the correct explanation of A.
                 c.  A is correct but R is incorrect.
                 d.  A is incorrect but R is correct.



