Page 254 - Artificial Intellegence_v2.0_Class_11
Microsoft Excel spreadsheets are a common tool for manual data exploration. They can be used to view raw data, build
simple charts, and examine the association between variables.
Case, Variables, and Levels of Measurement
A case is an experimental unit: the individual or object from which data is collected. When data is collected from
people, we sometimes call them participants or subjects.
A variable is a characteristic that is measured and can take different values. In other words, it is something that varies
from case to case.
For example, a teacher wants to know if students of the sixth class who spend more time reading at home get better
grades on assignments and tests. In this situation, the students are the cases. There are three variables: time spent
reading at home, homework grades, and test scores.
We know that a dataset contains information about a sample. Hence, a dataset is made up of cases, and each case is one of
the objects being studied. It must now also be clear that a variable is a characteristic that is measured and whose value
can change, whether from case to case in a study or while a program runs. On the other hand, a constant is a characteristic
that is the same for all cases in a study. In the above example, the sixth class is a constant, as all the students are in
the sixth class.
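The teacher's example can be sketched in Python to make the terms concrete. This is only an illustration: the field names (class, reading_minutes, homework_grade, test_score), the helper classify_characteristics, and all the numbers are made up for this sketch and are not real data.

```python
# Hypothetical dataset: each dictionary is one case (one student).
# All values below are invented for illustration.
students = [
    {"class": 6, "reading_minutes": 30, "homework_grade": 72, "test_score": 68},
    {"class": 6, "reading_minutes": 45, "homework_grade": 80, "test_score": 77},
    {"class": 6, "reading_minutes": 60, "homework_grade": 91, "test_score": 85},
]


def classify_characteristics(cases):
    """Split characteristics into variables (values differ between cases)
    and constants (the same value for every case)."""
    variables, constants = [], []
    for name in cases[0]:
        distinct_values = {case[name] for case in cases}
        if len(distinct_values) > 1:
            variables.append(name)
        else:
            constants.append(name)
    return variables, constants


variables, constants = classify_characteristics(students)
print("Number of cases:", len(students))
print("Variables:", variables)
print("Constants:", constants)
```

With this made-up data, the three measured characteristics come out as variables, while class comes out as a constant because every student is in the sixth class.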
Let’s take another example. Suppose x is the distance travelled. In our program we write x = 15, which means that x is a
variable that stores the value 15. When we then write x = x + 10, the variable’s name is still x, but its value has changed
to 25 because the constant 10 was added to it.
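The same two statements can be tried out directly in Python. This is a minimal sketch of the example in the text; the comments are added here for explanation.

```python
x = 15       # x is a variable that stores the distance travelled
x = x + 10   # add the constant 10; the name is still x
print(x)     # the value stored in x is now 25
```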
Levels of Measurement
The level of measurement refers to the relationship between the values given to the attributes of a variable. What does
that mean? Let’s start with a variable, which denotes "leader affiliation".
Leader Affiliation
Nelson Mandela     Abraham Lincoln     Mahatma Gandhi
1                  2                   3
This variable has multiple attributes. Suppose that in this particular context the only relevant attributes are "Nelson
Mandela", "Abraham Lincoln" and "Mahatma Gandhi". To analyse the results for this variable, we arbitrarily assign
the values 1, 2, and 3 to the three attributes. The level of measurement describes the relationship between these three
values. In this case, we only use the numbers as symbols for the longer text elements. We do not assume that higher numbers
mean "more" of something and lower numbers mean "less". We do not assume that the value 2 means that Abraham Lincoln
was twice the leader that Nelson Mandela was. Nor do we assume that Nelson Mandela comes first or has the highest priority
just because he has the value 1. In this case, we are just using the values as a shorter ‘name’ for each attribute. Here we
would call the level of measurement "nominal".
Knowing the level of measurement helps in two ways. First, it helps you decide how to interpret the data for a variable. If
you know that a measure is nominal (like the one just described), then you understand that the numeric values are just
symbols for the longer names. Secondly, it helps you decide which statistical analysis is appropriate for the assigned
values. When a measure is nominal, you know that calculating the average of the data values is not meaningful; you can
only count how often each value occurs.
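The point about nominal codes can be checked with a short Python sketch. The list of responses below is invented for illustration, and the second coding (alt_codes) is an assumption added here to show that the choice of numbers is arbitrary. Counting how often each code occurs gives the same answer under both codings, while the average changes with the coding and therefore tells us nothing about the leaders.

```python
from collections import Counter
from statistics import mean, mode

# The coding used in the text: numbers are only short names for the attributes.
codes = {"Nelson Mandela": 1, "Abraham Lincoln": 2, "Mahatma Gandhi": 3}
names = {code: name for name, code in codes.items()}

# Hypothetical survey responses (made up for illustration).
responses = [
    "Mahatma Gandhi", "Nelson Mandela", "Mahatma Gandhi",
    "Abraham Lincoln", "Mahatma Gandhi", "Nelson Mandela",
]
coded = [codes[r] for r in responses]

# Counting and finding the most frequent value are meaningful for nominal data.
counts = Counter(coded)
most_common_leader = names[mode(coded)]
print("Counts per code:", dict(counts))
print("Most frequent answer:", most_common_leader)

# A different, equally valid coding of the same three attributes.
alt_codes = {"Nelson Mandela": 2, "Abraham Lincoln": 3, "Mahatma Gandhi": 1}
alt_names = {code: name for name, code in alt_codes.items()}
alt_coded = [alt_codes[r] for r in responses]

# The most frequent answer is unchanged, but the average depends on the coding.
print("Most frequent answer (other coding):", alt_names[mode(alt_coded)])
print("Average under first coding:", mean(coded))
print("Average under second coding:", mean(alt_coded))
```

Because the two averages differ even though the responses are identical, the average of nominal codes is an artefact of the labelling rather than a property of the data.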

