Page 181 - Touhpad Ai
• Reducing storage and processing costs
• Improving model accuracy by minimising noise
• Enabling better visualization and interpretation
For example, in facial recognition, thousands of pixels describe an image. Dimensionality reduction helps identify the
most important features like eyes, nose, and mouth — reducing complexity without losing essential information.
Two widely used methods for dimensionality reduction are:
• Principal Component Analysis (PCA): Transforms the data into a new coordinate system in which the direction of
largest variance lies along the first axis (the first principal component). Keeping only the first few components
captures the most important patterns in fewer dimensions.
• Linear Discriminant Analysis (LDA): Uses class labels to find the directions that maximise the separation between
classes. It is commonly used in classification tasks.
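To make the idea of a "first principal component" concrete, the sketch below finds it for a handful of two-feature points using only the Python standard library. The function name `first_principal_component` and the sample values are made up for illustration, and the closed-form angle formula used here only works for two features; larger datasets would normally be handled with a numerical library.

```python
import math


def first_principal_component(points):
    """Return (direction, projections, explained_ratio) for 2-feature points."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Centre the data so that the mean of each feature is zero.
    centered = [(x - mean_x, y - mean_y) for x, y in points]

    # Entries of the sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(dx * dx for dx, _ in centered) / (n - 1)
    syy = sum(dy * dy for _, dy in centered) / (n - 1)
    sxy = sum(dx * dy for dx, dy in centered) / (n - 1)

    # For a symmetric 2x2 matrix, the direction of largest variance
    # makes this angle with the x-axis.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    direction = (math.cos(theta), math.sin(theta))

    # Project every centred point onto that direction: 2 features become 1.
    projections = [dx * direction[0] + dy * direction[1] for dx, dy in centered]
    projected_variance = sum(p * p for p in projections) / (n - 1)
    explained_ratio = projected_variance / (sxx + syy)
    return direction, projections, explained_ratio


# Hypothetical sample: two strongly related features (made-up values).
points = [(2.5, 2.4), (0.5, 0.7), (2.2, 2.9), (1.9, 2.2), (3.1, 3.0),
          (2.3, 2.7), (2.0, 1.6), (1.0, 1.1), (1.5, 1.6), (1.1, 0.9)]

direction, projections, explained_ratio = first_principal_component(points)
print("First principal direction:", direction)
print("Share of total variance kept:", round(explained_ratio, 3))
```

Because the two features rise and fall together, a single new axis keeps most of the variation, which is why the second dimension can be dropped with little loss.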
Both techniques can project high-dimensional data onto 2D or 3D plots, making it easier to interpret. Beyond
visualization, dimensionality reduction can also improve machine learning model performance by removing noise
and redundancy.
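When class labels are available, a class-separating projection can be computed directly. The sketch below is a minimal two-class, two-feature version of Fisher's linear discriminant, the calculation at the core of LDA, written with plain Python. The function names `fisher_direction` and `project` and the sample points are illustrative assumptions, not part of any library.

```python
def fisher_direction(class_a, class_b):
    """Direction w = Sw^-1 (mean_b - mean_a) for two classes of 2-feature points."""

    def mean(points):
        n = len(points)
        return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)

    def scatter(points, m):
        # Sums of squared and cross deviations around the class mean.
        sxx = sum((p[0] - m[0]) ** 2 for p in points)
        syy = sum((p[1] - m[1]) ** 2 for p in points)
        sxy = sum((p[0] - m[0]) * (p[1] - m[1]) for p in points)
        return sxx, sxy, syy

    mean_a, mean_b = mean(class_a), mean(class_b)
    a_xx, a_xy, a_yy = scatter(class_a, mean_a)
    b_xx, b_xy, b_yy = scatter(class_b, mean_b)

    # Within-class scatter matrix Sw = [[sxx, sxy], [sxy, syy]].
    sxx, sxy, syy = a_xx + b_xx, a_xy + b_xy, a_yy + b_yy
    det = sxx * syy - sxy * sxy

    # Multiply the 2x2 inverse of Sw by the difference of the class means.
    dx, dy = mean_b[0] - mean_a[0], mean_b[1] - mean_a[1]
    return ((syy * dx - sxy * dy) / det, (sxx * dy - sxy * dx) / det)


def project(points, w):
    """Reduce each 2-feature point to a single number along direction w."""
    return [p[0] * w[0] + p[1] * w[1] for p in points]


# Hypothetical labelled samples (made-up values).
class_a = [(1.0, 2.0), (1.5, 1.8), (2.0, 2.2), (1.2, 2.5)]
class_b = [(4.0, 4.5), (4.5, 4.0), (5.0, 5.2), (4.2, 4.8)]

w = fisher_direction(class_a, class_b)
print("Class A on the new axis:", project(class_a, w))
print("Class B on the new axis:", project(class_b, w))
```

On this toy data the two classes land in separate ranges of the single new axis, which is the property LDA optimises for. PCA, by contrast, ignores the labels and looks only at overall variance.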
Multi-dimensional Data Representation and Visualization
Multi-dimensional data visualization is the use of graphs and charts to represent datasets that have more than three
features. It helps us understand patterns, relationships, and groupings in complex information that is difficult to see in
simple tables.
When data has many features or attributes, normal 2D or 3D graphs cannot show all the information clearly. Special
visualization techniques allow us to explore and understand complex data easily, making it simpler to identify trends,
compare features, and make informed decisions. These methods are widely used in science, business, finance, and
education to gain meaningful insights.
Common Graph Types for Multi-dimensional Data Visualization
Multi-dimensional data involves more than two or three variables, which makes it tricky to visualise using standard
charts. Specialised graphs help show relationships, patterns, and clusters in complex datasets. Some commonly used
types are:
• Scatter Plot Matrix (SPLOM): Displays pairwise relationships between multiple variables. Each cell shows a scatter
plot for two variables, making it easy to spot correlations.
• Bubble chart: Like a scatter plot, but adds a third variable by varying the size of the bubbles. Helps visualise three
dimensions in one chart.
• Parallel Coordinates plot: Each parallel axis represents a variable, and each line traces one data point across all
variables. Useful for detecting patterns and clusters in high-dimensional data.
• Radar (Spider) chart: Each axis radiates from a central point and represents one variable. Good for comparing
multiple entities across many attributes simultaneously.
• Pair plot: Closely related to the scatter plot matrix. It shows a scatter plot for every pair of numerical variables in
a dataset, so relationships between many variables can be explored at once.
• Heatmap: Represents data values with colour intensity in a matrix format. Useful for showing correlations between
variables or the intensity of values.
• 3D Scatter plot: Plots three variables on three axes, sometimes adding colour or size for a fourth variable. Helpful
for exploring 3D relationships visually.
• Stacked/Grouped Bar charts: Show multiple categories together, helpful for comparing several variables at once.
• Treemaps: Hierarchical visualization in which nested rectangles represent categories, with the area of each
rectangle proportional to its value. Good for showing proportions and nested relationships.
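Several of the charts above, notably the heatmap and the scatter plot matrix, are built from comparisons between every pair of variables. As a rough illustration, the sketch below computes the Pearson correlation for each pair of columns in a small dataset and prints the resulting matrix, which is the table of numbers a correlation heatmap would colour. The column names and values are made up, and the helper names `pearson` and `correlation_matrix` are assumptions for this example only.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_matrix(columns):
    """Return (names, matrix) where matrix[i][j] correlates column i with column j."""
    names = list(columns)
    matrix = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    return names, matrix


# Hypothetical dataset: four features recorded for five students (made-up values).
data = {
    "hours_studied": [1, 2, 3, 4, 5],
    "score": [52, 58, 65, 70, 78],
    "absences": [8, 6, 5, 3, 1],
    "height_cm": [150, 162, 155, 149, 160],
}

names, matrix = correlation_matrix(data)
for name, row in zip(names, matrix):
    # One printed row per variable; each value would be one coloured heatmap cell.
    print(f"{name:>14}", " ".join(f"{value:+.2f}" for value in row))
```

Values near +1 or -1 mark pairs that would form a tight diagonal band in the matching scatter plot matrix cell, while values near 0 mark pairs with no clear linear pattern. A plotting library could render this matrix as coloured cells, for example with matplotlib's `imshow`.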

