
Data Management

This document covers data management topics for the DSE ICT examination, including data structures, database models, data integrity, security, privacy legislation, backup strategies, big data, and data ethics. Database design and SQL are covered in programming-and-databases.md.


Data Types and Structures

Primitive Data Types

| Data Type | Description | Size (typical) | Example Values |
|---|---|---|---|
| Integer | Whole numbers | 2--8 bytes | -32768, 0, 42, 32767 |
| Float/Real | Numbers with fractional parts | 4--8 bytes | 3.14, -0.5, 2.718 |
| Boolean | Logical values | 1 byte | True, False |
| Character | A single symbol | 1--4 bytes | 'A', '7', '$' |
| String | Sequence of characters | Variable | "Hello World" |
| Date | Calendar date | 4--8 bytes | 2026-01-15 |

Composite Data Structures

| Structure | Description | Example |
|---|---|---|
| Array | Fixed-size, ordered collection of elements of the same type | [85, 92, 78, 95] |
| Record/Struct | Collection of related fields of possibly different types | {name: "Chan", age: 17, score: 85} |
| 2D Array | Table of elements accessed by row and column indices | Grid for seating plan |
| Linked List | Linear collection where each element points to the next | Dynamic data storage (beyond DSE scope) |
| Stack | Last-In-First-Out (LIFO) structure | Undo operations, function call stack |
| Queue | First-In-First-Out (FIFO) structure | Print queue, task scheduling |
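
The LIFO and FIFO behaviour of a stack and a queue can be illustrated with a minimal Python sketch. It assumes a plain list is acceptable as a stack and `collections.deque` as a queue; the action and file names are made up for illustration.

```python
from collections import deque

# Stack (LIFO): the last item pushed is the first one popped.
undo_stack = []
undo_stack.append("type 'a'")
undo_stack.append("type 'b'")
last_action = undo_stack.pop()      # removes the most recent action

# Queue (FIFO): the first item added is the first one removed.
print_queue = deque()
print_queue.append("report.pdf")
print_queue.append("photo.png")
first_job = print_queue.popleft()   # removes the oldest job

print(last_action, "|", first_job)
```

The undo example removes the newest action first, while the print queue serves the oldest job first.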

Arrays in Detail

An array stores multiple values of the same data type in contiguous memory locations, accessed by an index.

One-dimensional array:

| Index | 0 | 1 | 2 | 3 | 4 |
|---|---|---|---|---|---|
| Value | 10 | 25 | 30 | 15 | 40 |

Two-dimensional array (matrix):

| Row\Col | 0 | 1 | 2 |
|---|---|---|---|
| 0 | 1 | 2 | 3 |
| 1 | 4 | 5 | 6 |
| 2 | 7 | 8 | 9 |

Access: matrix[1][2] returns 6.
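
The two arrays above can be sketched in Python as follows. Python lists stand in for arrays here (an assumption, since a list is not a true fixed-size array); the variable names are chosen for illustration.

```python
# Python lists stand in for arrays; indices start at 0.
values = [10, 25, 30, 15, 40]
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]

third = values[2]      # element at index 2 of the one-dimensional array
cell = matrix[1][2]    # row 1, column 2 of the two-dimensional array

# Visit every element of the 2D array row by row.
total = 0
for row in matrix:
    for item in row:
        total += item

print(third, cell, total)
```

The row index comes first and the column index second, which is why `matrix[1][2]` selects the third item of the second row.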


Database Models

Flat-File Database

A flat-file database stores all data in a single table (file), typically as a CSV or spreadsheet.

| Advantage | Disadvantage |
|---|---|
| Simple to set up and use | Data redundancy (duplication) |
| Easy to understand | Data inconsistency (update anomalies) |
| Suitable for small data | Limited query capabilities |
| Low technical overhead | Poor scalability |
| Portable (single file) | No multi-user support |

Example flat-file database (students.csv):

StudentID,Name,Class,Subject,Score
001,Chan Tai Man,5A,Maths,85
001,Chan Tai Man,5A,English,92
002,Lee Siu Ming,5B,Maths,72
002,Lee Siu Ming,5B,English,65

The student name and class are repeated for each subject -- this is data redundancy.
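
As a rough illustration, the redundancy can be counted with Python's standard `csv` module. The file contents are embedded as a string so the sketch is self-contained; the variable names are assumptions.

```python
import csv
import io

# The same rows as students.csv, held in memory so the sketch runs on its own.
FLAT_FILE = """StudentID,Name,Class,Subject,Score
001,Chan Tai Man,5A,Maths,85
001,Chan Tai Man,5A,English,92
002,Lee Siu Ming,5B,Maths,72
002,Lee Siu Ming,5B,English,65
"""

rows = list(csv.DictReader(io.StringIO(FLAT_FILE)))
students = {row["StudentID"] for row in rows}

# Every row beyond one per student repeats that student's name and class.
redundant_rows = len(rows) - len(students)
print(len(rows), "rows,", len(students), "students,", redundant_rows, "repeated name/class entries")
```

If a student changed class, every one of that student's rows would need updating, which is where update anomalies come from.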

Relational Database

A relational database stores data in multiple related tables, linked by primary and foreign keys.

| Advantage | Disadvantage |
|---|---|
| Minimal data redundancy | More complex to design and set up |
| Data integrity enforced | Requires knowledge of SQL and design |
| Powerful query capabilities | Higher technical overhead |
| Multi-user concurrent access | More expensive infrastructure |
| Scalable to large datasets | Requires a DBMS |
| Data independence | |

Example relational database:

Student (StudentID PK, Name, Class)

Subject (SubjectCode PK, SubjectName, Teacher)

Result (StudentID FK, SubjectCode FK, Score)

The student name is stored only once in the Student table, eliminating redundancy.
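
The three-table design can be tried out with Python's built-in `sqlite3` module. This is only a sketch under assumptions: the column types, the subject codes `MATH` and `ENG`, and the composite primary key on Result are illustrative choices, and the Teacher column is left empty because the example does not give teacher names.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE Student (StudentID TEXT PRIMARY KEY, Name TEXT, Class TEXT);
CREATE TABLE Subject (SubjectCode TEXT PRIMARY KEY, SubjectName TEXT, Teacher TEXT);
CREATE TABLE Result (
    StudentID   TEXT REFERENCES Student(StudentID),
    SubjectCode TEXT REFERENCES Subject(SubjectCode),
    Score       INTEGER,
    PRIMARY KEY (StudentID, SubjectCode)
);
INSERT INTO Student VALUES ('001', 'Chan Tai Man', '5A'), ('002', 'Lee Siu Ming', '5B');
INSERT INTO Subject VALUES ('MATH', 'Maths', NULL), ('ENG', 'English', NULL);
INSERT INTO Result VALUES ('001', 'MATH', 85), ('001', 'ENG', 92),
                          ('002', 'MATH', 72), ('002', 'ENG', 65);
""")

# Names are stored once in Student and joined in when needed.
rows = conn.execute("""
    SELECT s.Name, sub.SubjectName, r.Score
    FROM Result r
    JOIN Student s ON s.StudentID = r.StudentID
    JOIN Subject sub ON sub.SubjectCode = r.SubjectCode
    ORDER BY s.StudentID, sub.SubjectName
""").fetchall()
for row in rows:
    print(row)

# A result for a student that does not exist is rejected by the foreign key.
try:
    conn.execute("INSERT INTO Result VALUES ('999', 'MATH', 50)")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
print("Orphan result rejected:", orphan_rejected)
```

Changing a student's class now means updating a single row in Student, rather than one row per subject as in the flat file.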

Hierarchical Database

Data is organised in a tree structure (parent-child relationships). Each parent can have multiple children, but each child has exactly one parent.

School
├── Department A
│   ├── Teacher 1
│   └── Teacher 2
└── Department B
    ├── Teacher 3
    └── Teacher 4

| Advantage | Disadvantage |
|---|---|
| Intuitive for hierarchical data | Many-to-many relationships are complex to represent |
| Fast navigation (follow tree) | No standard query language |
| Clear parent-child structure | Deleting a parent deletes children |

Network Database

An extension of the hierarchical model that allows many-to-many relationships through sets.

| Advantage | Disadvantage |
|---|---|
| Supports many-to-many | Very complex to design |
| Flexible relationships | No universal standard |
| Efficient data access | Difficult to modify structure |

Database Model Comparison

| Feature | Flat File | Hierarchical | Network | Relational |
|---|---|---|---|---|
| Structure | Single table | Tree | Graph | Multiple tables |
| Redundancy | High | Medium | Low | Low |
| Relationships | None | One-to-many | Many-to-many | Any type |
| Query language | None | Navigation | Navigation | SQL |
| Flexibility | Low | Medium | High | High |
| Design complexity | Very low | Medium | High | Medium |
| Modern usage | Simple lists | File systems | Rare | Most common |

Data Validation and Verification

Validation -- Ensuring Data is Reasonable

Validation checks that data entered meets specified rules before it is accepted into the system.

| Validation Type | Description | Example |
|---|---|---|
| Presence check | Ensures a field is not empty | Name cannot be blank |
| Type check | Ensures the data is of the correct type | Age must be an integer |
| Length check | Ensures data has the correct number of characters | Phone number must be exactly 8 digits |
| Range check | Ensures data falls within a specified range | Score must be between 0 and 100 |
| Format check | Ensures data follows a specific pattern (often expressed as a regular expression) | Email must contain "@" and "." |
| Lookup check | Compares input against a list of valid values | Class must be in the school's class list |
| Consistency check | Compares two fields to ensure they are logically consistent | End date must be after start date |
| Limit check | A range check with only one boundary (an upper or a lower limit) | Quantity ordered cannot exceed 999 |

Worked Example: Applying Validation to a Registration Form

A school registration form collects: Student Name, Date of Birth, Gender, Class, Email, Phone.

| Field | Validation Type | Rule |
|---|---|---|
| Student Name | Presence check | Cannot be empty |
| Student Name | Length check | Between 2 and 50 characters |
| DOB | Type check | Must be a valid date |
| DOB | Range check | Between 2008-01-01 and 2015-12-31 |
| DOB | Consistency check | Cannot be in the future |
| Gender | Lookup check | Must be "M", "F", or "Other" |
| Class | Lookup check | Must exist in the school's class list |
| Email | Format check | Must contain "@" and at least one "." |
| Phone | Format check | Must be exactly 8 digits (Hong Kong) |
| Phone | Type check | Must be numeric |
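
The rules in this table could be coded roughly as below. This is a sketch, not a complete validator: the function name, the dictionary keys, the class list `VALID_CLASSES` and the email pattern are all assumptions made for illustration. The "not in the future" rule is already implied by the DOB range check, so it is not coded separately.

```python
import re
from datetime import date

VALID_GENDERS = {"M", "F", "Other"}
VALID_CLASSES = {"5A", "5B"}  # hypothetical class list

def validate_registration(form):
    """Return a list of error messages; an empty list means the form passed."""
    errors = []

    name = form.get("name", "").strip()
    if not name:                                   # presence check
        errors.append("name: required")
    elif not 2 <= len(name) <= 50:                 # length check
        errors.append("name: must be 2-50 characters")

    try:                                           # type check
        dob = date.fromisoformat(form.get("dob", ""))
    except ValueError:
        errors.append("dob: not a valid date")
    else:
        if not date(2008, 1, 1) <= dob <= date(2015, 12, 31):   # range check
            errors.append("dob: out of range")

    if form.get("gender") not in VALID_GENDERS:    # lookup check
        errors.append("gender: not in list")
    if form.get("class") not in VALID_CLASSES:     # lookup check
        errors.append("class: not in list")

    # Format check: something@something.something, with no spaces.
    if not re.fullmatch(r"[^@\s]+@[^@\s]+\.[^@\s]+", form.get("email", "")):
        errors.append("email: bad format")
    # Format and type check combined: exactly 8 digits.
    if not re.fullmatch(r"[0-9]{8}", form.get("phone", "")):
        errors.append("phone: must be exactly 8 digits")

    return errors

sample = {"name": "Chan Tai Man", "dob": "2009-05-17", "gender": "M",
          "class": "5A", "email": "chan@example.com", "phone": "91234567"}
print(validate_registration(sample))
```

Note that a form passing every check is only reasonable, not necessarily correct: a wrong but plausible date of birth would still be accepted.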

Verification -- Ensuring Data is Accurate

Verification checks that data entered matches the original source document.

| Method | Description | When Used |
|---|---|---|
| Double entry | Data is entered twice by different operators; system compares the two entries | Critical data entry (exam marks, medical records) |
| Visual check | The operator visually compares the entered data with the source document | General data entry |
| Check digit | A digit calculated from the other digits using a mathematical formula; appended to the data | Product codes (ISBN, HKID), barcodes |

Worked Example: Check Digit Calculation

ISBN-13 check digit:

The ISBN-13 for a book is 978-1-86197-876-?. Calculate the check digit.

  1. Take the first 12 digits: 9 7 8 1 8 6 1 9 7 8 7 6
  2. Multiply odd-position digits by 1 and even-position digits by 3:

| Position | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11 | 12 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Digit | 9 | 7 | 8 | 1 | 8 | 6 | 1 | 9 | 7 | 8 | 7 | 6 |
| Weight | 1 | 3 | 1 | 3 | 1 | 3 | 1 | 3 | 1 | 3 | 1 | 3 |
| Product | 9 | 21 | 8 | 3 | 8 | 18 | 1 | 27 | 7 | 24 | 7 | 18 |

  3. Sum: $9 + 21 + 8 + 3 + 8 + 18 + 1 + 27 + 7 + 24 + 7 + 18 = 151$
  4. Check digit = $(10 - (151 \bmod 10)) \bmod 10 = (10 - 1) \bmod 10 = 9$

Complete ISBN: 978-1-86197-876-9

To verify: multiply all 13 digits by the same alternating weights (1, 3, 1, 3, ...) and sum the products. The total should be divisible by 10; here 151 + 9 = 160.
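
The calculation can be checked with a short Python sketch. The function names are chosen for illustration; hyphens and other non-digit characters are simply ignored.

```python
def weighted_sum(digits):
    """Sum the digits with alternating weights 1, 3, 1, 3, ... from the left."""
    return sum(d * (1 if i % 2 == 0 else 3) for i, d in enumerate(digits))

def isbn13_check_digit(first12):
    """Compute the check digit from the first 12 digits of an ISBN-13."""
    digits = [int(ch) for ch in first12 if ch.isdigit()]
    if len(digits) != 12:
        raise ValueError("expected exactly 12 digits")
    return (10 - weighted_sum(digits) % 10) % 10

def isbn13_is_valid(isbn):
    """A full ISBN-13 is valid when the weighted sum of all 13 digits is divisible by 10."""
    digits = [int(ch) for ch in isbn if ch.isdigit()]
    return len(digits) == 13 and weighted_sum(digits) % 10 == 0

print(isbn13_check_digit("978-1-86197-876"))
print(isbn13_is_valid("978-1-86197-876-9"))
```

The outer `% 10` handles the case where the weighted sum is already a multiple of 10, giving a check digit of 0 rather than 10.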

Validation vs Verification

| Aspect | Validation | Verification |
|---|---|---|
| Purpose | Checks data is reasonable | Checks data matches the original source |
| When | At data entry time | After data entry |
| Method | Rules, formats, ranges | Double entry, visual check, check digit |
| Catches | Out-of-range values, wrong types, missing data | Typing errors, transcription errors |
| Example | Score must be between 0 and 100 | Entering 52 instead of 25 (visual check needed) |

Data Security

Threats to Data Security

| Threat | Description | Impact |
|---|---|---|
| Unauthorised access | Someone gains access to data they should not see | Privacy breach, data theft |
| Data theft | Deliberate copying or removal of data | Financial loss, identity theft |
| Data corruption | Data is modified or destroyed, accidentally or deliberately | Data integrity loss |
| Malware | Viruses, ransomware, spyware affect data | Encryption, deletion, exfiltration |
| Insider threats | Employees with access who misuse it | Intentional or accidental data loss |
| Physical threats | Fire, flood, theft, hardware failure | Complete data loss |
| Human error | Accidental deletion, overwriting, misconfiguration | Data loss, downtime |

Security Measures

| Measure | Description | Protects Against |
|---|---|---|
| Access control | Usernames, passwords, biometric authentication | Unauthorised access |
| Encryption | Converting data into a form that cannot be read without the decryption key | Data theft, interception |
| Firewall | Monitors and filters network traffic | Network intrusions |
| Antivirus software | Detects and removes malware | Viruses, trojans, ransomware |
| Backup | Regular copies of data stored separately | Data loss from any cause |
| Audit logs | Records of who accessed what data and when | Insider threats, accountability |
| Physical security | Locks, security cameras, restricted access to server rooms | Physical theft, unauthorised access |
| User training | Educating users on security best practices | Phishing, social engineering |

Access Control Methods

| Method | Description | Security Level |
|---|---|---|
| Password | Secret string known only to the user | Low-Medium |
| PIN | Short numeric code (e.g., 4--6 digits) | Low |
| Biometric | Physical characteristics (fingerprint, face, iris) | High |
| Smart card | Physical card with embedded chip + PIN | High |
| Two-factor (2FA) | Combination of two methods (e.g., password + phone code) | Very High |
| Role-based (RBAC) | Access based on job role (e.g., manager, clerk, viewer) | Medium-High |
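
Role-based access control can be pictured as a lookup from role to permitted actions. The sketch below uses the three roles named in the table; the specific permissions assigned to each role are hypothetical.

```python
# Hypothetical permissions per role, for illustration only.
ROLE_PERMISSIONS = {
    "manager": {"read", "write", "delete"},
    "clerk": {"read", "write"},
    "viewer": {"read"},
}

def is_allowed(role, action):
    """Return True if the given role is permitted to perform the action."""
    return action in ROLE_PERMISSIONS.get(role, set())

print(is_allowed("clerk", "write"))
print(is_allowed("viewer", "delete"))
```

An unknown role gets an empty permission set, so access is denied by default.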

Privacy Legislation

Hong Kong: Personal Data Privacy Ordinance (PDPO)

The PDPO (Cap. 486) is Hong Kong's primary data protection law, enforced by the Office of the Privacy Commissioner for Personal Data (PCPD).

Six Data Protection Principles (DPPs):

| DPP | Principle | Key Requirement |
|---|---|---|
| 1 | Purpose and collection | Collect data for a lawful purpose; inform the data subject of the purpose |
| 2 | Accuracy and retention | Keep data accurate and up to date; do not keep longer than necessary |
| 3 | Use | Use data only for the stated purpose or a directly related purpose |
| 4 | Security | Take practical steps to protect against unauthorised access, processing, erasure, or loss |
| 5 | Information openness | Be open about data handling policies and practices |
| 6 | Access | Allow data subjects to access and correct their personal data |

Key PDPO provisions for DSE exams:

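  • Direct marketing requires the data subject's consent, which under the PDPO includes an indication of no objection; the data subject may opt out at any time.
  • The PDPO does not impose mandatory data breach notification; notifying the PCPD and affected data subjects is voluntary but recommended.
  • Appointing a Data Protection Officer (DPO) is recommended by the PCPD as good practice, but is not required by the PDPO.
  • Offences carry penalties including fines (up to HKD 1,000,000) and imprisonment.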

General Data Protection Regulation (GDPR) -- EU

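| Feature | PDPO (Hong Kong) | GDPR (EU) |
|---|---|---|
| Jurisdiction | Hong Kong | Global reach (applies to personal data of individuals in the EU) |
| Consent model | Consent or indication of no objection for direct marketing; opt-out at any time | Opt-in (explicit consent) |
| Data breach | No mandatory notification (voluntary report to PCPD recommended) | Notify the supervisory authority within 72 hours |
| Right to erasure | Not explicit | Yes ("right to be forgotten") |
| Data portability | Not explicit | Yes |
| Maximum fine | HKD 1,000,000 + imprisonment | EUR 20M or 4% of global annual turnover, whichever is higher |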

Personal Data (Privacy) Ordinance -- Exam Focus

Data subject rights under PDPO:

  1. Right to know whether personal data is held.
  2. Right to access a copy of the data.
  3. Right to request correction of inaccurate data.
  4. Right to opt out of direct marketing.

Data user obligations:

  1. Inform the data subject of the purpose of collection at or before collection.
  2. Use personal data only for the stated purpose.
  3. Take security measures to protect the data.
  4. Not retain data longer than necessary.
  5. Provide access to data upon request within 40 days.

Data Backup and Recovery

Backup Strategies

| Strategy | Description | Needed to Restore | Cost |
|---|---|---|---|
| Full backup | Copies all data every time | Last full backup | High |
| Incremental | Copies only data changed since the last backup (full or incremental) | Last full backup + all incrementals since | Low |
| Differential | Copies all data changed since the last full backup | Last full backup + last differential | Medium |
| Snapshot | Point-in-time copy of the entire system state | The snapshot taken at the chosen point in time | Variable |
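
The difference between incremental and differential backups is easier to see with a small simulation. The sketch below assumes a full backup on day 0 followed by one backup per day; the file names, the change log and the backup labels are made up for illustration.

```python
# Hypothetical change log: files modified on each day after a full backup on day 0.
changes_by_day = {
    1: {"a.txt"},
    2: {"b.txt"},
    3: {"a.txt", "c.txt"},
}

def incremental_sets(changes):
    """Each incremental copies only what changed since the previous backup."""
    return {day: set(files) for day, files in changes.items()}

def differential_sets(changes):
    """Each differential copies everything changed since the last full backup."""
    result, seen = {}, set()
    for day in sorted(changes):
        seen |= changes[day]
        result[day] = set(seen)
    return result

def restore_chain(strategy, last_day):
    """List the backups needed to restore the state at the end of last_day."""
    if strategy == "incremental":
        return ["full"] + [f"inc-{day}" for day in range(1, last_day + 1)]
    if strategy == "differential":
        return ["full", f"diff-{last_day}"]
    raise ValueError("unknown strategy")

print(incremental_sets(changes_by_day)[3])
print(differential_sets(changes_by_day)[3])
print(restore_chain("incremental", 3))
print(restore_chain("differential", 3))
```

Incrementals stay small but the restore chain grows every day; differentials grow over time but a restore never needs more than two backups.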

The 3-2-1 Backup Rule

A widely recommended backup strategy:

| Rule | Requirement |
|---|---|
| 3 | Keep at least 3 copies of your data |
| 2 | Store the copies on at least 2 different media types |
| 1 | Keep at least 1 copy offsite (or in the cloud) |

Example: Original on SSD + full backup on external HDD + incremental backup to cloud storage.
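
The three conditions can be written as a simple check. This is a sketch only; the `Copy` class and its fields are assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass
class Copy:
    media: str      # e.g. "ssd", "hdd", "cloud"
    offsite: bool   # True if stored away from the primary location

def satisfies_3_2_1(copies):
    """Check the three conditions of the 3-2-1 rule for a list of copies."""
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite

# The example above: original on SSD, full backup on external HDD, cloud backup.
plan = [Copy("ssd", False), Copy("hdd", False), Copy("cloud", True)]
print(satisfies_3_2_1(plan))

# Three copies on the same kind of drive in the same office fail the rule.
same_place = [Copy("hdd", False), Copy("hdd", False), Copy("hdd", False)]
print(satisfies_3_2_1(same_place))
```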

Backup Media

| Media | Capacity | Speed | Cost | Durability |
|---|---|---|---|---|
| External HDD | 1--20 TB | Fast | Low | Moderate (5--10 yr) |
| SSD | 256 GB--4 TB | Very fast | Medium | High (no moving parts) |
| Tape | 1--20 TB | Slow | Low | Very high (30+ yr) |
| Cloud storage | Unlimited | Depends on bandwidth | Subscription | Depends on provider |
| Optical disc | 25--100 GB | Slow | Low | High (if stored properly) |

Recovery Procedures

| Recovery Scenario | Procedure |
|---|---|
| Accidental deletion | Restore from the most recent backup |
| Ransomware | Restore from offline backup (not connected during attack) |
| Hardware failure | Replace hardware, restore from backup |
| Data corruption | Identify point of corruption, restore from last clean backup |
| Natural disaster | Restore from offsite/cloud backup on new hardware |

Recovery Time Objective (RTO): The maximum acceptable time to restore the system after a failure.

Recovery Point Objective (RPO): The maximum acceptable amount of data loss measured in time. If RPO is 1 hour, backups must be taken at least every hour.


Big Data Concepts

What is Big Data?

Big data refers to datasets that are too large, complex, or fast-moving to be processed using traditional data processing methods.

The 5 V's of Big Data

| V | Description | Example |
|---|---|---|
| Volume | The sheer scale of data (terabytes to petabytes and beyond) | Social media generates TBs daily |
| Velocity | The speed at which data is generated and needs processing | Real-time sensor data, stock feeds |
| Variety | The diversity of data types (structured, semi-structured, unstructured) | Text, images, video, audio, logs |
| Veracity | The uncertainty and trustworthiness of the data | Inconsistent, incomplete, or biased data |
| Value | The useful information and insights that can be extracted | Customer behaviour patterns, trends |

Sources of Big Data

| Source | Data Type | Volume | Example |
|---|---|---|---|
| Social media | Text, images, video | Very high | Twitter posts, Instagram photos |
| IoT sensors | Numeric, time-series | High | Smart city sensors, weather stations |
| E-commerce | Transactional | High | Purchase records, browsing behaviour |
| Healthcare | Medical records | Medium-High | Patient records, diagnostic images |
| Financial markets | Time-series | High | Stock prices, trading data |
| Government | Various | High | Census data, tax records, traffic |

Big Data Processing

| Approach | Description | Example Tools |
|---|---|---|
| Batch processing | Process large datasets in scheduled batches | Hadoop MapReduce |
| Real-time | Process data as it arrives, with minimal latency | Apache Spark Streaming, Apache Kafka |
| Data mining | Discover patterns and relationships in large datasets | Machine learning algorithms |
| Data visualisation | Present insights through interactive charts and dashboards | Tableau, Power BI, D3.js |

Implications of Big Data

| Area | Implication |
|---|---|
| Privacy | Collecting vast amounts of personal data raises serious privacy concerns |
| Security | Larger datasets are more attractive targets for attackers |
| Ethics | Use of personal data for profiling, discrimination, manipulation |
| Accuracy | Big data analysis can produce misleading results if data is biased |
| Cost | Storage, processing, and skilled personnel are expensive |

Data Ethics

Key Ethical Principles

| Principle | Description |
|---|---|
| Informed consent | Individuals should know what data is collected and how it is used |
| Purpose limitation | Data collected for one purpose should not be repurposed without consent |
| Data minimisation | Collect only the minimum data necessary for the stated purpose |
| Transparency | Organisations should be open about their data practices |
| Accountability | Organisations must be responsible for how they handle data |
| Fairness | Data should not be used to discriminate or exploit individuals |
| Security | Adequate measures must protect data from breaches |
| Accuracy | Data should be kept accurate and up to date |

Ethical Issues in Data Use

| Issue | Description | Example |
|---|---|---|
| Profiling | Creating detailed profiles of individuals from their data | Targeted advertising, credit scoring |
| Discrimination | Using data to unfairly treat individuals or groups | Insurance premiums based on genetic data |
| Surveillance | Monitoring individuals' activities without their knowledge | Workplace monitoring, CCTV tracking |
| Data ownership | Who owns the data -- the individual or the platform? | Social media posts, search history |
| Algorithmic bias | Automated decisions that systematically disadvantage certain groups | Hiring algorithms, criminal risk assessment |
| Consent fatigue | Users agree to terms without reading them due to volume | App terms of service, cookie consent |

Data Ethics in the Context of Hong Kong

The PDPO provides a legal framework for data protection in Hong Kong, but ethical considerations extend beyond legal compliance:

  • Organisations should collect only data necessary for their stated purpose.
  • Individuals should have meaningful control over their personal data.
  • Data analytics should be used to benefit individuals, not exploit them.
  • Anonymisation techniques should be used when possible to protect identities.
  • Regular audits should ensure data practices remain ethical and compliant.

Common Pitfalls

  1. Confusing validation and verification: Validation checks that data is reasonable (e.g., score between 0 and 100). Verification checks that data matches the original source (e.g., double entering marks). Both are necessary for data integrity.

  2. Flat-file vs relational: A flat-file database with repeated data (e.g., student names in every row) is not normalised and suffers from update anomalies. Relational databases solve this by storing each piece of data once and linking tables with foreign keys.

  3. PDPO is Hong Kong-specific: Do not confuse PDPO provisions with GDPR. PDPO does not explicitly include the "right to be forgotten" or data portability requirements. GDPR is stricter and has higher penalties.

  4. Backup is not the same as archiving: Backups are for disaster recovery (restore data if lost). Archives are for long-term retention (regulatory compliance, historical records). They serve different purposes.

  5. The 3-2-1 rule requires offsite: Having three copies on the same hard drive does not satisfy the 3-2-1 rule. At least one copy must be stored in a different physical location or in the cloud.

  6. Big data is not just about volume: Many students focus only on the size of data. The other V's (velocity, variety, veracity, value) are equally important. Big data is characterised by ALL five dimensions.

  7. Check digit is verification, not validation: A check digit verifies that data was entered correctly (matches the original). It does not validate that the data is reasonable (e.g., an ISBN can have a valid check digit but be the wrong book).

  8. Encryption protects data but does not prevent all threats: Encryption protects confidentiality but does not prevent data corruption, accidental deletion, or physical theft of the storage device. It must be combined with other security measures.

  9. Consent must be informed and specific: Vague or buried privacy policies do not constitute informed consent. Users must clearly understand what they are consenting to.

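  10. Data minimisation is a legal requirement under PDPO: Collecting more personal data than necessary for the stated purpose violates DPP 1 (purpose and collection), which requires that the data collected be adequate but not excessive. DPP 3 (use) then limits how the collected data may be used.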


Practice Problems

Question 1: Data Validation Design

A hotel booking system collects the following information:

  • Guest name
  • Check-in date
  • Check-out date
  • Room type (Standard, Deluxe, Suite)
  • Number of guests
  • Credit card number

For each field, state an appropriate validation type and the validation rule.

Answer:

| Field | Validation Type | Rule |
|---|---|---|
| Guest name | Presence check | Cannot be empty |
| Guest name | Length check | Between 2 and 100 characters |
| Check-in date | Type check | Must be a valid date |
| Check-in date | Range check | Cannot be before today's date |
| Check-out date | Type check | Must be a valid date |
| Check-out date | Consistency check | Must be after the check-in date |
| Room type | Lookup check | Must be Standard, Deluxe, or Suite |
| Number of guests | Range check | Between 1 and 4 (Standard), 1 and 6 (Deluxe/Suite) |
| Number of guests | Consistency check | Must match room type capacity |
| Credit card | Length check | Must be 16 digits |
| Credit card | Type check | Must be numeric |
| Credit card | Check digit | Must pass Luhn algorithm validation |
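
The Luhn check named in the last row can be sketched in Python as follows. The function name is chosen for illustration, and 4111 1111 1111 1111 is a widely used test number rather than a real card.

```python
def luhn_is_valid(card_number):
    """Return True if the digits of card_number pass the Luhn check."""
    digits = [int(ch) for ch in card_number if ch.isdigit()]
    if not digits:
        return False
    total = 0
    # Walk from the rightmost digit; double every second digit.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9   # same as adding the two digits of the doubled value
        total += d
    return total % 10 == 0

print(luhn_is_valid("4111 1111 1111 1111"))
print(luhn_is_valid("4111 1111 1111 1112"))
```

Passing the Luhn check shows only that the number is well formed, not that the card exists or has funds.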

Question 2: Database Models Comparison

A school needs to manage student records, including personal details, enrolment in subjects, and term grades.

(a) Explain why a relational database is more suitable than a flat-file database for this purpose.

(b) Describe two advantages and one disadvantage of using a relational database compared to a hierarchical database for this scenario.

(c) Identify the tables, fields, primary keys, and foreign keys you would create for this system.

Answer:

(a) A relational database is more suitable because: (1) Student personal details (name, class, DOB) are the same across all subjects and terms. In a flat file, these would be repeated for every subject-term combination, causing data redundancy. (2) If a student's class changes, a flat file requires updating every row for that student, risking inconsistency. A relational database stores the student details once in a Student table and references them via foreign keys. (3) The relational model supports complex queries (e.g., "find all students who scored above 80 in both Maths and English") that would be extremely difficult with a flat file.

(b) Advantages over hierarchical: (1) Relational databases support many-to-many relationships (e.g., students to subjects), while hierarchical databases only support one-to-many. (2) Relational databases use SQL for flexible querying; hierarchical databases require navigation through the tree structure. Disadvantage: Relational databases are more complex to design, requiring knowledge of normalisation and SQL, while hierarchical databases have a simpler tree structure that is intuitive for naturally hierarchical data.

(c) Tables and keys:

  • Student (StudentID PK, Name, Class, DOB)
  • Subject (SubjectCode PK, SubjectName, Teacher)
  • Enrolment (StudentID FK, SubjectCode FK) -- composite PK
  • Grade (StudentID FK, SubjectCode FK, Term, Score) -- composite PK (StudentID + SubjectCode + Term)

Foreign keys: Enrolment.StudentID references Student.StudentID; Enrolment.SubjectCode references Subject.SubjectCode; Grade.StudentID references Student.StudentID; Grade.SubjectCode references Subject.SubjectCode.
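
Part of this design can be tested with Python's built-in `sqlite3` module, including the query mentioned in (a). The sketch makes several assumptions: the Enrolment table is omitted for brevity, column types and subject codes are illustrative, the DOB and Teacher values are left empty, and the scores are borrowed from the earlier flat-file example and placed in a hypothetical Term 1.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
CREATE TABLE Student (StudentID TEXT PRIMARY KEY, Name TEXT, Class TEXT, DOB TEXT);
CREATE TABLE Subject (SubjectCode TEXT PRIMARY KEY, SubjectName TEXT, Teacher TEXT);
CREATE TABLE Grade (
    StudentID   TEXT REFERENCES Student(StudentID),
    SubjectCode TEXT REFERENCES Subject(SubjectCode),
    Term        INTEGER,
    Score       INTEGER,
    PRIMARY KEY (StudentID, SubjectCode, Term)   -- composite primary key
);
INSERT INTO Student VALUES ('001', 'Chan Tai Man', '5A', NULL),
                           ('002', 'Lee Siu Ming', '5B', NULL);
INSERT INTO Subject VALUES ('MATH', 'Maths', NULL), ('ENG', 'English', NULL);
INSERT INTO Grade VALUES ('001', 'MATH', 1, 85), ('001', 'ENG', 1, 92),
                         ('002', 'MATH', 1, 72), ('002', 'ENG', 1, 65);
""")

# Students who scored above 80 in both Maths and English in Term 1.
high_scorers = [name for (name,) in db.execute("""
    SELECT s.Name
    FROM Student s
    JOIN Grade g ON g.StudentID = s.StudentID
    JOIN Subject sub ON sub.SubjectCode = g.SubjectCode
    WHERE g.Term = 1 AND g.Score > 80
      AND sub.SubjectName IN ('Maths', 'English')
    GROUP BY s.StudentID
    HAVING COUNT(DISTINCT sub.SubjectName) = 2
    ORDER BY s.Name
""")]
print(high_scorers)

# The composite key blocks a second grade for the same student, subject and term.
try:
    db.execute("INSERT INTO Grade VALUES ('001', 'MATH', 1, 90)")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
print("Duplicate grade rejected:", duplicate_rejected)
```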

Question 3: PDPO Scenario Analysis

A Hong Kong fitness centre collects members' personal data including name, phone number, email, date of birth, and fitness goals. They use this data to send monthly promotional emails about new classes and membership offers.

(a) State which DPPs are relevant to this scenario and explain how the fitness centre should comply.

(b) A member requests that their data be deleted. Under the PDPO, what is the fitness centre's obligation?

(c) The fitness centre shares member email addresses with a partner supplement company without informing members. Analyse whether this violates the PDPO.

Answer:

(a) DPP 1 (Purpose and collection): The fitness centre must inform members at the time of collection that their data will be used for promotional emails. The purpose must be lawful and specific. DPP 3 (Use): The data should only be used for the stated purpose (promotional emails about the fitness centre's own services). Using it for other purposes requires additional consent. DPP 4 (Security): The fitness centre must implement appropriate security measures (encryption, access controls) to protect member data. DPP 5 (Information openness): The fitness centre should have a privacy policy explaining their data practices. DPP 6 (Access): Members can request access to their data and corrections.

(b) Under DPP 6, the member has the right to access their personal data and request correction. However, the PDPO does not explicitly include a "right to erasure" (unlike the GDPR). The fitness centre should allow the member to access and correct their data. Whether they must delete the data depends on whether the data is still needed for the stated purpose. If the member has cancelled their membership and the data is no longer needed, DPP 2 (retention) suggests it should be deleted.

(c) Yes, this likely violates DPP 1 (purpose and collection) and DPP 3 (use). Members provided their email addresses for the fitness centre's own promotional purposes. Sharing with a third-party supplement company is a different purpose that was not stated at collection. Under DPP 3, personal data must only be used for the purpose stated at collection or a directly related purpose. Selling or sharing email addresses with an unrelated third party without consent violates this principle.

Question 4: Backup and Recovery

A small business stores its customer database, financial records, and inventory data on a single server in the office.

(a) Identify three threats to this data and for each, recommend a security measure.

(b) Describe how the business should implement the 3-2-1 backup rule.

(c) The server's hard drive fails completely. Describe the recovery procedure.

(d) Explain the difference between RTO and RPO, giving an appropriate example for this business.

Answer:

(a) Three threats and measures:

  1. Hardware failure (hard drive crash): Mitigation: RAID (redundant array of independent disks) for hardware-level redundancy, plus regular backups. RAID 1 mirrors data across two drives; if one fails, the other continues operating.
  2. Ransomware attack: Mitigation: Regular offline backups (not connected to the network during the attack), antivirus software, email filtering, staff training on phishing awareness.
  3. Fire or flood (physical disaster): Mitigation: Offsite backups (cloud storage or a physical backup stored at a different location), fire suppression system in the server room.

(b) 3-2-1 implementation: Keep 3 copies of all data. Store copies on 2 different media types (e.g., the server's RAID array as copy 1, an external HDD as copy 2, cloud storage as copy 3). Ensure 1 copy is offsite (the cloud backup satisfies this). Schedule daily incremental backups and weekly full backups. Test the backup restoration procedure regularly.

(c) Recovery procedure: (1) Replace the failed hard drive. (2) Install the operating system and database software on the new drive. (3) Restore the most recent full backup. (4) Apply all incremental backups since the last full backup. (5) Verify data integrity by checking record counts and sample records. (6) Resume normal operations. If the server itself is damaged, restore to a replacement server using the offsite/cloud backup.

(d) RTO (Recovery Time Objective): The maximum acceptable downtime. For a small business, this might be 4 hours -- the business can tolerate being without the system for up to 4 hours. RPO (Recovery Point Objective): The maximum acceptable data loss. If the business performs daily backups at midnight, the RPO is 24 hours -- in the worst case, up to 24 hours of data could be lost. To reduce RPO, the business could perform hourly incremental backups, reducing the maximum data loss to 1 hour.

Question 5: Big Data and Ethics

A social media company analyses user posts, likes, and browsing behaviour to build detailed user profiles. These profiles are used to show targeted advertisements.

(a) Explain how this scenario relates to each of the 5 V's of big data.

(b) Discuss two ethical concerns arising from this practice.

(c) Suggest two measures the company could take to address these ethical concerns.

Answer:

(a) Volume: The company processes billions of posts, likes, and interactions from millions of users worldwide. Velocity: Data is generated continuously in real time as users post, like, and browse. The company must process this data quickly to serve relevant ads. Variety: Data includes text (posts, comments), images, videos, timestamps, location data, and behavioural metadata -- a mix of structured, semi-structured, and unstructured data. Veracity: User data may be inaccurate (fake accounts, bots, misleading posts) or incomplete (users who rarely engage). The company must filter noise from genuine signals. Value: The analysed profiles generate significant advertising revenue by enabling precisely targeted marketing.

(b) Ethical concern 1: Informed consent and transparency. Users may not fully understand how their data is being collected, analysed, and used for profiling. Privacy policies are often long and complex, and users may not realise the extent of profiling.

Ethical concern 2: Discrimination and manipulation. Targeted advertising based on detailed profiles can be used to manipulate behaviour (e.g., showing job ads only to certain demographics) or exploit vulnerable users (e.g., targeting gambling ads at individuals with behavioural patterns suggesting addiction).

(c) Measure 1: Provide clear, concise privacy notices that explain in plain language what data is collected, how it is used for profiling, and how users can opt out of targeted advertising.

Measure 2: Implement data minimisation -- only collect and retain the data necessary for the stated purpose, and allow users to view, correct, and delete their profile data. Provide an easy-to-use privacy dashboard.