Data Types
Data Dilemmas
Data Security
Data Uses
Data Integrity
100

The traffic light I am driving toward has turned red. Which part of DIKW does this belong to?


Knowledge.

100

What are the two challenges associated with manual data entry that impact both data integrity and data reliability?

The proneness to error and accidental data deletion.

Manual data entry is prone to errors, and accidental deletions can compromise both data integrity and reliability.

100

A protocol developed for sending information securely over the Internet by using an encrypted link between a web server and a browser.

What is this?

Secure Sockets Layer (SSL)

or Transport Layer Security (TLS)
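As a minimal sketch of how a client might ask for such an encrypted link, the block below uses Python's standard `ssl` module. SSL itself is deprecated and TLS is its successor, so the sketch sets TLS 1.2 as the minimum version; that floor and the helper name `make_tls_context` are assumptions chosen for illustration, not part of the original answer.

```python
import ssl


def make_tls_context() -> ssl.SSLContext:
    """Build a client-side context that verifies the server's certificate."""
    # create_default_context() turns on certificate and hostname verification.
    context = ssl.create_default_context()
    # Refuse the old SSL versions and early TLS versions (assumed policy).
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    return context


context = make_tls_context()
```

A browser or script would then wrap its network socket with this context before exchanging any data with the web server.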

100

Big Data is characterized by the 4 Vs.

Please identify the 4 Vs.



Volume, velocity, variety, veracity

100

The count of students in a classroom.

The height of a person.

Which one is continuous data, and which one is discrete data?

Discrete: Count of students in a classroom


Continuous: Height of a person

200

Please list 2 characteristics of qualitative data other than being non-statistical and typically unstructured or semi-structured.

  • Data is not measured using numbers to derive a conclusion

  • Based on data properties, attributes, labels, etc.

  • Used as a start for asking “why” questions

  • Deals with non-numerical data (concepts, descriptions, meanings, words, etc.)

200

Why might relying on "facts" from two decades ago present challenges, and what impact does it have on decision-making?

Relying on outdated data poses challenges as it may not accurately reflect current conditions, trends, or circumstances. This can lead to uninformed decisions, misguided policies, and an inaccurate understanding of the present reality.

200

How does a degausser work to erase data?

A degausser generates a strong magnetic field that disrupts the magnetic structure holding the data, effectively "scrambling" the information stored on the magnetic medium.

200

Please provide two examples of real-world use of Big Data.

1. Healthcare Analytics: Analyzing electronic health records to predict disease risk factors and optimize treatment strategies.

2. Financial Fraud Detection: Monitoring credit card transactions to identify unusual spending patterns indicative of fraudulent activity.

3. Smart Cities Management: Analyzing traffic patterns, public transportation usage, and energy consumption data to enhance city planning and resource allocation.

200

What is the difference between a primary key and foreign key in a relational database? Why do we need them both?

Daily Double:

  • The primary key uniquely identifies records within its own table.

  • The foreign key establishes a link between tables by referencing the primary key of another table.

  • Together, they maintain data integrity and support relationships between tables in a relational database.
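
The points above can be sketched with Python's built-in `sqlite3` module. The `students` and `enrollments` tables and their contents are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this setting is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Primary key: student_id uniquely identifies each row of students.
conn.execute(
    "CREATE TABLE students ("
    " student_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL)"
)
# Foreign key: enrollments.student_id must match an existing students row.
conn.execute(
    "CREATE TABLE enrollments ("
    " enrollment_id INTEGER PRIMARY KEY,"
    " student_id INTEGER NOT NULL REFERENCES students(student_id),"
    " course TEXT NOT NULL)"
)

conn.execute("INSERT INTO students (student_id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO enrollments (student_id, course) VALUES (1, 'Databases')")

# The key pair lets us join the two tables back together.
rows = conn.execute(
    "SELECT s.name, e.course FROM students s"
    " JOIN enrollments e ON e.student_id = s.student_id"
).fetchall()

# An enrollment that points at a student who does not exist is rejected.
try:
    conn.execute("INSERT INTO enrollments (student_id, course) VALUES (99, 'Networks')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True
```

The rejected insert is the data-integrity part: the database refuses a record whose foreign key has no matching primary key.
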
300

Please list 3 methods of quantitative data collection.

1. Surveys/Questionnaires: Surveys involve a set of questions presented to participants, who provide responses based on their opinions, experiences, or characteristics.

2. Experiments: Experimental research involves manipulating one or more variables to observe the effect on another variable. 

3. Causal-Comparative Observational Studies: Quantitative data is obtained by counting occurrences, measuring durations, or recording numerical values during observations.

300

How can wearable device manufacturers build and maintain trust in the accuracy of health data? Provide three pros and three cons.

Daily Double

Pros:

  1. Transparent Data Handling: Manufacturers can build trust by being transparent about how health data is collected, processed, and shared, fostering user confidence.
  2. Regular Software Updates: Providing regular updates and security patches demonstrates a commitment to maintaining the integrity and security of the device and its data.
  3. Third-Party Audits: Independent audits by trusted third parties can validate the accuracy and security measures of health data, enhancing user trust.

Cons:

  1. Limited Control over Third-Party Apps: If the wearable device integrates with third-party apps, manufacturers may have limited control over how those apps handle health data, creating potential trust issues.
  2. Data Breach Risks: Despite efforts, the risk of data breaches always exists, and a single breach can significantly erode user trust in the security of health data.
  3. Changing Regulations: Evolving privacy regulations may require manufacturers to adapt their practices, and any perceived lag in compliance can impact user trust.

300

Please compare and contrast symmetric key encryption and public key (asymmetric) encryption.

Give two points for each.

Symmetric: 

Fast, but key distribution is less secure because the sender and receiver share the same key.

Efficient for bulk data encryption.

Asymmetric: 

Uses a key pair: a public key that can be shared openly and a private key that stays secret.

More secure for key distribution, because the private key never has to be shared.

Commonly used for secure communication, digital signatures, and key exchange.
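
The contrast can be illustrated with a toy sketch. Both halves are for teaching only and are not secure: the symmetric half uses a simple XOR with a random shared key, and the asymmetric half uses textbook RSA with tiny primes. The message, the primes and the variable names are all assumptions made for illustration.

```python
import secrets


# --- Symmetric (toy): the SAME key encrypts and decrypts ---
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte."""
    return bytes(d ^ k for d, k in zip(data, key))


message = b"meet at noon"
shared_key = secrets.token_bytes(len(message))  # both parties must hold this
ciphertext = xor_bytes(message, shared_key)
recovered = xor_bytes(ciphertext, shared_key)   # same key reverses the XOR

# --- Asymmetric (toy RSA): a public key encrypts, a private key decrypts ---
p, q = 61, 53                  # tiny primes, far too small for real use
n = p * q                      # part of both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent: (e, n) is the public key
d = pow(e, -1, phi)            # private exponent: (d, n) is the private key

secret_number = 65
encrypted = pow(secret_number, e, n)   # anyone can do this with the public key
decrypted = pow(encrypted, d, n)       # only the private key holder can undo it
```

Note that `pow(e, -1, phi)` needs Python 3.8 or later. Real systems rely on vetted libraries and much larger keys.
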



300

Explain two different kinds of metadata and their uses with real-life examples.

Descriptive metadata describes the content of the data, making it easier to identify and locate.
For a book, it includes information like the title, author, and summary.

Structural metadata outlines the relationships between different pieces of data and how they form a larger structure.
In a video file, it identifies the chapters or timestamps, allowing users to navigate through the video.

Administrative metadata provides information about the creation, maintenance, and usage of the data.
In a digital image file, administrative metadata includes details about the camera settings, date of creation, and photographer's name.

Rights metadata specifies information about intellectual property rights, permissions, and restrictions associated with the data.
In a digital document, rights metadata indicates whether the document is under copyright and any usage restrictions.
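
As a small, hedged illustration, every file on a computer already carries some metadata of its own. The sketch below creates a temporary file (the name `notes.txt` and its contents are made up) and reads its size and last-modified time with `os.stat`. Treating these file-system details as administrative metadata is an assumption of this sketch.

```python
import os
import tempfile
from datetime import datetime, timezone

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "notes.txt")
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("Hello, metadata!")

    # os.stat returns data ABOUT the file, not the file's contents.
    info = os.stat(path)
    size_in_bytes = info.st_size
    last_modified = datetime.fromtimestamp(info.st_mtime, tz=timezone.utc)
```
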

300

What components together make up a relational database?

Tables of data made up of rows and columns

Fields, primary key, foreign key

There also needs to be a relationship established between the tables.

400

Once data has been collected, it must be processed from its raw format.

Which stage of data lifecycle does this belong to? 

What are the 3 methods that can be used during this stage?

Process/usage stage:

  1. Data Wrangling: Raw data cleaned and transformed into useful and accessible data

  2. Data Compression: Data transformed into a format that can be efficiently stored

  3. Data Encryption: Data transformed into another code to prevent easy access
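
The first two methods can be sketched with Python's standard library. The sample rows and the `wrangle` helper are hypothetical. Encryption is left out because the standard library has no general-purpose cipher.

```python
import json
import zlib

# Raw data as it might arrive: stray spaces, inconsistent case, a missing value.
raw_rows = [
    {"name": "  Ada ", "height_cm": "170"},
    {"name": "Grace", "height_cm": ""},
    {"name": "alan", "height_cm": "183.5"},
]


def wrangle(rows):
    """Clean the rows: drop incomplete ones, tidy names, convert numbers."""
    cleaned = []
    for row in rows:
        if not row["height_cm"]:
            continue  # skip rows with a missing measurement
        cleaned.append(
            {
                "name": row["name"].strip().title(),
                "height_cm": float(row["height_cm"]),
            }
        )
    return cleaned


clean_rows = wrangle(raw_rows)

# Compression: repeat the rows to imitate a larger data set, then shrink it.
serialized = json.dumps(clean_rows * 100).encode("utf-8")
compressed = zlib.compress(serialized)
restored = json.loads(zlib.decompress(compressed))
```

Compression here is lossless: decompressing gives back exactly the data that went in, only the stored form is smaller.
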

400

When a fitness tracking app collects detailed personal data, what should users question and consider from an ethical standpoint?

Users should question the extensive data collection practices of the fitness tracking app, which include recording activities, locations, and health metrics. How well the company secures that data is also a concern.

The ethical concerns revolve around issues of privacy, consent, and the responsible use of sensitive personal information.

Users should debate whether the benefits provided by the app justify the extent of data collection and whether the company is transparent about its practices.

400

Discuss the significance of implementing data encryption in a business environment.

Daily Double

  • Protection of Sensitive Information: Encryption serves as a crucial safeguard for sensitive data, preventing unauthorized access and ensuring confidentiality.

  • Legal Compliance: Many industries are subject to data protection laws. By employing encryption, organizations fulfill legal requirements, avoiding penalties and maintaining compliance.

  • Preventing Data Breaches: In the event of a security breach, encrypted data remains unreadable, mitigating the impact and reducing the risk of unauthorized exploitation.

400

Please explain the steps of how a deepfake is created with data.

1. Data Collection: Gather a substantial amount of video footage featuring the target individual.

2. Training the Neural Network: The collected data is fed into a deep neural network, which allows the algorithm to understand and mimic the subject's facial features, expressions, and vocal patterns.

3. Algorithm Integration with Graphics Technology: The trained learning algorithm is combined with computer graphics technologies to overlay AI-generated facial and vocal patterns onto real-time video footage.

400

Please explain the 3 methods of weakening Data Integrity.

Commission: Creating data from nothing (e.g., making up lab results for an experiment you never actually performed, which is lying).

Omission: Removing, reducing, or erasing data (e.g., cherry-picking, fake news that mixes legitimate and false information, propaganda).

Manipulation: Changing or modifying the data to make it more appealing (e.g., gaslighting, editing an essay, Photoshop, Snapchat filters).



500

Evaluate the significance of data archiving/backup in the broader context of the data lifecycle. Please provide one real-world example.

  • Ensuring Data Integrity: Backups play a crucial role in preserving data integrity by providing a means to recover information in case of corruption, accidental deletion, or other forms of data loss.

  • Disaster Recovery: In the event of a catastrophic failure, such as hardware malfunction or natural disasters, backups serve as a key component of disaster recovery strategies, facilitating the restoration of operations.

  • Long-Term Preservation: Data backups contribute to the long-term preservation of essential information, preventing loss due to evolving technologies, hardware upgrades, or software changes.

  • Supporting Regulatory Compliance: In industries governed by data protection regulations, maintaining regular backups is often a compliance requirement, ensuring that organizations can recover and protect sensitive information as mandated by law.

  • Facilitating Data Migration: During transitions to new systems or platforms, backups ease the process of data migration, enabling the transfer of information without compromising its integrity or risking loss.
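
The recovery idea behind the first two points can be sketched as follows: copy a file to a backup location, confirm the copy matches using a SHA-256 checksum, then restore it after a simulated accidental deletion. The file name `grades.csv`, its contents and the helper names are hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file's contents."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def back_up(source: Path, backup_dir: Path) -> Path:
    """Copy source into backup_dir and check that the copy is identical."""
    backup_dir.mkdir(parents=True, exist_ok=True)
    destination = backup_dir / source.name
    shutil.copy2(source, destination)
    if sha256_of(source) != sha256_of(destination):
        raise IOError("backup does not match the original")
    return destination


with tempfile.TemporaryDirectory() as folder:
    original = Path(folder) / "grades.csv"
    original.write_text("student,grade\nAda,95\n", encoding="utf-8")

    backup = back_up(original, Path(folder) / "backup")

    original.unlink()               # simulate accidental deletion
    shutil.copy2(backup, original)  # restore from the backup
    restored_text = original.read_text(encoding="utf-8")
```

In practice the backup would live on a different device or site, since a copy in the same folder would not survive a hardware failure.
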

500

To what extent should social media platforms be responsible for exposing users to diverse viewpoints?

Pros

Minimizes Echo Chambers: Social media platforms that encourage diverse content reduce the risk of users being confined to echo chambers.

Mitigates Polarization: By promoting diverse viewpoints, social media platforms can contribute to reducing societal polarization. 

Cons

Technical Challenges: The algorithms that determine content visibility may unintentionally reinforce certain viewpoints.

Concerns about Censorship: Implementing measures to expose users to diverse viewpoints could be perceived as a form of censorship.

User Resistance: Some users may resist exposure to diverse viewpoints, leading them to quit the social media platform.

500

To what extent does Blockchain impact our digital society?

Pros:

  • Extremely secure: changes require consensus from a large group of participants

  • Transparent: once data is recorded, it is publicly visible

  • It’s collaborative (p2p)

  • It’s not regulated by a centralized entity, so users have a lot more freedom

Cons:

  • Immutable: once recorded, data cannot be corrected or removed, even if it is wrong

  • Creates a digital divide, because mining requires investing in hardware such as a GPU

  • Inefficient, as it takes time to mine each block

  • Anonymous, so malicious users can use it to their advantage

500

Explain the fundamental concepts of a blockchain. How does it achieve decentralization and data integrity? Provide a real-world example.

  • Blockchain achieves decentralization through a network of nodes that reach consensus.
  • Data integrity is ensured through cryptographic hashing and consensus mechanisms.
  • The Bitcoin blockchain ensures secure and transparent financial transactions.
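
The hashing part of the answer can be sketched as a tiny hash chain. It is a single-machine illustration only: there is no network, mining or consensus, and the block layout, function names and sample transactions are assumptions made for illustration.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index, data, previous_hash):
    """Hash a block's contents together with the previous block's hash."""
    payload = json.dumps(
        {"index": index, "data": data, "previous_hash": previous_hash},
        sort_keys=True,
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def add_block(chain, data):
    """Append a block that is linked to the current last block."""
    previous_hash = chain[-1]["hash"] if chain else GENESIS_HASH
    index = len(chain)
    chain.append(
        {
            "index": index,
            "data": data,
            "previous_hash": previous_hash,
            "hash": block_hash(index, data, previous_hash),
        }
    )


def is_valid(chain):
    """Recompute every hash and check every link between blocks."""
    for i, block in enumerate(chain):
        expected_previous = chain[i - 1]["hash"] if i > 0 else GENESIS_HASH
        if block["previous_hash"] != expected_previous:
            return False
        if block["hash"] != block_hash(block["index"], block["data"], block["previous_hash"]):
            return False
    return True


chain = []
add_block(chain, "Alice pays Bob 5")
add_block(chain, "Bob pays Carol 2")
valid_before = is_valid(chain)

chain[0]["data"] = "Alice pays Bob 500"  # tamper with an old record
valid_after = is_valid(chain)
```

Because each block stores the hash of the block before it, changing an old record breaks the check unless every later block is also rewritten, which a real network's consensus rules are designed to prevent.
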
500

Please differentiate between Misinformation, Disinformation and Malinformation.

Misinformation:
Inaccurate but not necessarily intentional.
May cause confusion but not necessarily harm.
Unintentional spread.

Disinformation:
Deliberately false information.
Intended to mislead or manipulate, potentially causing harm.
Deliberate intent to deceive.

Malinformation:
True information used for harmful purposes.
Intentionally harmful, targeting reputation or well-being.
Intent to cause harm using true information.