Data Analysis
Impact of Computing
Fall Semester Vocab
Spring Semester Vocab
Blast From the Past
100

What is big data?

Big data refers to large, diverse sets of information that grow at ever-increasing rates.

100

Unequal access to the internet

Digital Divide

100

Bit vs Byte

  • Binary: A way of representing information using only two options.

  • Bit: A contraction of "Binary Digit";  the single unit of information in a computer, typically represented as a 0 or 1

  • Byte:  8 bits
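As a rough illustration of these terms, the Python sketch below writes one number out as the 8 bits of a single byte. The value 65 and the variable names are chosen only for this example.

```python
# One byte holds 8 bits, so it can represent 2**8 = 256 different values.
BITS_PER_BYTE = 8
values_per_byte = 2 ** BITS_PER_BYTE

# Write the number 65 as 8 binary digits (bits).
number = 65
bits = format(number, "08b")

print(values_per_byte)  # 256
print(bits)             # 01000001
```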

100

Variable vs List

  • Variable: a named reference to a value that can be used repeatedly throughout a program.

  • List: an ordered collection of elements
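A minimal Python sketch of the difference, using made-up names (score, scores) that are not part of the original card:

```python
# A variable is a named reference to a single value.
score = 10
score = score + 5      # the same name can be reused throughout the program

# A list is an ordered collection of elements, accessed by position.
scores = [10, 15, 20]
first = scores[0]      # Python counts positions starting from 0
scores.append(25)      # a list can grow as elements are added

print(score)   # 15
print(first)   # 10
print(scores)  # [10, 15, 20, 25]
```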


100

How do the World Wide Web and the Internet work together?


200

What is a computing innovation and what are the 3 categories of computing innovations?

A computing innovation is an innovation that includes a computer or program code as an integral part of its functionality.

  • Physical (e.g., GPS)

  • Nonphysical computing software (e.g., an app)

  • Nonphysical computing concept (e.g., social networking)

200

What is Moore's Law?





Moore's law is the observation, first made by Gordon Moore in 1965 and revised by him in 1975, that the number of transistors in a dense integrated circuit (IC) doubles about every two years.


200

Packet vs Bandwidth

  • Packet: A chunk of data sent over a network. Larger messages are divided into packets that may arrive at the destination in order, out-of-order, or not at all. 

  • Bandwidth: the maximum amount of data that can be sent in a fixed amount of time, usually measured in bits per second. 
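The two ideas can be sketched in Python as follows. This is a simplified model, not how a real network protocol is implemented: the helper names, the 4-character packet size, and the 1,000 bits-per-second link are all assumptions made for illustration.

```python
import random

def split_into_packets(message, size):
    """Divide a message into (sequence_number, chunk) packets."""
    return [(i, message[start:start + size])
            for i, start in enumerate(range(0, len(message), size))]

def reassemble(packets):
    """Put packets back in order using their sequence numbers."""
    return "".join(chunk for _, chunk in sorted(packets))

packets = split_into_packets("Hello, Internet!", 4)
random.shuffle(packets)          # packets may arrive out of order
message = reassemble(packets)

# Bandwidth: sending 8,000 bits over a 1,000 bits-per-second link
# takes at least 8 seconds.
seconds = 8000 / 1000
```

Numbering each packet is what lets the receiver rebuild the message even when the packets arrive out of order.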


200

Heuristic solution

  • Heuristic: a shortcut for solving a problem quickly that delivers a result good enough to be useful given time constraints, even if it is not the best possible result.

200

wordList is a list of words that currently contains the values ["tree", "rock", "air"]

Which of the following lines will result in the list containing the values ["air", "rock", "air"]?

wordList[0] = wordList[2]
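The answer can be checked with a short Python sketch. It assumes zero-based indexing, which the answer above appears to assume as well.

```python
wordList = ["tree", "rock", "air"]

# Copy the value at index 2 ("air") into index 0, replacing "tree".
wordList[0] = wordList[2]

print(wordList)  # ['air', 'rock', 'air']
```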

300

What is the relationship between Machine Learning and Neural Networks?

Machine learning (ML) is a type of algorithm that automatically improves itself based on experience, not by a programmer writing a better algorithm. 

A neural network is one kind of machine learning model. Computer programmers don't actually program each neuron. Instead, they train the neural network using a massive amount of labeled data.

300

What is the difference between Crowdfunding and Crowdsourcing?


Crowdfunding refers to raising funds for a project or venture from a large group of people, usually through online platforms, while crowdsourcing refers to obtaining ideas, solutions, or services from a large and diverse group of people, also often through online communities.


300

What is the relationship between Redundancy and Fault Tolerance?

  • Redundancy: the inclusion of extra components so that a system can continue to work even if individual components fail, for example by having more than one path between any two connected devices in a network.

  • Fault Tolerant: Can continue to function even in the event of individual component failures. This is important because elements of complex systems like a computer network fail at unexpected times, often in groups.

300

Defined function vs called function vs function with a parameter 

  • Function: a named group of programming instructions. Also referred to as a “procedure”.

  • Function Call: a command that executes the code within a function

  • Parameter: a variable in a function definition. Used as a placeholder for values that will be passed into the function.
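A small Python sketch showing all three terms together. The function name greet and its parameter name are invented for this example.

```python
# Function definition: "greet" is the function, "name" is its parameter.
def greet(name):
    return "Hello, " + name + "!"

# Function calls: each call passes a value in for the parameter.
first = greet("Ada")
second = greet("Grace")

print(first)   # Hello, Ada!
print(second)  # Hello, Grace!
```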


300

How are procedures a form of abstraction?

A procedure can be used multiple times without knowing exactly how it works.

400

On June 22, 1944, the U.S. introduced the G.I. Bill, a law that provided many benefits to war veterans, including college tuition.

Cornell University has been tracking enrollment numbers since its inception. This table shows enrollment in the 10-year period from 1940 to 1950, broken down by gender:

Which hypothesis is most consistent with the data?

The G.I. Bill led to a large increase in male enrollment.

This is the hypothesis that's most supported by the data.


400

What is a symmetric key used for?

Encrypting and decrypting data 
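To make the "one key for both directions" idea concrete, here is a toy Python sketch using XOR. It is only an illustration and is not secure for real use; the function name, key, and message are made up.

```python
def xor_with_key(data: bytes, key: bytes) -> bytes:
    """XOR each byte of data with the repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

key = b"secret"

# The same key and the same operation are used in both directions.
ciphertext = xor_with_key(b"meet at noon", key)   # encrypt
plaintext = xor_with_key(ciphertext, key)         # decrypt
```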

400

3 types of algorithms 

  • Algorithm: a finite set of instructions that accomplish a task.

  • Sequencing: putting steps in an order.

  • Selection: deciding which steps to do next.

  • Iteration: doing some steps over and over.
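The three building blocks can appear together in a single short algorithm. The Python sketch below uses a hypothetical function, count_passing, with made-up scores.

```python
def count_passing(scores, passing_score):
    count = 0                        # sequencing: steps run in order
    for score in scores:             # iteration: repeat for every score
        if score >= passing_score:   # selection: decide what to do next
            count = count + 1
    return count

result = count_passing([55, 70, 90, 40], 60)
print(result)  # 2
```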

400

Cleaning Data vs Data Filtering

Cleaning Data: a process that makes the data uniform without changing its meaning (e.g., replacing all equivalent abbreviations, spellings, and capitalizations with the same word). 

Data filtering: choosing a smaller subset of a data set to use for analysis, for example by eliminating / keeping only certain rows in a table 
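A minimal Python sketch of the two steps, using a small invented table of names and states. The SPELLINGS lookup is an assumption about which spellings count as equivalent.

```python
rows = [
    {"name": "Ana", "state": "CA"},
    {"name": "Ben", "state": "calif."},
    {"name": "Cy", "state": "NY"},
    {"name": "Dee", "state": "California"},
]

# Cleaning: make equivalent spellings uniform without changing meaning.
SPELLINGS = {"ca": "CA", "calif.": "CA", "california": "CA", "ny": "NY"}
cleaned = [{**row, "state": SPELLINGS[row["state"].lower()]} for row in rows]

# Filtering: keep only the rows needed for the analysis.
filtered = [row for row in cleaned if row["state"] == "CA"]
```

Cleaning keeps every row and only standardizes values, while filtering reduces the number of rows.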


400

A movie can be played in a movie theatre and on a cell phone. Is displaying the same video on these different screens an example of lossy or lossless compression? Why?

Lossless; video is being compressed to fit into a smaller screen but still shows the same data

500

Brittany is using machine learning for an algorithm that classifies social media posts according to their sentiment ("positive", "negative", or "neutral").

She trains a neural network on a large open database of social media posts and tests the network on her personal social media feed. She notices that it's mis-classifying the posts from her teenage friends, who use different slang from her other friends.

What's the best way that Brittany can improve the machine learning algorithm's ability to classify posts from teenagers?

She can add social media posts from teenagers into the training data set, both from her own network and globally available data.


Machine learning algorithms can fail to work on real-world data when their training data set does not fully match the diversity of the real world. In a case like this, the training data set needs to be augmented with the type of input that it's missing.


500

An advertisement company builds a profile of a user based on their browsing history across many websites and uses that profile to create more targeted advertisements.

Which technology enables the company to aggregate the user's browsing history across multiple sites?

Cookies

500

What is the difference between a Computing System and a Computing Network?

  • Computing System: a group of computing devices and programs working together for a common purpose

  • Computing Network: a group of interconnected computing devices capable of sending or receiving data. 

500

Symmetric Key Encryption vs Public Key Encryption

  • Symmetric Key Encryption: involves one key for both encryption and decryption.

  • Public Key Encryption: pairs a public key for encryption and a private key for decryption. The sender does not need the receiver’s private key to encrypt a message, but the receiver’s private key is required to decrypt the message.
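The public and private key pairing can be sketched in Python with a toy RSA-style example. The tiny primes and the message value are chosen only for illustration; real systems use far larger numbers and extra safeguards.

```python
# Toy key pair built from two tiny primes (real keys use enormous ones).
p, q = 61, 53
n = p * q                  # 3233, shared by both keys
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent: the public key is (e, n)
d = 2753                   # private exponent: the private key is (d, n)
assert (e * d) % phi == 1  # the two exponents undo each other

message = 65
ciphertext = pow(message, e, n)    # anyone can encrypt with the public key
decrypted = pow(ciphertext, d, n)  # only the private key can decrypt
```

Unlike symmetric encryption, the value used to encrypt (e) cannot be used to decrypt, so the public key can be shared openly.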

500

An original file is 250 MB. It is compressed to 50 MB (including the dictionary). What is its compression rate?

80%
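The arithmetic behind this answer, as a short Python sketch. It assumes "compression rate" here means the percentage of the original size that was saved.

```python
original_mb = 250
compressed_mb = 50

# Space saved, as a percentage of the original size.
savings_percent = (original_mb - compressed_mb) * 100 / original_mb

print(savings_percent)  # 80.0
```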