The first step to build a basic probabilistic spellchecker should be to ___.
count misspellings in all available texts
What is the difference between an abjad and an abugida?
Abjads represent consonants while abugidas represent syllables
Hoow mnay isolated ssspelling errors are in this questions?
3
A graph used to display and measure spoken language is called a...?
Spectrogram
The earliest example of a dialogue system
ELIZA
What does the following equation represent?
P(B|A) = (P(A and B))/(P(A))
Conditional Probability
What writing system is used for Chinese?
Logographic
Which probabilistic algorithm for analysis or generation is the basis of most style checkers, large language models (LLMs) such as ChatGPT, and predictive text apps?
n-grams
What are the three articulatory features of a consonant sound?
Manner of articulation, voicing, place of articulation
(T/F) A collection of stories is an example of unstructured or semi-structured data.
True
What type of language model was the following sentence most likely generated by:
Hill late speaks; or! he a more you to leg first less enter
Unigram
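To see why unigram output reads as word salad, here is a minimal sketch of a unigram generator. It draws each word independently, in proportion to its frequency, with no regard for the previous word. The function name `unigram_sample` and the toy corpus are assumptions made for illustration, not part of the original card.

```python
import random
from collections import Counter

def unigram_sample(corpus_words, n, seed=0):
    """Draw n words independently, each in proportion to its corpus frequency."""
    counts = Counter(corpus_words)
    vocab = list(counts)
    weights = [counts[w] for w in vocab]
    rng = random.Random(seed)  # fixed seed so repeated runs give the same sample
    return rng.choices(vocab, weights=weights, k=n)

# Hypothetical toy corpus, for illustration only.
corpus = "he speaks late or he speaks first and you enter less".split()
sample = unigram_sample(corpus, 8)
print(" ".join(sample))
```

Because no word depends on its neighbors, the output has plausible vocabulary but no grammatical structure, which is the signature of the sentence in the card.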
What are the TWO basic types of writing systems?
meaning- or word-based (logographic)
sound- or letter-based
Assume that someone types "hte" when they meant to type "the". What type of string edit operation does this exemplify?
transposition
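A transposition swaps two adjacent letters. The check below is one possible sketch of how a program might recognize that case; the helper name `is_transposition` is hypothetical.

```python
def is_transposition(typed, intended):
    """Return True if typed differs from intended by one swap of adjacent letters."""
    if len(typed) != len(intended) or typed == intended:
        return False
    # Positions where the two strings disagree.
    diffs = [i for i, (a, b) in enumerate(zip(typed, intended)) if a != b]
    if len(diffs) != 2 or diffs[1] != diffs[0] + 1:
        return False
    i, j = diffs
    # The two mismatched letters must be each other's counterparts.
    return typed[i] == intended[j] and typed[j] == intended[i]

print(is_transposition("hte", "the"))  # True
print(is_transposition("tha", "the"))  # False: this is a substitution
```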
The rise and fall in a speaker's pitch (frequency) is called...?
Intonation
The four Gricean maxims, or rules, of conversation are based on what principle?
The cooperative principle: the speakers are trying to cooperate in the conversation.
What does the following equation represent?
P(A|B) = (P(A and B))/(P(B))
Conditional Probability
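The formula can be checked on a small sample space. This sketch assumes a fair six-sided die, with A = "the roll is even" and B = "the roll is greater than 4"; the events and helper names are chosen only for illustration.

```python
from fractions import Fraction

outcomes = range(1, 7)                   # fair six-sided die (assumed)
A = {n for n in outcomes if n % 2 == 0}  # even rolls: {2, 4, 6}
B = {n for n in outcomes if n > 4}       # rolls above 4: {5, 6}

def prob(event):
    """Probability of an event when all outcomes are equally likely."""
    return Fraction(len(event), len(outcomes))

def conditional(x, given):
    """P(x | given) = P(x and given) / P(given)."""
    return prob(x & given) / prob(given)

print(conditional(A, B))  # 1/2
print(conditional(B, A))  # 1/3
```

The two results differ because the denominator changes with the conditioning event, so P(A|B) and P(B|A) are not interchangeable.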
How many possible unique characters can ASCII encode?
128
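The figure follows from ASCII being a 7-bit code: 2^7 = 128 code points, numbered 0 through 127. A short sketch that checks this using Python built-ins only:

```python
ASCII_BITS = 7
code_points = range(2 ** ASCII_BITS)   # 0 through 127
print(len(code_points))                # 128
print(ord("A"), ord("a"))              # 65 97
print("the".isascii(), "é".isascii())  # True False
```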
Define "parsing" in the context of grammar checkers.
Determining or annotating the structure of a sentence, usually through hierarchical "trees"
When speech samples are converted into measurable units, it is known as...?
acoustic signal processing
A chatbot is different from a voice assistant in that it is...
usually a limited-domain dialogue system that is not designed to accomplish a particular task
Calculate the transition probability of the letter bigram "wi". If the first letter, w (the context), appears 30 times, and i (the phenomenon of interest) appears 6 times following a w, what is p(i|w)?
p(i|w) = 6/30 = 0.2
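The same calculation can be run over raw text by counting letter bigrams. This is a minimal sketch: the function name `transition_prob` and the sample string are hypothetical. It counts the context only where it is followed by another character, so a context letter at the very end of the text is not counted.

```python
from collections import Counter

def transition_prob(text, context, target):
    """Estimate p(target | context) = count(context, target) / count(context, anything)."""
    pairs = Counter(zip(text, text[1:]))  # adjacent character pairs
    context_total = sum(c for (first, _), c in pairs.items() if first == context)
    if context_total == 0:
        return 0.0
    return pairs[(context, target)] / context_total

print(6 / 30)  # 0.2, the figure from the card

# Hypothetical sample text: "w" starts five bigrams, two of which are "wi".
text = "wi wa we wi wo"
print(transition_prob(text, "w", "i"))  # 0.4
```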
What standard is used to encode (American) English letters plus some punctuation and control codes?
American Standard Code for Information Interchange (ASCII)
What are most useful for probabilistic ranking of spelling suggestions?
transition matrix and confusion matrix
Speech sounds have properties that can be described with acoustic (sound wave) features and articulatory (manipulation of the tongue, etc.) features. Which type of feature is more easily quantified and measured for the computer to process?
Acoustic features
(T/F) When someone makes a request to turn off the lights by saying "Can you find the lightswitch?" this is an uncooperative violation of the Maxim of Manner.
False