A word in a language that is adopted from another language.
What is borrowing or a loanword?
The name of the most well-known specific method to train static (non-contextual) word embeddings.
What is word2vec or specifically skipgram?
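The core of skip-gram training is predicting context words from a center word. A minimal sketch (not the original word2vec implementation, which also involves negative sampling and learned vectors) of how the (center, context) training pairs are generated from a toy corpus:

```python
def skipgram_pairs(tokens, window=2):
    """Yield (center, context) pairs within a symmetric window around each token."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:  # a word is not its own context
                pairs.append((center, tokens[j]))
    return pairs

# With window=1, each word pairs with its immediate neighbours only.
pairs = skipgram_pairs(["the", "cat", "sat"], window=1)
```

In full word2vec, these pairs feed a shallow network whose input weights become the word embeddings.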
A type of transfer learning where the parameters of the model are not changed, but the model sees a few examples of the task.
What is few-shot learning?
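Since no parameters change in few-shot learning, the examples reach the model purely through its input. A hedged sketch of how such a prompt might be assembled (the `Input:`/`Label:` template is an illustrative assumption; the model call itself is omitted):

```python
def few_shot_prompt(examples, query):
    """Build a prompt from a few labelled examples plus the new, unlabelled input."""
    lines = [f"Input: {x}\nLabel: {y}" for x, y in examples]
    lines.append(f"Input: {query}\nLabel:")  # model completes the final label
    return "\n\n".join(lines)

prompt = few_shot_prompt(
    [("good film", "positive"), ("dull plot", "negative")],
    "great acting",
)
```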
The general method of automatically or manually attaching labels to natural language data.
What is annotation?
A resource with one entry per general language word with all its senses, grammatical information, etc. in alphabetical order.
What is a dictionary?
The phenomenon where a person switches between two or more languages within the same conversation or even sentence.
What is code-switching?
Word embeddings that are available in more than one language, but equivalent words are not in the same vicinity in vector space.
What are multilingual word embeddings?
The general name of the method of training a language model that can then be used for transfer learning.
What is pretraining?
The name of the task to obtain domain-specific words from natural language text.
What is term extraction?
A resource with domain-specific terms, their equivalent terms in other languages, and relations between entries.
What is a terminology?
A language variety where languages are mixed but there are no first language speakers of the mix.
What is a pidgin?
A vector space in which words across languages are in a similar vicinity if their meaning is similar.
What are crosslingual embeddings?
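"Similar vicinity" is usually measured with cosine similarity. A toy sketch with hypothetical 2-dimensional vectors (real crosslingual embeddings have hundreds of dimensions): a word and its translation should score higher than an unrelated pair.

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

# Hypothetical shared space: English "dog" near German "Hund", far from "car".
emb = {"dog": [0.9, 0.1], "Hund": [0.85, 0.15], "car": [0.1, 0.9]}
sim_translation = cosine(emb["dog"], emb["Hund"])
sim_unrelated = cosine(emb["dog"], emb["car"])
```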
The type of pretrained language model architecture that the Generative Pretrained Transformers (GPTs) use, but the BERT models do not.
What is a decoder-only model?
The task of extracting multiple types of information on a specific situation from natural language text, e.g. news.
What is event extraction?
A collection of a limited set of predefined words, mostly in one language, for specific communication settings.
What is a controlled vocabulary?
Content words in the core vocabulary of languages that derive from the same original or ancestral language.
What are cognates?
The representation of entities and their relations in vector space.
What are knowledge graph embeddings?
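One concrete instance of knowledge graph embeddings is TransE, which represents a relation as a translation in vector space: for a true triple (head, relation, tail), the vectors should satisfy h + r ≈ t. A minimal scoring sketch with toy vectors:

```python
def transe_score(h, r, t):
    """TransE distance ||h + r - t||: smaller means the triple is more plausible."""
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)) ** 0.5

# Toy vectors: the "correct" tail [1, 1] satisfies h + r = t exactly.
good = transe_score([1.0, 0.0], [0.0, 1.0], [1.0, 1.0])
bad = transe_score([1.0, 0.0], [0.0, 1.0], [0.0, 0.0])
```

Training then pushes scores of true triples below those of corrupted ones.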
The general method of adapting a pretrained language model to a downstream task, adapting all of its parameters.
What is fine-tuning?
The type of lexico-semantic relation that links a general category to a more specific instance of this category, e.g. vehicle-car.
What is hypernymy?
A controlled vocabulary in a usually bilingual list of corresponding words.
What is a glossary?
The phenomenon where a speaker switches the language in a conversation from one sentence to the next, but not within sentences.
What is inter-sentential code-switching?
The process of aligning existing, independently trained monolingual embeddings across languages by means of specifically learned matrices.
What is projection-based alignment?
The specific method for adapting a pretrained language model for a downstream task, where only a small proportion of the parameters are adapted.
What is Low-Rank Adaptation (LoRA)?
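The idea behind LoRA: keep the pretrained weight matrix W frozen and learn only a low-rank update B·A (rank r much smaller than the matrix dimensions), so the effective weight is W + B·A. A toy 2×2 sketch with rank 1 (real LoRA also scales the update by alpha/r and applies it inside attention layers):

```python
def matmul(A, B):
    """Plain list-of-lists matrix multiplication."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

W = [[1.0, 0.0], [0.0, 1.0]]   # frozen pretrained weight (never updated)
B = [[1.0], [0.0]]             # trainable, shape (2, 1): rank r = 1
A = [[0.0, 0.5]]               # trainable, shape (1, 2)

delta = matmul(B, A)           # low-rank update, shape (2, 2)
W_eff = [[W[i][j] + delta[i][j] for j in range(2)] for i in range(2)]
```

For realistic sizes (e.g. 4096×4096 with r=8), the trainable A and B hold far fewer parameters than W itself.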
A specific, language-independent method to artificially create more data by modifying existing data.
What is data augmentation?
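One language-agnostic augmentation is randomly swapping two token positions, in the spirit of "easy data augmentation" techniques; it needs no language-specific resources, so it works across languages. A minimal sketch:

```python
import random

def swap_augment(tokens, seed=0):
    """Create a new training example by swapping two random token positions."""
    rng = random.Random(seed)  # seeded for reproducibility
    out = list(tokens)
    i, j = rng.sample(range(len(out)), 2)  # two distinct positions
    out[i], out[j] = out[j], out[i]
    return out

# Works on any language's token sequence, e.g. German here.
augmented = swap_augment(["ich", "mag", "Kaffee"], seed=0)
```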
A collection of articles providing summaries of knowledge, including historical details, either in general or on a particular field.
What is an encyclopedia?