A word in a language that is adopted from another language.
What is borrowing or a loanword?
The name of the most well-known specific method to train global word embeddings.
What is word2vec or specifically skipgram?
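The core idea of skip-gram can be sketched in a few lines: each word is paired with the words inside a small window around it, and those (target, context) pairs become the training examples from which word2vec learns its vectors. This is only the pair-generation step, not the full training procedure; the sentence and window size are illustrative.

```python
# Skip-gram pair generation: each target word predicts the words
# within a fixed-size window around it.
def skipgram_pairs(tokens, window=2):
    pairs = []
    for i, target in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if j != i:
                pairs.append((target, tokens[j]))
    return pairs

sentence = "the cat sat on the mat".split()
pairs = skipgram_pairs(sentence, window=1)
```

With a window of 1, every word is paired with its immediate neighbours, so "cat" yields ("cat", "the") and ("cat", "sat").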
A type of transfer learning where the parameters of the model are not changed; instead, the model is shown a few examples of the task in its input.
What is few-shot learning?
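Since few-shot learning updates no weights, the whole "adaptation" lives in the prompt. A minimal sketch of assembling such a prompt; the sentiment task and example pairs are invented for illustration.

```python
# Few-shot learning changes no model parameters: the task is conveyed
# purely through labeled examples placed in the model's input.
def few_shot_prompt(examples, query):
    lines = [f"Input: {x}\nLabel: {y}" for x, y in examples]
    lines.append(f"Input: {query}\nLabel:")   # model completes the label
    return "\n\n".join(lines)

demo = [("great movie", "positive"), ("terrible plot", "negative")]
prompt = few_shot_prompt(demo, "wonderful acting")
```

The model then continues the text after the final "Label:", inferring the task from the two demonstrations alone.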
The general method of automatically or manually attaching labels to natural language data.
What is annotation?
A resource with one entry per general language word with all its senses, grammatical information, etc. in alphabetical order.
What is a dictionary?
The phenomenon where a person switches between two or more languages within the same conversation or even sentence.
What is code-switching?
Word embeddings that are available in more than one language, but where equivalent words are not necessarily in the same vicinity in vector space.
What are multilingual word embeddings?
The general name of the method of training a language model that can then be used for transfer learning.
What is pretraining?
The name of the task of obtaining domain-specific words from natural language text.
What is term extraction?
A resource with domain-specific terms, their equivalent terms in other languages, and relations between entries.
What is a terminology?
Content words in the core vocabulary of languages that derive from the same original or ancestral language.
What are cognates?
A vector space in which words across languages are in a similar vicinity if their meaning is similar.
What are crosslingual embeddings?
The specific type of pretrained language model variant (architecture) that the Generative Pretrained Transformers (GPTs) use, but not the BERT models.
What is a decoder-only model?
The task of extracting multiple types of information on a specific situation from natural language text, e.g. news.
What is event extraction?
A collection of a limited set of predefined words, mostly in one language, for specific communication settings.
What is a controlled vocabulary?
A language variety where languages are mixed but there are no first language speakers of the mix.
What is a pidgin?
The representation of entities and their relations in vector space.
What are knowledge graph embeddings?
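One well-known family of knowledge graph embeddings, TransE, scores a triple (head, relation, tail) by how close head + relation lands to tail in vector space; true triples should score near zero. The two-dimensional entity and relation vectors below are toy values.

```python
# TransE-style scoring: a triple (h, r, t) is plausible when the
# distance between h + r and t is small.
def transe_score(h, r, t):
    return sum((hi + ri - ti) ** 2 for hi, ri, ti in zip(h, r, t)) ** 0.5

paris   = [0.9, 0.1]          # toy entity vectors
france  = [1.0, 1.0]
capital = [0.1, 0.9]          # toy vector for the relation "capital_of"

plausible   = transe_score(paris, capital, france)    # small distance
implausible = transe_score(paris, capital, [5.0, 5.0])
```

In a real system these vectors are learned so that observed triples get low scores and corrupted ones get high scores.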
The general method of adapting a pretrained language model to a downstream task, adapting all of its parameters.
What is fine-tuning?
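The defining property of fine-tuning is that every parameter is updated by gradient descent on the downstream task. A deliberately tiny stand-in: a "pretrained" linear model whose weights all move in one training step (real fine-tuning uses the same principle on millions of parameters).

```python
# Fine-tuning updates *all* parameters on the downstream task; here a
# toy linear model takes one gradient-descent step on squared error.
def predict(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def finetune_step(w, x, y, lr=0.1):
    err = predict(w, x) - y                      # d(err^2)/dw = 2*err*x
    return [wi - lr * 2 * err * xi for wi, xi in zip(w, x)]

w_pretrained = [0.5, -0.2]                       # stands in for pretrained weights
w = finetune_step(w_pretrained, x=[1.0, 1.0], y=1.0)
```

After the step, the model's error on this example is smaller than before, and both weights have changed.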
The type of lexico-semantic relation between a general category and a more specific instance of that category, e.g. vehicle and car.
What is hypernymy?
A list of words, usually with the corresponding word in another language next to each one.
What is a glossary?
The two main types of factors that drive language change according to the typologies of contact-induced language change.
What are social and linguistic factors?
The method of aligning existing, separately trained monolingual embeddings across languages by means of learned transformation matrices.
What is projection-based alignment?
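A stripped-down sketch of the idea: given embeddings X of source-language words and embeddings Y of their translations, learn a matrix W such that XW ≈ Y. Here W is fit by ordinary least squares on hand-made 2-d vectors; real projection-based methods work on much higher dimensions and often constrain W to be orthogonal (Procrustes).

```python
# Learn a projection matrix W mapping source embeddings onto target
# embeddings via least squares: W = (X^T X)^-1 X^T Y (2-d toy case).
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(r) for r in zip(*A)]

def inv2(M):                         # inverse of a 2x2 matrix
    (a, b), (c, d) = M
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def learn_projection(X, Y):          # rows are embeddings of translation pairs
    Xt = transpose(X)
    return matmul(inv2(matmul(Xt, X)), matmul(Xt, Y))

# Target vectors are the source vectors rotated 90 degrees;
# the learned W recovers that rotation.
X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
Y = [[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0]]
W = learn_projection(X, Y)
```

Once W is learned from a seed dictionary of translation pairs, any source-language vector can be projected into the target space.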
The specific method for adapting a pretrained language model to a downstream task where only a small proportion of the parameters are adapted.
What is Low-Rank Adaptation (LoRA)?
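The mechanism behind LoRA in miniature: the pretrained weight matrix W stays frozen, and only a low-rank update B·A is trained, so the effective weight becomes W + BA with far fewer trainable parameters. The dimensions, rank, and values below are toy choices for illustration.

```python
# LoRA: freeze the pretrained weight W and train only the low-rank
# factors B (d x r) and A (r x d), giving an effective weight W + BA.
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def add(A, B):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(A, B)]

d, r = 4, 1
W = [[1.0 if i == j else 0.0 for j in range(d)] for i in range(d)]  # frozen
B = [[0.1], [0.0], [0.0], [0.0]]          # d x r, trainable
A = [[0.0, 0.2, 0.0, 0.0]]                # r x d, trainable
W_eff = add(W, matmul(B, A))              # W + BA

trainable = d * r + r * d                 # 8 parameters instead of d*d = 16
```

With rank r much smaller than d, the trainable parameter count shrinks from d² to 2dr, which is why LoRA is cheap to train and store.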
A specific method to artificially create more data, usable across languages, by modifying existing data.
What is data augmentation?
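One language-agnostic augmentation operation (here random token swapping, one of several such perturbations) creates a new training sentence from an existing one without needing any language-specific resources. The Spanish example sentence is illustrative.

```python
import random

# Token-level augmentation that works in any language: randomly swap
# two tokens to create a perturbed copy of an existing sentence.
def random_swap(tokens, rng):
    out = list(tokens)
    i, j = rng.sample(range(len(out)), 2)   # two distinct positions
    out[i], out[j] = out[j], out[i]
    return out

rng = random.Random(0)                      # seeded for reproducibility
sent = "la casa es muy grande".split()
augmented = random_swap(sent, rng)
```

The augmented sentence keeps exactly the same tokens in a different order, giving the model a cheap extra training example.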
A collection of articles providing summaries of knowledge, including historical details, either in general or on a particular field.
What is an encyclopedia?