Questions from the in-class quizzes
Clicking 'Incorrect' marks the question as wrong, which will cause it to appear again later.
It is recommended to click the button above each question that displays the text left-to-right.
Explain the intuition behind distributional methods - that is, why do we believe that solving the task of predicting a word given its context yields embeddings which capture lexical semantics?
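One way to build that intuition is a toy count-based example (the corpus, window size, and word choices below are made up for illustration, not part of the original question): words that appear in similar contexts end up with similar context-count vectors, so context prediction implicitly pushes semantically related words together.

```python
import math
from collections import Counter

# Toy illustration of the distributional hypothesis: build bag-of-words
# context vectors from raw co-occurrence counts in a tiny made-up corpus.
corpus = ("the cat drinks milk . the dog drinks water . "
          "the cat chases the dog .").split()

def context_vector(word, window=2):
    """Count the words appearing within `window` tokens of each occurrence."""
    counts = Counter()
    for i, w in enumerate(corpus):
        if w == word:
            lo, hi = max(0, i - window), i + window + 1
            counts.update(c for c in corpus[lo:hi] if c != word)
    return counts

def cosine(u, v):
    dot = sum(u[k] * v[k] for k in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv)

cat, dog, milk = (context_vector(w) for w in ("cat", "dog", "milk"))
# "cat" and "dog" occur in near-identical contexts, so their vectors are
# closer to each other than either is to "milk".
print(cosine(cat, dog), cosine(cat, milk))
```

Neural embedding methods can be seen as learning a compressed version of such co-occurrence statistics.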
Describe what is the Continuous Bag of Words (CBoW) model for learning word embeddings. List two ways in which the task modeled by CBoW is different from learning an n-gram language model.
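A minimal sketch of the CBoW forward pass may help here (a toy setup with assumed vocabulary and random weights, not the original word2vec implementation): the context embeddings are averaged into one hidden vector, which is projected to a distribution over the vocabulary to predict the center word.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = ["john", "eats", "an", "apple", "."]
word2id = {w: i for i, w in enumerate(vocab)}
V, d = len(vocab), 8

E = rng.normal(size=(V, d))  # input (context) embedding matrix
W = rng.normal(size=(d, V))  # output projection to vocabulary logits

def cbow_probs(context_words):
    """Average the context embeddings, then softmax over the vocabulary."""
    ids = [word2id[w] for w in context_words]
    h = E[ids].mean(axis=0)       # a *bag* of words: order-insensitive
    logits = h @ W
    exps = np.exp(logits - logits.max())
    return exps / exps.sum()

# Predict the center word of "john [eats] an" from its two-sided context.
p = cbow_probs(["john", "an"])
```

Note how this already exposes two contrasts with an n-gram language model: the context comes from both sides of the target word, and averaging discards word order within the context window.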
ADJ ADP ADV
CCONJ SCONJ
NOUN PROPN PRON
DET NUM
VERB AUX
PART PUNCT SYM INTJ X
What would be the perplexity of the POS tagging task given a uniform distribution over the universal POS tagset?
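The tagset listed above has 17 tags, so this can be checked directly: under a uniform distribution the entropy is log2(17) and the perplexity 2^H recovers 17 (a quick sanity-check computation, not from the original notes).

```python
import math

# The 17 universal POS tags, as grouped in the lists above.
tags = ("ADJ ADP ADV CCONJ SCONJ NOUN PROPN PRON DET NUM "
        "VERB AUX PART PUNCT SYM INTJ X").split()
n = len(tags)

# Uniform distribution: each tag has probability 1/n.
entropy = -sum((1 / n) * math.log2(1 / n) for _ in tags)
perplexity = 2 ** entropy  # equals n for a uniform distribution
print(n, perplexity)
```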
Give an example word for each of the following classes:
ADJ
AUX
PRON
CCONJ
ADV
*My addition:* think of an example word for each of the classes:
ADJ ADP ADV
CCONJ SCONJ
NOUN PROPN PRON
DET NUM
VERB AUX
PART PUNCT SYM INTJ X
Tag the following sentence in the format "John/PROPN eats/VERB an/DET apple/NOUN ./PUNCT"
County officials in Maryland miscalculated how many ballots
they would need on Election Day -- and quickly
ran out in more than a dozen precincts .
Consider a distribution over two discrete variables x and y, displayed in the following figure: a matrix view of two random variables.
*In the original question there is a table of x/y values.*
Provide formulas to compute:
Joint probability:
Marginal probability for x:
Conditional probability of x given y:
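For reference, the standard definitions (indices i, j are assumed notation for the rows and columns of the original table, which is not reproduced here):

```latex
\begin{align*}
P(x = x_i, y = y_j) &= \text{the entry in cell } (i, j) \text{ of the table} \\
P(x = x_i) &= \sum_j P(x = x_i, y = y_j) \\
P(x = x_i \mid y = y_j) &= \frac{P(x = x_i, y = y_j)}{P(y = y_j)}
  = \frac{P(x = x_i, y = y_j)}{\sum_{i'} P(x = x_{i'}, y = y_j)}
\end{align*}
```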