Neural network outline. The input layer has one neuron for each dimension of the combined vectors of the token window (each vector has length d). The hidden and output layers are as in conventional MLP networks, with n
and |U| neurons, respectively. For simplicity, biases and some connections are omitted.
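The layer sizes described in this caption can be illustrated with a minimal forward-pass sketch. It is not the authors' implementation: it assumes that "combined" means concatenating the window's vectors, and the tanh hidden activation, softmax output, weight initialization, and all names and sizes (`window_size`, `num_outputs` standing in for |U|, and so on) are hypothetical choices made only for illustration.

```python
import math
import random


def init_layer(n_in, n_out, rng):
    """Return (weights, biases) for a fully connected layer (assumed init scheme)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def affine(weights, biases, x):
    """Compute W x + b for one input vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def softmax(z):
    """Numerically stable softmax (assumed output activation)."""
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]


def forward(window_vectors, hidden, output):
    """Map a token window to a distribution over the output neurons."""
    # Input layer: one neuron per dimension of the combined window vectors
    # (assumption: "combined" means concatenated).
    x = [value for vec in window_vectors for value in vec]
    # Hidden layer with n neurons (tanh is an assumed activation).
    h = [math.tanh(v) for v in affine(hidden[0], hidden[1], x)]
    # Output layer with |U| neurons.
    return softmax(affine(output[0], output[1], h))


# Hypothetical sizes, chosen only to make the sketch runnable.
window_size, d, n, num_outputs = 3, 4, 5, 6

rng = random.Random(0)
hidden = init_layer(window_size * d, n, rng)   # input: window_size * d neurons
output = init_layer(n, num_outputs, rng)       # output: |U| neurons

window = [[rng.uniform(-1.0, 1.0) for _ in range(d)]
          for _ in range(window_size)]
probs = forward(window, hidden, output)
```

With these sizes the input layer has 3 * 4 = 12 neurons, the hidden layer 5, and the output layer 6; `probs` holds one value per output neuron.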