Multi-label future token prediction head

Use a multi-label classifier head to predict all future tokens at each position.
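
One possible reading of this idea, sketched below with only the Python standard library: for each position, the target is a multi-hot vector over the vocabulary marking every token that occurs later in the sequence, and the head is trained with an independent sigmoid and binary cross-entropy per vocabulary entry. The function names `future_token_targets` and `bce_with_logits`, the "set of later tokens" target definition, and the choice of loss are assumptions made for illustration, not something the note specifies.

```python
import math


def future_token_targets(tokens, vocab_size):
    """For each position t, build a multi-hot vector marking every token id
    that occurs at some position strictly after t (assumed target definition)."""
    targets = [None] * len(tokens)
    seen = [0.0] * vocab_size  # token ids seen so far while scanning right to left
    for t in range(len(tokens) - 1, -1, -1):
        targets[t] = seen.copy()  # everything seen so far lies after position t
        seen[tokens[t]] = 1.0
    return targets


def bce_with_logits(logits, targets):
    """Mean binary cross-entropy, one independent sigmoid per vocabulary entry."""
    total = 0.0
    for z, y in zip(logits, targets):
        # Numerically stable form: max(z, 0) - z*y + log(1 + exp(-|z|))
        total += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return total / len(logits)


tokens = [2, 0, 3, 0]  # toy sequence over a vocabulary of size 4
targets = future_token_targets(tokens, vocab_size=4)

# Loss at position 0 for a head that outputs all-zero logits (p = 0.5 everywhere).
loss = bce_with_logits([0.0, 0.0, 0.0, 0.0], targets[0])
```

In a real model the logits would come from a linear projection of the hidden state at each position, and this loss could be added as an auxiliary term alongside the usual next-token objective; both of those choices are likewise assumptions rather than part of the original note.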