BERTForMaskedLM¶
- class lucid.models.BERTForMaskedLM(config: BERTConfig)¶
The BERTForMaskedLM class attaches a masked language modeling head to the BERT backbone.
Class Signature¶
class BERTForMaskedLM(config: BERTConfig)
Parameters¶
config (BERTConfig): BERT configuration for masked language modeling.
Methods¶
- BERTForMaskedLM.forward(input_ids: LongTensor | None = None, attention_mask: Tensor | None = None, token_type_ids: LongTensor | None = None, position_ids: LongTensor | None = None, inputs_embeds: FloatTensor | None = None) → Tensor
Compute token logits over the vocabulary for each sequence position.
- BERTForMaskedLM.get_loss(labels: Tensor, input_ids: LongTensor | None = None, attention_mask: Tensor | None = None, token_type_ids: LongTensor | None = None, position_ids: LongTensor | None = None, inputs_embeds: FloatTensor | None = None, *, ignore_index: int = -100, reduction: str | None = 'mean') → Tensor
Compute masked language modeling loss from token labels.
- BERTForMaskedLM.create_masked_lm_inputs(input_ids: Tensor, attention_mask: Tensor | None = None, special_tokens_mask: Tensor | None = None, *, mask_token_id: int = 103, mlm_probability: float = 0.15, mask_replace_prob: float = 0.8, random_replace_prob: float = 0.1, ignore_index: int = -100) → tuple[Tensor, Tensor]
Build masked inputs and labels using the standard BERT MLM masking policy: each selected position is replaced with the mask token, a random token, or left unchanged (80%/10%/10% by default).
- BERTForMaskedLM.predict_token_ids(input_ids: LongTensor | None = None, attention_mask: Tensor | None = None, token_type_ids: LongTensor | None = None, position_ids: LongTensor | None = None, inputs_embeds: FloatTensor | None = None) → Tensor
Return argmax token predictions per position.
- BERTForMaskedLM.get_accuracy(labels: Tensor, input_ids: LongTensor | None = None, attention_mask: Tensor | None = None, token_type_ids: LongTensor | None = None, position_ids: LongTensor | None = None, inputs_embeds: FloatTensor | None = None, *, ignore_index: int = -100) → Tensor
Compute token-level accuracy while ignoring masked-out label indices.
- BERTForMaskedLM.get_loss_from_text(tokenizer: BERTTokenizerFast, text_a: str, text_b: str | None = None, *, device: Literal['cpu', 'gpu'] = 'cpu', mask_token_id: int | None = None, mlm_probability: float = 0.15, mask_replace_prob: float = 0.8, random_replace_prob: float = 0.1, ignore_index: int = -100, reduction: str | None = 'mean') → Tensor
Compute MLM loss directly from raw text (with internal masking preparation).
- BERTForMaskedLM.predict_token_ids_from_text(tokenizer: BERTTokenizerFast, text_a: str, text_b: str | None = None, *, device: Literal['cpu', 'gpu'] = 'cpu') → Tensor
Predict token IDs directly from raw text input.
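The masking policy behind create_masked_lm_inputs can be illustrated with a small, framework-free sketch. This is not lucid's implementation, just the standard BERT rule applied to plain Python lists: sample positions with probability mlm_probability, label only those positions (everything else gets ignore_index), then replace the selected token with [MASK] 80% of the time, a random vocabulary token 10% of the time, and keep it unchanged otherwise. The mask_tokens helper and its constants are illustrative names, not part of the lucid API.

```python
import random

MASK_TOKEN_ID = 103   # [MASK] in the standard BERT WordPiece vocab
IGNORE_INDEX = -100   # positions with this label are skipped by the loss

def mask_tokens(token_ids, vocab_size, *, mlm_probability=0.15,
                mask_replace_prob=0.8, random_replace_prob=0.1,
                rng=random):
    """Apply the standard BERT MLM policy to a list of token IDs.

    Returns (masked_ids, labels). A position is selected with probability
    mlm_probability; selected positions carry the original token as their
    label, all others carry IGNORE_INDEX.
    """
    masked = list(token_ids)
    labels = [IGNORE_INDEX] * len(token_ids)
    for i, tok in enumerate(token_ids):
        if rng.random() >= mlm_probability:
            continue                      # position not selected
        labels[i] = tok                   # model must predict the original
        roll = rng.random()
        if roll < mask_replace_prob:
            masked[i] = MASK_TOKEN_ID     # 80%: replace with [MASK]
        elif roll < mask_replace_prob + random_replace_prob:
            masked[i] = rng.randrange(vocab_size)  # 10%: random token
        # remaining 10%: leave the token unchanged
    return masked, labels
```

The mixed replacement strategy keeps the pretraining inputs from always containing [MASK] (which never appears at fine-tuning time), so the model cannot rely on the mask token alone to locate the positions it must predict.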
Examples¶
>>> import lucid.models as models
>>> model = models.bert_for_masked_lm_base()
>>> print(model)
BERTForMaskedLM(...)
>>> # input_ids: a LongTensor of token IDs with shape (batch_size, seq_len)
>>> masked_input_ids, labels = model.create_masked_lm_inputs(input_ids)
>>> loss = model.get_loss(labels=labels, input_ids=masked_input_ids)
>>> acc = model.get_accuracy(labels=labels, input_ids=masked_input_ids)
>>> tokenizer = models.BERTTokenizerFast.from_pretrained(".data/bert/pretrained")
>>> loss = model.get_loss_from_text(
... tokenizer=tokenizer,
... text_a="Machine learning helps us build useful systems.",
... text_b="Tokenization quality strongly affects language model performance.",
... device="gpu",
... )
>>> pred_ids = model.predict_token_ids_from_text(
... tokenizer=tokenizer,
... text_a="Machine learning is [MASK].",
... device="gpu",
... )