
MALBERT

Malagasy Language BERT. Strongly inspired by codertimo/BERT-pytorch, but built on PyTorch's integrated transformer module.

Quickstart

**NOTICE:** Your corpus should contain one sentence per line.

0. Prepare your corpus

Put train.txt, test.txt, and valid.txt in the dataset/corpus folder.
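If your sentences are not yet split into the three files, the following minimal sketch shows one way to produce them. It is not part of this repository: the function name `write_corpus_splits`, the 10% validation and test fractions, and the fixed seed are all assumptions chosen for illustration. Only the file names and the dataset/corpus folder come from the step above.

```python
import random
from pathlib import Path


def write_corpus_splits(sentences, out_dir="dataset/corpus",
                        valid_frac=0.1, test_frac=0.1, seed=0):
    """Hypothetical helper: shuffle sentences and write train.txt,
    valid.txt and test.txt with one sentence per line."""
    # Collapse internal whitespace and newlines so every sentence
    # occupies exactly one line, then drop empty entries.
    cleaned = [" ".join(s.split()) for s in sentences]
    cleaned = [s for s in cleaned if s]

    # Shuffle deterministically so the split is reproducible.
    random.Random(seed).shuffle(cleaned)

    n_valid = int(len(cleaned) * valid_frac)
    n_test = int(len(cleaned) * test_frac)
    splits = {
        "valid.txt": cleaned[:n_valid],
        "test.txt": cleaned[n_valid:n_valid + n_test],
        "train.txt": cleaned[n_valid + n_test:],
    }

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, lines in splits.items():
        text = "".join(line + "\n" for line in lines)
        (out / name).write_text(text, encoding="utf-8")

    # Return the number of sentences written to each file.
    return {name: len(lines) for name, lines in splits.items()}
```

Call it with any iterable of sentence strings, for example `write_corpus_splits(my_sentences)`, and adjust the fractions to suit the size of your corpus.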

1. Pretrain model

$ python3 main.py

Dependencies

  • python 3.8
  • torch >= 1.4
  • tokenizers

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update tests as appropriate.

License

MIT