fixed .gitignore add readme.md

Setra Solofoniaina 2021-04-02 10:10:11 +03:00
parent 4b6c65862b
commit 00fcc2633b
2 changed files with 32 additions and 1 deletions

3
.gitignore vendored

@ -4,4 +4,5 @@
/src/model/__pycache__
/src/output/*
/src/trainer/__pycache__
/.vscode
/.vscode
/src/model/embedding/__pycache__

30
README.MD Normal file

@ -0,0 +1,30 @@
# MALBERT
Malagasy Language BERT. Strongly inspired by [codertimo/BERT-pytorch](https://github.com/codertimo/BERT-pytorch), but built on PyTorch's integrated transformer module.
## Quickstart
**NOTICE:** Your corpus should contain one sentence per line.
### 0. Prepare your corpus
Put `train.txt`, `test.txt`, and `valid.txt` in the `dataset/corpus` folder.
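
If your raw text is not yet split into these three files, the sketch below shows one possible way to produce them. It is not part of this repository: the helper names `split_sentences` and `write_corpus`, the 80/10/10 split ratios, and the naive punctuation-based sentence splitter are all assumptions made for illustration, and the splitter may need adjusting for your text.

```python
import random
import re
from pathlib import Path


def split_sentences(text):
    """Naively split text after '.', '!' or '?' followed by whitespace.

    Hypothetical helper: real corpora may need a proper sentence tokenizer.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    # Collapse internal whitespace so each sentence fits on a single line.
    return [" ".join(p.split()) for p in parts if p.strip()]


def write_corpus(sentences, out_dir="dataset/corpus",
                 ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle sentences and write train.txt, test.txt and valid.txt.

    The ratios are an assumption (train, test, valid); whatever remains
    after the train and test slices goes to valid.txt.
    """
    sentences = list(sentences)
    random.Random(seed).shuffle(sentences)  # deterministic for a given seed
    n_train = int(len(sentences) * ratios[0])
    n_test = int(len(sentences) * ratios[1])
    splits = {
        "train.txt": sentences[:n_train],
        "test.txt": sentences[n_train:n_train + n_test],
        "valid.txt": sentences[n_train + n_test:],
    }
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name, lines in splits.items():
        # One sentence per line, as the notice above requires.
        content = "".join(line + "\n" for line in lines)
        (out / name).write_text(content, encoding="utf-8")
    return {name: len(lines) for name, lines in splits.items()}
```

For example, `write_corpus(split_sentences(Path("raw.txt").read_text(encoding="utf-8")))` would write the three files into `dataset/corpus`, where `raw.txt` is a placeholder for your own source file.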
### 1. Pretrain model
```
$ python3 main.py
```
## Dependencies
* python 3.8
* torch >= 1.4
* tokenizers
## Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
## License
[MIT](https://choosealicense.com/licenses/mit/)