Commit 6d41c99 · 1 Parent(s): 1c4513b
Changed distillation URL
README.md CHANGED
@@ -187,7 +187,7 @@ The details of the masking procedure for each sentence are the following:
 ### Pretraining
 
 The model was trained on 8 16 GB V100 for 90 hours. See the
-[training code](https://github.com/huggingface/transformers/tree/
+[training code](https://github.com/huggingface/transformers/tree/main/examples/research_projects/distillation) for all hyperparameters
 details.
 
 ## Evaluation results