GPT3Trainer: GPT3ForTextGeneration: CUDA out of memory. Tried to allocate 100.00 MiB (GPU 0; 22.20 GiB total capacity; 5.90 GiB already allocated; 70.12 MiB free; 5.90 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF — When training with ModelScope, what parameters should I set to resolve this error? I'm running in JupyterLab on the PAI platform.
Try lowering cfg.train.dataloader.batch_size_per_gpu, or train on multiple GPUs. (Answer compiled from the DingTalk group "ModelScope Developer Alliance Group ①".)
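A minimal sketch of how the batch-size change can be applied. ModelScope trainers accept a `cfg_modify_fn` callback that edits the training config before the trainer is built; the demo below uses a `SimpleNamespace` stand-in for the real Config object so it runs without `modelscope` installed, and the value `1` is just an illustrative starting point — raise it as far as memory allows.

```python
import os
from types import SimpleNamespace

# Optional, per the error message's hint: limit allocator split size to reduce
# fragmentation. Set this BEFORE CUDA is initialized. 128 is an example value.
os.environ.setdefault("PYTORCH_CUDA_ALLOC_CONF", "max_split_size_mb:128")

def cfg_modify_fn(cfg):
    # Shrink the per-GPU batch size to lower peak CUDA memory usage.
    cfg.train.dataloader.batch_size_per_gpu = 1
    return cfg

# Stand-in for the real ModelScope Config (attribute access only, for demo):
cfg = SimpleNamespace(
    train=SimpleNamespace(dataloader=SimpleNamespace(batch_size_per_gpu=8))
)
cfg = cfg_modify_fn(cfg)
print(cfg.train.dataloader.batch_size_per_gpu)  # → 1
```

In an actual run you would pass the callback when constructing the trainer, e.g. `build_trainer(..., default_args=dict(..., cfg_modify_fn=cfg_modify_fn))`; check your ModelScope version's docs for the exact signature. If a smaller batch hurts convergence, gradient accumulation (when the trainer supports it) can restore the effective batch size at no extra memory cost.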