Posted on 2021-11-25 20:19:18
-------------------- Tesla K80, CUDA 11.1 --------------------

Training command:
python tacotron2.py train_dataset=./train.json validation_datasets=./val.json trainer.max_epochs=1500 trainer.accelerator=null trainer.check_val_every_n_epoch=1
Error output:
(myconda) root@032fa32f07bd:/home/nemo/TTS# python tacotron2.py train_dataset=./train.json validation_da.accelerator=null trainer.check_val_every_n_epoch=1
[NeMo W 2021-11-25 12:13:53 optimizers:47] Apex was not found. Using the lamb optimizer will error out.
################################################################################
### WARNING, path does not exist: KALDI_ROOT=/mnt/matylda5/iveselyk/Tools/kaldi-trunk
###          (please add 'export KALDI_ROOT=<your_path>' in your $HOME/.profile)
###          (or run as: KALDI_ROOT=<your_path> python <your_script>.py)
################################################################################

[NeMo W 2021-11-25 12:13:55 experimental:28] Module <class 'nemo.collections.asr.data.audio_to_text_dalidy for production and is not fully supported. Use at your own risk.
coming
GPU available: True, used: True
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
[NeMo I 2021-11-25 12:13:55 exp_manager:271] Experiments will be logged at /home/nemo/TTS/nemo_experimen
[NeMo I 2021-11-25 12:13:55 exp_manager:625] TensorboardLogger has been set up
[NeMo W 2021-11-25 12:13:55 nemo_logging:349] /root/miniconda3/envs/myconda/lib/python3.7/site-packages/41: LightningDeprecationWarning: `ModelCheckpoint(every_n_val_epochs)` is deprecated in v1.4 and will betead.
      "`ModelCheckpoint(every_n_val_epochs)` is deprecated in v1.4 and will be removed in v1.6."

[NeMo I 2021-11-25 12:13:55 collections:173] Dataset loaded with 44 files totalling 0.03 hours
[NeMo I 2021-11-25 12:13:55 collections:174] 0 files were filtered totalling 0.00 hours
[NeMo W 2021-11-25 12:13:55 nemo_logging:349] /root/miniconda3/envs/myconda/lib/python3.7/site-packages/This DataLoader will create 4 worker processes in total. Our suggested max number of worker in current soader is going to create. Please be aware that excessive worker creation might get DataLoader running slid potential slowness/freeze if necessary.
      cpuset_checked))

[NeMo I 2021-11-25 12:13:55 collections:173] Dataset loaded with 4 files totalling 0.00 hours
[NeMo I 2021-11-25 12:13:55 collections:174] 0 files were filtered totalling 0.00 hours
[NeMo W 2021-11-25 12:13:55 nemo_logging:349] /root/miniconda3/envs/myconda/lib/python3.7/site-packages/This DataLoader will create 8 worker processes in total. Our suggested max number of worker in current soader is going to create. Please be aware that excessive worker creation might get DataLoader running slid potential slowness/freeze if necessary.
      cpuset_checked))

[NeMo I 2021-11-25 12:13:55 features:262] PADDING: 16
[NeMo I 2021-11-25 12:13:55 features:279] STFT using torch
LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]
[NeMo I 2021-11-25 12:13:58 modelPT:544] Optimizer config = Adam (
    Parameter Group 0
        amsgrad: False
        betas: (0.9, 0.999)
        eps: 1e-08
        lr: 0.001
        weight_decay: 1e-06
    )
[NeMo I 2021-11-25 12:13:58 lr_scheduler:625] Scheduler "<nemo.core.optim.lr_scheduler.CosineAnnealing o
    will be used during training (effective maximum steps = 1500) -
    Parameters :
    (min_lr: 1.0e-05
    max_steps: 1500
    )

  | Name                       | Type               | Params
------------------------------------------------------------------
0 | audio_to_melspec_precessor | FilterbankFeatures | 0     
1 | text_embedding             | Embedding          | 35.3 K
2 | encoder                    | Encoder            | 5.5 M
3 | decoder                    | Decoder            | 18.3 M
4 | postnet                    | Postnet            | 4.3 M
5 | loss                       | Tacotron2Loss      | 0     
------------------------------------------------------------------
28.2 M    Trainable params
0         Non-trainable params
28.2 M    Total params
112.611   Total estimated model params size (MB)
Validation sanity check:   0%|                                                                          [NeMo W 2021-11-25 12:14:01 patch_utils:50] torch.stft() signature has been updated for PyTorch 1.7+
    Please update PyTorch to remain compatible with later versions of NeMo.
[NeMo W 2021-11-25 12:14:02 nemo_logging:349] /root/miniconda3/envs/myconda/lib/python3.7/site-packages/erWarning: The number of training samples (1) is smaller than the logging interval Trainer(log_every_n_ss if you want to see logs for the training epoch.
      f"The number of training samples ({self.num_training_batches}) is smaller than the logging interva

Epoch 0:   0%|                                                                                          [NeMo W 2021-11-25 12:14:10 nemo_logging:349] /root/miniconda3/envs/myconda/lib/python3.7/site-packages/ctor/result.py:406: LightningDeprecationWarning: One of the returned values {'progress_bar'} has a `gradhaviour will change in v1.6. Please detach it manually: `return {'loss': ..., 'something': something.det
      f"One of the returned values {set(extra.keys())} has a `grad_fn`. We will detach it automatically"

free(): invalid pointer
Aborted (core dumped)
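
The log ends in a native abort (`free(): invalid pointer`), so there is no Python traceback showing which call was running. One possible way to get more context is Python's standard `faulthandler` module, which can dump the Python stack when the process receives a fatal signal such as SIGABRT. The following is a minimal sketch, not something from the original post; the helper name `enable_crash_traceback` and the log path are hypothetical, and the dump shows only the Python frames, not the underlying native cause.

```python
import faulthandler


def enable_crash_traceback(log_path):
    """Dump Python tracebacks to log_path if the process hits a fatal signal.

    The returned file object must stay open for as long as faulthandler
    is enabled, because the handler writes to its file descriptor.
    """
    log_file = open(log_path, "w")
    # Covers SIGSEGV, SIGFPE, SIGABRT, SIGBUS and SIGILL.
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


if __name__ == "__main__":
    # Hypothetical path; call this near the top of the training script.
    crash_log = enable_crash_traceback("crash_traceback.log")
```

The same handler can be switched on without editing the script by running `python -X faulthandler tacotron2.py ...`, in which case the dump goes to stderr.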

Posted on 2021-11-25 20:50:17
emmmm, closing this thread. Swapping the K80 for a 2080 Ti made the problem go away.
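
The thread does not establish why the K80 crashed. One plausible thing to check, stated here as an assumption rather than a diagnosis, is whether the installed PyTorch build ships GPU code for the card's compute capability (3.7 for the Tesla K80, 7.5 for the RTX 2080 Ti). The sketch below compares a device capability against a build's architecture list; the helper names `parse_arch`, `capability_supported`, and `report_local_gpu` are hypothetical. `torch.cuda.get_device_capability` and `torch.cuda.get_arch_list` are real PyTorch calls, though the latter is only present in reasonably recent releases.

```python
import re


def parse_arch(arch):
    """Split 'sm_75' or 'compute_80' into (kind, major, minor); None if unrecognized."""
    match = re.fullmatch(r"(sm|compute)_(\d{2,})", arch)
    if match is None:
        return None
    digits = match.group(2)
    return match.group(1), int(digits[:-1]), int(digits[-1])


def capability_supported(capability, arch_list):
    """Return True if arch_list contains code that can run on the given capability.

    Rules used (the general CUDA compatibility rules):
      * sm_XY binaries run on devices with the same major X and minor >= Y.
      * compute_XY PTX can be JIT-compiled for any capability >= X.Y.
    """
    major, minor = capability
    for arch in arch_list:
        parsed = parse_arch(arch)
        if parsed is None:
            continue  # skip entries this sketch does not understand
        kind, a_major, a_minor = parsed
        if kind == "sm" and a_major == major and a_minor <= minor:
            return True
        if kind == "compute" and (a_major, a_minor) <= (major, minor):
            return True
    return False


def report_local_gpu():
    """Print whether GPU 0 is covered by the installed PyTorch build, if any."""
    try:
        import torch
    except ImportError:
        print("PyTorch is not installed; nothing to check.")
        return
    if not torch.cuda.is_available():
        print("No CUDA device is visible to PyTorch.")
        return
    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    print("device capability:", capability)
    print("build arch list:", arch_list)
    print("covered by this build:", capability_supported(capability, arch_list))


if __name__ == "__main__":
    report_local_gpu()
```

If the check reports that the device is not covered, a PyTorch build that targets that architecture (or a newer card, as in this thread) would be the thing to try. If it reports that the device is covered, the crash has some other cause that this check cannot reveal.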