What is the correct way to perform gradient clipping in PyTorch?
I have an exploding gradients problem, and I need to program my way around it.
A more complete example:
optimizer.zero_grad()
loss, hidden = model(data, hidden, targets)
loss.backward()
# Clip after backward() (so gradients exist) and before step() (which uses them):
# this rescales all gradients in place so their combined norm is at most args.clip.
torch.nn.utils.clip_grad_norm_(model.parameters(), args.clip)
optimizer.step()
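If it helps to see the pattern end to end, here is a minimal, self-contained sketch. The linear model, the dummy data, and the max_norm value of 1.0 are all made-up placeholders for illustration, not values from the snippet above:

import torch
import torch.nn as nn

# Toy setup -- the model, data, and threshold here are illustrative only.
model = nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
criterion = nn.MSELoss()

inputs = torch.randn(32, 10)
targets = torch.randn(32, 1)

optimizer.zero_grad()
loss = criterion(model(inputs), targets)
loss.backward()
# clip_grad_norm_ returns the total norm before clipping, which is handy for logging
total_norm = torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()

Note that clip_grad_norm_ rescales all gradients together based on their global norm. If you instead want to cap each gradient element independently, torch.nn.utils.clip_grad_value_(model.parameters(), clip_value) clamps every gradient entry into [-clip_value, clip_value].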