PyTorch Tutorial

AI拉呱
29 min read · May 5, 2020



  • Torch is a tensor manipulation library scripted in Lua.
  • PyTorch provides two high-level features: tensor computation with strong GPU acceleration (like NumPy), and deep neural networks built on an automatic differentiation (autograd) system.

Basic Tutorial

Core Components

  • Tensors
  • Autograd for tensors
  • The nn.Module class, the base class for building any neural network
  • Optimizers (see the sketch after this list)
    - stochastic gradient descent (SGD)
    - Adam, Adadelta, Adagrad, SparseAdam
    - L-BFGS
    - RMSprop
  • Loss functions
    - binary and multi-class cross entropy
    - mean squared and mean absolute errors
    - smooth L1 loss
    - negative log-likelihood loss
    - Kullback-Leibler divergence
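
These components live in torch.nn and torch.optim. The snippet below is a minimal sketch (not from the original article) of how they are typically instantiated; the Linear layer and the hyperparameters are placeholders:

import torch.nn as nn
import torch.optim as optim

model = nn.Linear(10, 2)                   # any nn.Module works here
criterion = nn.CrossEntropyLoss()          # multi-class cross entropy
optimizer = optim.SGD(model.parameters(), lr=0.01, momentum=0.9)
# the other optimizers are constructed the same way, for example:
# optimizer = optim.Adam(model.parameters(), lr=1e-3)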

Basic Workflow

  • Build the neural network as a custom class (inheriting from nn.Module) that holds the layers as members and defines a forward method that propagates the input tensor through the layers and activation functions
  • Call this forward method to propagate a feature tensor through the network and obtain an output tensor
  • Compute the loss by comparing the output with the ground truth, using one of the built-in loss functions
  • Propagate the gradients of the loss with the backward method, using the automatic differentiation (Autograd) engine
  • Use the gradients of the loss to update the network's weights, which is done by taking one step of an optimizer: optimizer.step(). A minimal sketch of this loop follows.
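
The following is a minimal sketch of one pass through this loop; it assumes that a model, a loss function criterion, and an optimizer have already been created, and that inputs and labels are one batch of data (these names are placeholders, not code from the original article):

optimizer.zero_grad()               # clear gradients left over from the previous step
outputs = model(inputs)             # forward pass through the network
loss = criterion(outputs, labels)   # compare the output with the ground truth
loss.backward()                     # Autograd computes the gradients of the loss
optimizer.step()                    # update the weights with those gradients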

Dataset

  1. Split the data into a training set (train), a validation set (valid) and a test set (test), roughly 8:1:1
  2. Build a txt index file that stores the image paths and their labels
  3. Dataset is the base class PyTorch uses to read data
  4. Build the image index (paths relative to the training .py file)
  5. Write a Dataset subclass (a sketch follows this list)
  6. Data augmentation and data normalization
  7. Data centering: subtract the mean only
  8. Data standardization: subtract the mean, then divide by the standard deviation
  9. The twenty-two transforms methods
  10. Cropping (Crop)
  11. Center crop: transforms.CenterCrop
  12. Random crop: transforms.RandomCrop
  13. Random resized crop: transforms.RandomResizedCrop
  14. Crop the four corners and the center: transforms.FiveCrop
  15. The five crops plus their flipped versions: transforms.TenCrop
  16. Flip and Rotation
  17. Horizontal flip with probability p: transforms.RandomHorizontalFlip(p=0.5)
  18. Vertical flip with probability p: transforms.RandomVerticalFlip(p=0.5)
  19. Random rotation: transforms.RandomRotation
  20. Image transforms
  21. Resize: transforms.Resize
  22. Normalize: transforms.Normalize
  23. Convert to tensor and scale to [0, 1]: transforms.ToTensor
  24. Padding: transforms.Pad
  25. Adjust brightness, contrast and saturation: transforms.ColorJitter
  26. Convert to grayscale: transforms.Grayscale
  27. Linear transformation: transforms.LinearTransformation()
  28. Affine transformation: transforms.RandomAffine
  29. Convert to grayscale with probability p: transforms.RandomGrayscale
  30. Convert the data to a PIL Image: transforms.ToPILImage
  31. transforms.Lambda: apply a user-defined lambda as a transform
  32. Operations on the transforms themselves, for more flexible augmentation
  33. transforms.RandomChoice(transforms): pick one transform from the given list and apply it
  34. transforms.RandomApply(transforms, p=0.5): attach a probability to a transform and apply it with that probability
  35. transforms.RandomOrder: apply the given transforms in a random order
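
A hedged sketch of the steps above (the class name TxtImageDataset, the one-sample-per-line "path label" txt format, and the ImageNet mean/std values are illustrative assumptions, not taken from the original article):

from PIL import Image
from torch.utils.data import Dataset
from torchvision import transforms

class TxtImageDataset(Dataset):
    """Reads lines of the form 'relative/path/to/img.jpg label' from a txt index file."""
    def __init__(self, txt_path, transform=None):
        with open(txt_path) as f:
            self.samples = [line.strip().split() for line in f if line.strip()]
        self.transform = transform

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, index):
        path, label = self.samples[index]
        img = Image.open(path).convert('RGB')
        if self.transform is not None:
            img = self.transform(img)
        return img, int(label)

train_transform = transforms.Compose([
    transforms.RandomResizedCrop(224),
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.ToTensor(),                              # to tensor, scaled to [0, 1]
    transforms.Normalize(mean=[0.485, 0.456, 0.406],    # subtract the mean,
                         std=[0.229, 0.224, 0.225]),    # then divide by the std
])
train_set = TxtImageDataset('train.txt', transform=train_transform)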

Model

Building a Model

  1. First, the class must inherit from nn.Module so that PyTorch knows it is a Module.
  2. Next, set up the required building blocks (conv, pooling, Linear, BatchNorm and so on) in __init__(self).
  3. Finally, assemble those blocks in forward(self, x), like stacking building bricks, and the model is defined.
  4. The first line is the initialization; the following lines define a series of components, such as conv1 built from Conv2d and pool1 built from MaxPool2d. All of these operations are provided by torch.nn; see the torch.nn documentation for the full list. A short sketch of this pattern follows.
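
A minimal sketch of this pattern (the layer sizes assume a 3x32x32 input and are illustrative, not taken from the original article):

import torch.nn as nn
import torch.nn.functional as F

class SimpleNet(nn.Module):
    def __init__(self):
        super(SimpleNet, self).__init__()
        # declare the building blocks in __init__
        self.conv1 = nn.Conv2d(3, 16, 3)        # conv1 built from Conv2d
        self.pool1 = nn.MaxPool2d(2, 2)         # pool1 built from MaxPool2d
        self.fc1 = nn.Linear(16 * 15 * 15, 10)

    def forward(self, x):
        # assemble the blocks in forward(), like stacking bricks
        x = self.pool1(F.relu(self.conv1(x)))
        x = x.view(x.size(0), -1)
        return self.fc1(x)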

Model Finetuning
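
A hedged sketch of a typical finetuning recipe: load pretrained weights, freeze the backbone, then replace and retrain the classification head. The choice of resnet18 and of a 10-class head are illustrative assumptions, not from the original article:

import torch.nn as nn
import torch.optim as optim
from torchvision import models

model = models.resnet18(pretrained=True)           # load pretrained weights
for param in model.parameters():                    # freeze the backbone
    param.requires_grad = False
model.fc = nn.Linear(model.fc.in_features, 10)      # replace the classification head
# only the parameters of the new head are passed to the optimizer
optimizer = optim.SGD(model.fc.parameters(), lr=0.01, momentum=0.9)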

Loss Functions and Optimizers

Simple torch Operations

import torch

t = torch.tensor([[1., -1.], [1., -1.]])
print(t)
print(t.dtype)
print(t.shape)
print(t.device)

Tensors

# -*- coding:utf-8 -*-
# /usr/bin/python
'''
@Author : Errol
@Describe:
@Env :
@Date : -
'''
import torch
import numpy as np
# check whether an object is a tensor
print(torch.is_tensor([[1,2],[2,3]]))
# count the number of elements in a tensor
a = torch.tensor(2)
print(torch.numel(a))
tensor_a = torch.randn(1,2,3)
print(torch.numel(tensor_a))
tensor_b = torch.zeros(4,5)
print(tensor_b,type(tensor_b),type(type(tensor_b)))
# create a 2-D tensor with ones on the diagonal and zeros elsewhere
eyes = torch.eye(4)
print(eyes)
# convert a numpy array to a torch tensor
numpy_arr = np.array([[1,2],[2,3]])
print('numpy_arr',numpy_arr,type(numpy_arr),type(type(numpy_arr)))
arr_tensor = torch.from_numpy(numpy_arr)
print("arr_tensor",arr_tensor,type(arr_tensor),type(type(numpy_arr)))
arr_tensor[(1,-1)] = 5
print(arr_tensor)
# steps points evenly spaced over the interval [start, end]
linspace_tensor = torch.linspace(0,10,steps = 100,)
print(linspace_tensor,type(linspace_tensor),type(type(linspace_tensor)))
print('torch.Size()',linspace_tensor.size())
# steps points evenly spaced on a log scale between 10^start and 10^end; the output is a 1-D tensor of length steps
logs_tensor = torch.logspace(-10,10,steps = 10)
print("logs_tensor",logs_tensor,type(logs_tensor))
# tensor of all ones
ones_tensor = torch.ones((3,3))
print("ones_tensor",ones_tensor,type(ones_tensor),ones_tensor.size())
# tensor of all zeros
zeros_tensor = torch.zeros(2,3)
print('zeros_tensor',zeros_tensor,zeros_tensor.size())
# tensors of random numbers drawn from a standard normal distribution
rand_tensor = torch.randn(4)
print('rand_tensor',rand_tensor,type(rand_tensor))
rand_tensor = torch.randn([4,4])
print('rand_tensor',rand_tensor,type(rand_tensor))
# values at a fixed step, half-open interval [start, end)
arange_rand = torch.arange(1,4,0.2)
print('arange_rand',arange_rand,type(arange_rand),arange_rand.size())
# concatenate tensors
x = torch.randn([2,3])
y = torch.cat((x,x,x,x),0)  # concatenate along dim 0 (rows)
print("y",y.size(),y)
y = torch.cat((x,x,x,x),1)  # concatenate along dim 1 (columns)
print("y",y.size(),y)
# split
z = torch.split(y,(1,1),dim =0)
print("z",z,type(z))
# transpose
y1 = torch.t(y)
print("y1", y1, type(y1),y1.size())
#
# # create a tensor drawn from a normal distribution
# std_tensor = torch.normal(mean=torch.arange(1, 6))
# print('std_tensor',std_tensor,std_tensor.size())
# save an object to disk
torch.save(z,f = './z.pkl')
# load an object from disk
zz = torch.load(f = './z.pkl')
print('zz',zz,type(zz))
# element-wise absolute value
y2 = torch.abs(y1)
print('y2',y2,type(y2),y2.size())
x1 = torch.abs(torch.FloatTensor([-1.1,2.1,-6,-8]))
print(x1,type(x1),x1.size())
# arccosine of each element
a = torch.rand(4)
print("a",a)
a = torch.acos(a)
print("arccosine of a",a)
# cosine of each element
a = torch.rand(4)
print("a",a)
a = torch.cos(a)
print("cosine of a",a)
# hyperbolic cosine of each element
a = torch.rand(4)
print("a",a)
a = torch.cosh(a)
print("hyperbolic cosine of a",a)
# arcsine of each element
a = torch.rand(4)
print("a",a)
a = torch.asin(a)
print("arcsine of a",a)
# arctangent of each element
a = torch.rand(4)
print("a",a)
a = torch.atan(a)
print("arctangent of a",a)
# add a fixed scalar to every element
b = torch.add(a,30)
print('b',b,b.size())
# division: divide every element of the tensor by a scalar
zz = torch.div(b,3)
print("division by a scalar",zz,zz.size())
# division: element-wise division of two tensors
zz = torch.div(b,b)
print("element-wise division of two tensors of the same shape",zz,zz.size())
# fused op: tensor + value * (tensor1 / tensor2)
aa = torch.addcdiv(a, a, b, value=2)
print('aa',aa,aa.size())
# fused op: tensor + value * (tensor1 * tensor2)
bb = torch.addcmul(a, a, b, value=2)
print('bb',bb,bb.size())
# ceiling: round every element up to the nearest integer
a = torch.rand(4)
print("a",a)
a = torch.ceil(a)
print("ceil (round up)",a)
# power: raise each element of the tensor to the given exponent
a = torch.rand(4)
bb = torch.pow(a,4)
print('power',bb,bb.size())
# exponential: e raised to each element of the tensor
exp_one = torch.exp(ones_tensor)
print("exp",exp_one,exp_one.size())
# round to the nearest integer
round_tensor = torch.round(exp_one)
print('round_tensor',round_tensor)
# natural logarithm (negative entries of a standard normal sample give nan)
a = torch.randn(5)
b = torch.log(a)
print('natural logarithm',b,b.size())
# remainder of division
# the divisor and dividend may both contain integers and floats; the sign of the remainder matches the dividend
aa = torch.fmod(torch.Tensor([-3, -2, -1, 1, 2, 3]), 2)
print('fmod remainder',aa)
# mean
mean_aa = torch.mean(aa,0)
print('mean',mean_aa)
# median
median_aa = torch.median(aa,0)
print("median", median_aa)
# mode
mode_aa = torch.mode(a,0)
print("mode",mode_aa)
# standard deviation
std_tensor = torch.std(a,0)
print('standard deviation',std_tensor)
# sum
sum_tensor = torch.sum(a)
print("sum",sum_tensor)
# variance
var_a = torch.var(a,0)
print("variance",var_a)
# element-wise comparison of two tensors: 1 where equal, 0 where not
aa_tensor = torch.eq(a,a)
print("element-wise equality",aa_tensor)
# True if the two tensors have the same shape and the same values
aa_equal = torch.equal(a,a)
print("equal",aa_equal)
# maximum
max_a = torch.max(a)
print('max',max_a)
# minimum
min_a = torch.min(a)
print('min',min_a)
# torch.Tensor is an alias for the default tensor type (torch.FloatTensor)
aa = torch.FloatTensor([[-1,2,3],[3,-4,5]])
print(aa,type(aa),aa.size())
# in-place absolute value: functions ending with _ overwrite the tensor
aa.abs_()
print(aa,type(aa),aa.size())
# expand
x = torch.Tensor([[1],[2],[3]])
print(x.size())
x = x.expand(3,4)
print(x.size(),x)

Building a Network

  • class torch.nn.Module is the base class for all networks
# -*- coding:utf-8 -*-
# /usr/bin/python
'''
@Author : Errol
@Describe:
@Env :
@Date : -
'''
import torch.nn as nn
import torch.nn.functional as F

class Model(nn.Module):
    def __init__(self):
        super(Model, self).__init__()
        self.conv1 = nn.Conv2d(1, 20, 5)
        self.conv2 = nn.Conv2d(20, 20, 5)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        return F.relu(self.conv2(x))

model = Model()
print('model.conv1', model.conv1)
print("model.conv2", model.conv2)
# iterator over the direct children of the model
for sub_module in model.children():
    print(sub_module, type(type(sub_module)))
# iterator over all modules of the current model (including the model itself)
for module in model.modules():
    print(module, type(type(module)))
# iterator over the direct child modules, yielding the module name and the module itself
for name, module in model.named_children():
    if name in ['conv1', 'conv2']:
        print(module)
# iterator over all parameters of the model
print("iterator over all parameters of the model:")
for param in model.parameters():
    print(type(param.data), param.size())
print('model.state_dict().keys()', model.state_dict().keys())

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
class Net(nn.Module):
    def __init__(self):
        super(Net, self).__init__()
        # 1 input image channel, 6 output channels, 3x3 square convolution
        # kernel
        self.conv1 = nn.Conv2d(1, 6, 3)
        self.conv2 = nn.Conv2d(6, 16, 3)
        # an affine operation: y = Wx + b
        self.fc1 = nn.Linear(16 * 6 * 6, 120)  # 6*6 from image dimension
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)

    def forward(self, x):
        # Max pooling over a (2, 2) window
        x = F.max_pool2d(F.relu(self.conv1(x)), (2, 2))
        # If the size is a square you can only specify a single number
        x = F.max_pool2d(F.relu(self.conv2(x)), 2)
        x = x.view(-1, self.num_flat_features(x))
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)
        return x

    def num_flat_features(self, x):
        size = x.size()[1:]  # all dimensions except the batch dimension
        num_features = 1
        for s in size:
            num_features *= s
        return num_features

net = Net()
print(net)
params = list(net.parameters())
print(len(params))
print(params[0].size())  # conv1's .weight
# simulate a network input
input_data = torch.randn(1, 1, 32, 32)
out = net(input_data)
print("out", out)
# loss function
input_data = torch.randn(1, 1, 32, 32)
output = net(input_data)
target = torch.randn(10)
target = target.view(1, -1)
print('target', target, type(target), target.size())
criterion = nn.MSELoss()
loss = criterion(output, target)
print('loss:', loss)
# backpropagation
net.zero_grad()
print('conv1.bias.grad before backward')
print(net.conv1.bias.grad)
loss.backward()
print('conv1.bias.grad after backward')
print(net.conv1.bias.grad)
# update the weights by hand
learning_rate = 0.01
for f in net.parameters():
    f.data.sub_(f.grad.data * learning_rate)
params = list(net.parameters())
print(len(params))
print(params[0].size())
# optimizer
optimizer = optim.SGD(net.parameters(), lr=0.01)
optimizer.zero_grad()
output = net(input_data)
loss = criterion(output, target)
loss.backward()
optimizer.step()

Learning torchaudio

# -*- coding:utf-8 -*-
# /usr/bin/python
'''
@Author : Errol
@Describe:
@Env :
@Date : -
'''
import torch
import torchaudio
import matplotlib.pyplot as plt

file_path = './data/000-test.wav'
waveform, sample_rate = torchaudio.load(file_path)
print('waveform', waveform, type(waveform), '\nsample_rate:', sample_rate, type(sample_rate))
print("shape of waveform: {}".format(waveform.size()))
print("sample rate of waveform: {}".format(sample_rate))
print("waveform.t() {}".format(waveform.t()))
print('type(waveform.t())', type(waveform.t()))
print("Min of waveform: {}\nMax of waveform: {}\nMean of waveform: {}".format(waveform.min(), waveform.max(), waveform.mean()))
# normalize the audio data
def normalize(waveform):
    tensor_minusmean = waveform - waveform.mean()
    return tensor_minusmean / (tensor_minusmean.abs().max())
plt.figure()
plt.plot(waveform.t().numpy())
# plt.show()
# spectrogram
specgram = torchaudio.transforms.Spectrogram()(waveform)
print('specgram', specgram, type(specgram), specgram.size())
log_specgram = specgram.log2()
print('log_specgram', log_specgram, type(log_specgram), log_specgram.size())
log_specgram = log_specgram[0, :, :]
print('log_specgram2', log_specgram, type(log_specgram), log_specgram.size())
numpy_specgram = log_specgram.numpy()
print('numpy_specgram', numpy_specgram, type(numpy_specgram))
# plt.figure()
# plt.imshow(numpy_specgram, cmap="gray")
# plt.show()
# mel spectrogram
melspecgram = torchaudio.transforms.MelSpectrogram()(waveform)
print('melspecgram', melspecgram, type(melspecgram), melspecgram.size())
log_specgram = melspecgram.log2()
print('log_mel_specgram', log_specgram, type(log_specgram), log_specgram.size())
log_specgram = log_specgram[0, :, :].detach()
print('log_specgram2', log_specgram, type(log_specgram), log_specgram.size())
numpy_specgram = log_specgram.numpy()
print('numpy_specgram', numpy_specgram, type(numpy_specgram))
# plt.figure()
# plt.imshow(numpy_specgram, cmap="gray")
# plt.show()
# resampling
new_sample_rate = sample_rate / 2
channels = 0
transformed = torchaudio.transforms.Resample(sample_rate, new_sample_rate)(waveform[channels, :].view(1, -1))
print('transformed', transformed, type(transformed), transformed.size())
# plt.figure()
# plt.plot(transformed[0, :].numpy())
# plt.show()
# Mu-Law encode the audio data
transformed11 = torchaudio.transforms.MuLawEncoding()(waveform)
print('transformed33', transformed11, type(transformed11), transformed11.size())
transformed = transformed11[0, :]
print('transformed44', transformed, type(transformed), transformed.size())
# plt.figure()
# plt.plot(transformed.numpy())
# plt.show()
# decode
reconstructed = torchaudio.transforms.MuLawDecoding()(transformed11)
print('reconstructed', reconstructed, type(reconstructed), reconstructed.size())
print('waveform', waveform, type(waveform), waveform.size())
plt.figure()
plt.plot(reconstructed[0, :].numpy())
# plt.show()
# compare the original waveform with the encoded-then-decoded version
err = ((waveform - reconstructed).abs() / waveform.abs()).median()
print("difference: {:.2%}".format(err), type(err), err.size())
# Kaldi-compatible features
n_fft = 400.0
frame_length = n_fft / sample_rate * 1000.0
frame_shift = frame_length / 2.0
params = {
    "channel": 0,
    "dither": 0.0,
    "window_type": "hanning",
    "frame_length": frame_length,
    "frame_shift": frame_shift,
    "remove_dc_offset": False,
    "round_to_power_of_two": False,
    "sample_frequency": sample_rate,
}
specgram = torchaudio.compliance.kaldi.spectrogram(waveform, **params)
print("Shape of spectrogram: {}".format(specgram.size()))
plt.figure()
plt.imshow(specgram.t().numpy(), cmap='gray')
plt.show()
# We also support computing the filterbank features from waveforms, matching Kaldi's implementation.
fbank = torchaudio.compliance.kaldi.fbank(waveform, **params)
print("Shape of fbank: {}".format(fbank.size()))
plt.figure()
plt.imshow(fbank.t().numpy(), cmap='gray')

Building a Neural Network

Steps

  • 1. Define a neural network that has some learnable parameters
  • 2. Iterate over the dataset of inputs
  • 3. Process the input through the network
  • 4. Compute the loss
  • 5. Propagate the gradients back into the network's parameters
  • 6. Update the weights of the network, typically with a simple update rule: weight = weight - learning_rate * gradient

Examples

NumPy NN Example

# -*- coding:utf-8 -*-
# /usr/bin/python
'''
-------------------------------------------------
File Name : numpy_nn
Description : AIM: numpy nn
Functions: 1.
2.
Envs : python ==
pip install -i https://pypi.douban.com/simple
Author : errol
Date : 2020/5/4 21:57
CodeStyle : standard, concise, readable, maintainable, portable!
-------------------------------------------------
Change Activity:
2020/5/4 : text
-------------------------------------------------
'''
import numpy as np

# N is the batch size; D_in is the input dimension;
# H is the hidden dimension; D_out is the output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# create random input and output data
x = np.random.randn(N, D_in)
y = np.random.randn(N, D_out)
# randomly initialize the weights
w1 = np.random.randn(D_in, H)
w2 = np.random.randn(H, D_out)
learning_rate = 1e-6
for t in range(5000):
    # forward pass: compute the predicted y
    h = x.dot(w1)
    h_relu = np.maximum(h, 0)
    y_pred = h_relu.dot(w2)
    # compute and print the loss
    loss = np.square(y_pred - y).sum()
    print(t, loss)
    # backprop: compute the gradients of the loss with respect to w1 and w2
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.T.dot(grad_y_pred)
    grad_h_relu = grad_y_pred.dot(w2.T)
    grad_h = grad_h_relu.copy()
    grad_h[h < 0] = 0
    grad_w1 = x.T.dot(grad_h)
    # update the weights
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2

PyTorch Tensors

# -*- coding: utf-8 -*-
import torch

dtype = torch.float
device = torch.device("cpu")
# device = torch.device("cuda:0")  # uncomment to run on a GPU
# N is the batch size; D_in is the input dimension;
# H is the hidden dimension; D_out is the output dimension.
N, D_in, H, D_out = 64, 1000, 100, 10
# create random input and output data
x = torch.randn(N, D_in, device=device, dtype=dtype)
y = torch.randn(N, D_out, device=device, dtype=dtype)
# randomly initialize the weights
w1 = torch.randn(D_in, H, device=device, dtype=dtype)
w2 = torch.randn(H, D_out, device=device, dtype=dtype)
learning_rate = 1e-6
for t in range(500):
    # forward pass: compute the predicted y
    h = x.mm(w1)
    h_relu = h.clamp(min=0)
    y_pred = h_relu.mm(w2)
    # compute and print the loss
    loss = (y_pred - y).pow(2).sum().item()
    print(t, loss)
    # backprop: compute the gradients of w1 and w2 with respect to the loss
    grad_y_pred = 2.0 * (y_pred - y)
    grad_w2 = h_relu.t().mm(grad_y_pred)
    grad_h_relu = grad_y_pred.mm(w2.t())
    grad_h = grad_h_relu.clone()
    grad_h[h < 0] = 0
    grad_w1 = x.t().mm(grad_h)
    # update the weights using gradient descent
    w1 -= learning_rate * grad_w1
    w2 -= learning_rate * grad_w2
