Membership ID application: 君匡 [never checked in, account cancelled]

Posted on 2020-5-4 23:11

1. Requested ID: 君匡
2. Personal email: 1533859388@qq.com

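### About me
I'm a third-year computer science student who likes tinkering with new tech when I have the time. This is my second application, and I'm really looking forward to it. The original article is also published on my own blog: http://www.clzly.xyz:8080/2020/05/python/87370e1f/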

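This time I'm sharing my experience, and part of the source code, from the 12-class cat classification competition on Baidu's PaddlePaddle platform.

The conceptual background is covered in an earlier post on my blog:

[PaddlePaddle image classification lets me pet cats in the cloud](http://www.clzly.xyz:8080/2020/04/python/bf62f4ed/)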

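## Overview
Use a trained model to predict which class each image belongs to, then tune it to raise the accuracy.
The dataset contains images of 12 categories of cats.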

## How the project works
### Project design

![Image 1.png](http://www.clzly.xyz:8080/2020/05/python/87370e1f/%E5%9B%BE%E7%89%871.png)

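### Model

CNN classification model:
This project starts from a pretrained ResNet101 model, and the network itself is built on the ResNet architecture. The optimizer is Adam in the first stage and SGD in the second.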

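### Dataset

Cat face recognition, 12-category cat classification dataset. It is split into TrainSet and TestSet and contains images and label annotations for 12 categories of cats.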

## Implementation
### Workflow
1. Prepare the data: unzip the dataset
#!unzip data/data10954/cat_12_test.zip -d ./
#!unzip data/data10954/cat_12_train.zip -d ./
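
The reader code later in this post reads tab-separated list files (`train_split_list.txt`, `val_split_list.txt`) with one `path<TAB>label` pair per line. How those files were produced is not shown, so the following is only a minimal sketch of one way to split a labelled list into training and validation lists. The function name `split_list`, the 90/10 ratio, the seed, and the sample file names are all assumptions for illustration.

```python
import random

def split_list(lines, val_ratio=0.1, seed=0):
    """Shuffle 'path<TAB>label' lines and split them into (train, val)."""
    lines = [ln.strip() for ln in lines if ln.strip()]
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    rng.shuffle(lines)
    n_val = int(len(lines) * val_ratio)
    return lines[n_val:], lines[:n_val]

# Hypothetical sample entries in the format the reader expects
samples = ["cat_12_train/img_%03d.jpg\t%d" % (i, i % 12) for i in range(120)]
train_lines, val_lines = split_list(samples)
print(len(train_lines), len(val_lines))
```

Writing `train_lines` and `val_lines` out with one entry per line would give files in the shape the reader consumes.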

Next, preprocess the images and normalize them, then wrap everything in a reader object.
```python

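import functools
import math
import os

import numpy as np
import paddle
from PIL import Image, ImageEnhance

DATA_DIR = './'
# DATA_DIM, THREAD and BUF_SIZE are used below, but their definitions are
# missing from the post; the values here are assumptions.
DATA_DIM = 224
THREAD = 8
BUF_SIZE = 10240

# The numeric values were lost from the post; the standard ImageNet
# per-channel mean and std are assumed here.
img_mean = np.array([0.485, 0.456, 0.406]).reshape((3, 1, 1))
img_std = np.array([0.229, 0.224, 0.225]).reshape((3, 1, 1))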

# Resize so that the shorter side equals target_size, keeping the aspect ratio
def resize_short(img, target_size):
    percent = float(target_size) / min(img.size[0], img.size[1])
    resized_width = int(round(img.size[0] * percent))
    resized_height = int(round(img.size[1] * percent))
    img = img.resize((resized_width, resized_height), Image.LANCZOS)
    return img

# Crop a target_size x target_size square, centered or at a random position
def crop_image(img, target_size, center):
    width, height = img.size
    size = target_size
    if center:
        # integer division keeps the crop box on whole pixels
        w_start = (width - size) // 2
        h_start = (height - size) // 2
    else:
        w_start = np.random.randint(0, width - size + 1)
        h_start = np.random.randint(0, height - size + 1)
    w_end = w_start + size
    h_end = h_start + size
    img = img.crop((w_start, h_start, w_end, h_end))
    return img

# Crop a region of random area and aspect ratio, then resize it to size x size.
# The default scale and ratio were lost from the post; the commonly used
# values below are an assumption.
def random_crop(img, size, scale=(0.08, 1.0), ratio=(3. / 4., 4. / 3.)):
    aspect_ratio = math.sqrt(np.random.uniform(*ratio))
    w = 1. * aspect_ratio
    h = 1. / aspect_ratio

    bound = min((float(img.size[0]) / img.size[1]) / (w**2),
                (float(img.size[1]) / img.size[0]) / (h**2))
    scale_max = min(scale[1], bound)
    scale_min = min(scale[0], bound)

    target_area = img.size[0] * img.size[1] * np.random.uniform(scale_min,
                                                                scale_max)
    target_size = math.sqrt(target_area)
    w = int(target_size * w)
    h = int(target_size * h)

    i = np.random.randint(0, img.size[0] - w + 1)
    j = np.random.randint(0, img.size[1] - h + 1)

    img = img.crop((i, j, i + w, j + h))
    img = img.resize((size, size), Image.LANCZOS)
    return img

# Rotate by a random angle between -10 and 10 degrees
def rotate_image(img):
    angle = np.random.randint(-10, 11)
    img = img.rotate(angle)
    return img

# Random color augmentation: brightness, contrast and color, in random order
def distort_color(img):
    def random_brightness(img, lower=0.5, upper=1.5):
        e = np.random.uniform(lower, upper)
        return ImageEnhance.Brightness(img).enhance(e)

    def random_contrast(img, lower=0.5, upper=1.5):
        e = np.random.uniform(lower, upper)
        return ImageEnhance.Contrast(img).enhance(e)

    def random_color(img, lower=0.5, upper=1.5):
        e = np.random.uniform(lower, upper)
        return ImageEnhance.Color(img).enhance(e)

    ops = [random_brightness, random_contrast, random_color]
    np.random.shuffle(ops)

    img = ops[0](img)
    img = ops[1](img)
    img = ops[2](img)

    return img

# Full preprocessing pipeline for one sample: sample[0] is the image path,
# sample[1] (train/val only) is the label
def process_image(sample, mode, color_jitter, rotate):
    img_path = sample[0]

    img = Image.open(img_path)
    if mode == 'train':
        if rotate:
            img = rotate_image(img)
        img = random_crop(img, DATA_DIM)
    else:
        img = resize_short(img, target_size=256)
        img = crop_image(img, target_size=DATA_DIM, center=True)
    if mode == 'train':
        if color_jitter:
            img = distort_color(img)
        if np.random.randint(0, 2) == 1:
            img = img.transpose(Image.FLIP_LEFT_RIGHT)

    if img.mode != 'RGB':
        img = img.convert('RGB')

    # HWC uint8 -> CHW float32 in [0, 1], then per-channel normalization
    img = np.array(img).astype('float32').transpose((2, 0, 1)) / 255
    img -= img_mean
    img /= img_std

    if mode == 'train' or mode == 'val':
        return img, sample[1]
    elif mode == 'test':
        return [img]

# Build the reader
def _reader_creator(file_list,
                    mode,
                    shuffle=False,
                    color_jitter=False,
                    rotate=False,
                    data_dir=DATA_DIR):
    def reader():
        with open(file_list) as flist:
            full_lines = [line.strip() for line in flist]
            if shuffle:
                np.random.shuffle(full_lines)
            if mode == 'train' and os.getenv('PADDLE_TRAINING_ROLE'):
                # distributed mode if the env var `PADDLE_TRAINING_ROLE` exists
                trainer_id = int(os.getenv("PADDLE_TRAINER_ID", "0"))
                trainer_count = int(os.getenv("PADDLE_TRAINERS", "1"))
                per_node_lines = len(full_lines) // trainer_count
                lines = full_lines[trainer_id * per_node_lines:
                                   (trainer_id + 1) * per_node_lines]
                print(
                    "read images from %d, length: %d, lines length: %d, total: %d"
                    % (trainer_id * per_node_lines, per_node_lines, len(lines),
                       len(full_lines)))
            else:
                lines = full_lines

            for line in lines:
                if mode == 'train' or mode == 'val':
                    img_path, label = line.split('\t')
                    img_path = img_path.replace("JPEG", "jpeg")
                    img_path = os.path.join(data_dir, img_path)
                    yield img_path, int(label)
                elif mode == 'test':
                    # the test list is also tab-separated; its second column is ignored
                    img_path, label = line.split('\t')
                    img_path = img_path.replace("JPEG", "jpeg")
                    img_path = os.path.join(data_dir, img_path)
                    yield [img_path]

    mapper = functools.partial(
        process_image, mode=mode, color_jitter=color_jitter, rotate=rotate)

    # run the mapper over the reader's samples in THREAD worker threads
    return paddle.reader.xmap_readers(mapper, reader, THREAD, BUF_SIZE)

def train(data_dir=DATA_DIR):
   file_list = os.path.join(data_dir, 'train_split_list.txt')
   return _reader_creator(
   file_list, 'train', shuffle=True, color_jitter=False, rotate=False, data_dir=data_dir)

def val(data_dir=DATA_DIR):
   file_list = os.path.join(data_dir, 'val_split_list.txt')
   return _reader_creator(file_list, 'val', shuffle=False, data_dir=data_dir)

def test(data_dir=DATA_DIR):
   file_list = os.path.join(data_dir, 'test_list.txt')
   return _reader_creator(file_list, 'test', shuffle=False, data_dir=data_dir)
```
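
To make the evaluation-time geometry concrete, here is a small sketch of the arithmetic that `resize_short` and the centered branch of `crop_image` perform, written without any image library. The 500×375 input size and the 224 crop size are example values assumed for illustration, and the helper names are hypothetical.

```python
def short_side_resize(width, height, target_size):
    """Return (width, height) after scaling the shorter side to target_size."""
    percent = float(target_size) / min(width, height)
    return int(round(width * percent)), int(round(height * percent))

def center_crop_box(width, height, size):
    """Return the (left, top, right, bottom) box of a centered size x size crop."""
    left = (width - size) // 2
    top = (height - size) // 2
    return left, top, left + size, top + size

resized = short_side_resize(500, 375, 256)
box = center_crop_box(resized[0], resized[1], 224)
print(resized)  # (341, 256)
print(box)      # (58, 16, 282, 240)
```

The crop box always stays inside the resized image as long as the crop size does not exceed the shorter side.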
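### 2. Configure the network
#### (1) Building the network
**CNN models**

While configuring the network I tried several models, including **LeNet-5, AlexNet, ResNet, and ResNeXt**.

Some brief notes: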

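① LeNet-5: a convolutional neural network that stacks convolution, pooling, and a nonlinearity as one unit. As the network gets deeper, the width and height of the feature maps shrink while the number of channels grows. The arrangement of one or more convolutional layers followed by a pooling layer, ending in fully connected layers, is still very common today.

② AlexNet: network structure: 8 layers, ReLU activations, and dropout of 0.5 on the first two fully connected layers.

Preprocessing: images are first down-sampled so that the shorter side is 256, then the central 256*256 patch is cropped and the mean is subtracted for normalization. During training, data augmentation extracts random 227*227 patches and their horizontal mirrors. Besides that, a PCA-based perturbation of the RGB pixel values is used to reduce overfitting.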

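Hyperparameters: SGD with a learning rate of 0.01, batch size 128, momentum 0.9, and weight decay 0.0005. Whenever the validation error stops decreasing, the learning rate is divided by 10. Weights are initialized from a Gaussian with mean 0 and standard deviation 0.01. The biases of the second, fourth, and fifth convolutional layers and of the fully connected layers are initialized to 1 (this gives ReLU positive inputs and speeds up early training); all other biases are initialized to 0.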
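
The "divide the learning rate by 10 when the validation error stops decreasing" rule can be sketched as follows. This is a minimal illustration only: the function name `plateau_schedule`, the `patience` parameter, and the validation errors are assumptions, not values from the post.

```python
def plateau_schedule(val_errors, lr=0.01, factor=0.1, patience=1):
    """Return the learning rate used in each epoch.

    The rate is multiplied by `factor` once the validation error has failed
    to improve for `patience` consecutive epochs.
    """
    best = float("inf")
    stale = 0
    history = []
    for err in val_errors:
        history.append(lr)          # rate in effect during this epoch
        if err < best:
            best = err
            stale = 0
        else:
            stale += 1
            if stale >= patience:
                lr *= factor
                stale = 0
    return history

# Made-up validation errors, for illustration only
rates = plateau_schedule([0.50, 0.40, 0.41, 0.35, 0.36, 0.36])
print(rates)
```

With these inputs the rate drops after the third and fifth epochs, the two points where the error failed to improve on the best value so far.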

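③ ResNet: as networks get deeper, gradients can vanish or explode. This problem can be addressed with normalized initialization and batch normalization (BN).

The biggest difference between a plain, directly stacked CNN and ResNet is that ResNet has many bypass branches that connect the input directly to later layers, so those layers can learn the residual directly. This structure is also called a shortcut or skip connection.

Traditional convolutional or fully connected layers lose or degrade some information as they pass it along. ResNet mitigates this to some extent: by routing the input directly to the output it preserves the information, and the network only has to learn the difference between input and output, which simplifies the learning target and makes it easier.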
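
The idea of learning only the residual can be shown with a toy block in plain Python. This is a simplified sketch, assuming fully connected layers in place of convolutions and leaving out batch normalization and the final activation; it is not the ResNet used in the competition.

```python
def relu(v):
    return [max(0.0, x) for x in v]

def linear(v, weights):
    """Multiply vector v by a square weight matrix given as a list of rows."""
    return [sum(w * x for w, x in zip(row, v)) for row in weights]

def residual_block(x, w1, w2):
    """y = x + F(x), where F is two linear layers with a ReLU in between."""
    fx = linear(relu(linear(x, w1)), w2)
    return [xi + fi for xi, fi in zip(x, fx)]

x = [1.0, -2.0, 3.0]
zeros = [[0.0] * 3 for _ in range(3)]
# With all-zero weights F(x) = 0, so the block passes its input through unchanged
print(residual_block(x, zeros, zeros))  # [1.0, -2.0, 3.0]
```

Because the identity mapping corresponds to zero weights, a block only has to learn how its output should differ from its input.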

The bottleneck building block saves computation and so shortens overall training time, without hurting model accuracy.
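
A quick weight count shows where the savings come from. The sketch below compares a basic block of two 3×3 convolutions with a 1×1, 3×3, 1×1 bottleneck; the channel widths 256 and 64 are example values assumed here, and biases and batch-norm parameters are ignored.

```python
def conv_params(c_in, c_out, k):
    """Weight count of a k x k convolution, ignoring biases."""
    return c_in * c_out * k * k

channels = 256   # block input/output width (example value)
reduced = 64     # bottleneck width (example value)

basic = 2 * conv_params(channels, channels, 3)
bottleneck = (conv_params(channels, reduced, 1)
              + conv_params(reduced, reduced, 3)
              + conv_params(reduced, channels, 1))
print(basic, bottleneck)  # 1179648 69632
```

At these widths the bottleneck needs roughly one seventeenth of the weights, because the expensive 3×3 convolution runs on the reduced 64-channel tensor.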

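④ ResNeXt: a simple, modular network structure with few hyperparameters to tune by hand. With the same number of parameters it gives better results than ResNet: for example, a 101-layer ResNeXt is about as accurate as a 200-layer ResNet while needing only half the computation.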
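
ResNeXt keeps the parameter budget roughly constant by replacing the 3×3 convolution with a grouped convolution. The sketch below counts weights for one block of each kind; the widths follow the commonly cited 32-group configuration and are assumptions here, not settings taken from the post.

```python
def grouped_conv_params(c_in, c_out, k, groups=1):
    """Weight count of a k x k convolution split into `groups` groups (no biases)."""
    assert c_in % groups == 0 and c_out % groups == 0
    return (c_in // groups) * (c_out // groups) * k * k * groups

# ResNet-style bottleneck: 256 -> 64 -> 64 -> 256
resnet_block = (grouped_conv_params(256, 64, 1)
                + grouped_conv_params(64, 64, 3)
                + grouped_conv_params(64, 256, 1))
# ResNeXt-style bottleneck: 256 -> 128 -> 128 (32 groups) -> 256
resnext_block = (grouped_conv_params(256, 128, 1)
                 + grouped_conv_params(128, 128, 3, groups=32)
                 + grouped_conv_params(128, 256, 1))
print(resnet_block, resnext_block)  # 69632 70144
```

The two blocks have nearly the same number of weights, but the grouped version is twice as wide in the middle, split across 32 parallel paths.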
```python

import paddle.fluid as fluid

## train_core1 (a config dict) and resnet() are defined elsewhere in the
## author's code and are not shown in the post.

## Define the input layers
image=fluid.layers.data(name='image',shape=train_core1["input_size"],dtype='float32')
label=fluid.layers.data(name='label',shape=[1],dtype='int64')

## Stop gradients at the backbone output so the pretrained layers stay frozen
pool=resnet(image)
pool.stop_gradient=True

## Clone the main program (backbone only) for the pretrained model
base_model_program=fluid.default_main_program().clone()
model=fluid.layers.fc(input=pool,size=train_core1["class_dim"],act='softmax')

## Define the loss and accuracy functions
cost=fluid.layers.cross_entropy(input=model,label=label)
avg_cost=fluid.layers.mean(cost)
acc=fluid.layers.accuracy(input=model,label=label)

## Define the optimizer
optimizer=fluid.optimizer.AdamOptimizer(learning_rate=train_core1["learning_rate"])
opts=optimizer.minimize(avg_cost)

## Choose the device to train on
use_gpu=train_core1["use_gpu"]
place=fluid.CUDAPlace(0) if use_gpu else fluid.CPUPlace()
exe=fluid.Executor(place)

## Initialize the parameters
exe.run(fluid.default_startup_program())
```
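
For readers unfamiliar with what the loss and accuracy layers above compute, here is a plain-Python sketch of softmax, cross-entropy on the softmax output, and top-1 accuracy. The logits and labels are made up, and this illustrates the math only, not PaddlePaddle's implementation.

```python
import math

def softmax(logits):
    m = max(logits)                          # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, label):
    """Negative log-probability assigned to the true class."""
    return -math.log(probs[label])

def accuracy(batch_probs, labels):
    """Fraction of samples whose most probable class equals the label."""
    hits = sum(1 for p, y in zip(batch_probs, labels) if p.index(max(p)) == y)
    return hits / len(labels)

# Made-up logits for a 3-class toy problem
batch = [softmax([2.0, 0.5, 0.1]), softmax([0.2, 0.1, 1.5])]
labels = [0, 1]
avg_cost = sum(cross_entropy(p, y) for p, y in zip(batch, labels)) / len(labels)
print(round(avg_cost, 4), accuracy(batch, labels))
```

The first sample is classified correctly and the second is not, so the accuracy is 0.5 while the average cost is dominated by the misclassified sample.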
In the network configuration above, the optimizer can be swapped freely, for example:
![Image 2.png](http://www.clzly.xyz:8080/2020/05/python/87370e1f/%E5%9B%BE%E7%89%872.png)
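
The post uses Adam in the first stage and SGD in the second. As a rough illustration of how the two update rules differ, the sketch below applies one step of each to a single scalar weight. The hyperparameters are common defaults assumed for the example, and the SGD variant shown includes momentum; none of this is taken from the author's configuration.

```python
import math

def sgd_momentum_step(w, grad, velocity, lr=0.01, momentum=0.9):
    velocity = momentum * velocity + grad
    return w - lr * velocity, velocity

def adam_step(w, grad, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad * grad
    m_hat = m / (1 - beta1 ** t)       # bias correction for the first steps
    v_hat = v / (1 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v

# One step on f(w) = w**2 from w = 5.0, where the gradient is 2 * w = 10.0
w_sgd, _ = sgd_momentum_step(5.0, 10.0, velocity=0.0)
w_adam, _, _ = adam_step(5.0, 10.0, m=0.0, v=0.0, t=1)
print(w_sgd, w_adam)
```

The SGD step scales with the gradient magnitude, whereas Adam's first step is close to `lr` in size regardless of how large the gradient is.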

#### (4) Train and save the model

```python

import os
import shutil

## train_reader, test_reader, test_program, logger and train_core2 are
## defined elsewhere in the author's code and are not shown in the post.

## Define the data feeder
feeder=fluid.DataFeeder(place=place,feed_list=[image,label])

last_acc=0
now_acc=0
logger.info("Starting the second training stage...")
## Directory for the saved inference model
save_path = 'models/step_2_model/'
for pass_id in range(train_core2["num_epochs"]):
    ## Train
    for batch_id,data in enumerate(train_reader()):
        train_cost,train_acc=exe.run(program=fluid.default_main_program(),feed=feeder.feed(data),fetch_list=[avg_cost,acc])
        if batch_id%50==0:
            print('Pass:%d, Batch:%d, Cost:%0.5f, Accuracy:%0.5f' %
                  (pass_id, batch_id, train_cost[0], train_acc[0]))
    ## Evaluate
    test_accs=[]
    test_costs=[]
    for batch_id,data in enumerate(test_reader()):
        test_cost,test_acc=exe.run(program=test_program,feed=feeder.feed(data), fetch_list=[avg_cost,acc])
        test_accs.append(test_acc[0])
        test_costs.append(test_cost[0])
    test_cost = (sum(test_costs) / len(test_costs))
    test_acc = (sum(test_accs) / len(test_accs))
    logger.info('Test:%d, Cost:%0.5f, Accuracy:%0.5f' % (pass_id, test_cost, test_acc))
    now_acc=test_acc
    ## Keep a checkpoint only when the test accuracy improves
    if now_acc>last_acc:
        last_acc=now_acc
        logger.info("Saving checkpoint for pass {0}, accuracy acc1 {1}".format(pass_id, now_acc))
        ## Remove the old model files
        shutil.rmtree(save_path, ignore_errors=True)
        ## Create the directory for the model files
        os.makedirs(save_path)
        ## Save the inference model
        fluid.io.save_inference_model(save_path, feeded_var_names=[image.name], target_vars=[model], executor=exe)
logger.info("Second training stage finished.")

```
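
The "save only when accuracy improves" pattern in the loop above can be tried in isolation with the standard library. In this sketch `fake_save` is a hypothetical stand-in for the real model-saving call, and the accuracies are made up.

```python
import os
import shutil
import tempfile

def save_if_better(acc, best_acc, save_path, save_fn):
    """Replace the checkpoint in save_path only when acc beats best_acc."""
    if acc <= best_acc:
        return best_acc
    shutil.rmtree(save_path, ignore_errors=True)   # drop the previous checkpoint
    os.makedirs(save_path)
    save_fn(save_path)
    return acc

def fake_save(path):
    # Stand-in for the real model-saving call; writes a marker file instead
    with open(os.path.join(path, "model.txt"), "w") as f:
        f.write("weights")

root = tempfile.mkdtemp()
save_path = os.path.join(root, "step_2_model")
best = 0.0
saved_epochs = []
for epoch, acc in enumerate([0.30, 0.42, 0.40, 0.45]):   # made-up accuracies
    new_best = save_if_better(acc, best, save_path, fake_save)
    if new_best > best:
        saved_epochs.append(epoch)
    best = new_best
print(saved_epochs, best)  # [0, 1, 3] 0.45
shutil.rmtree(root)
```

Only one checkpoint exists at any time, so disk usage stays constant however many epochs are run.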

### A brief record of the tuning process at each stage:
```python
def convolutional_neural_network(img):
    # First conv-pool layer
    conv_pool_1 = fluid.nets.simple_img_conv_pool(
        input=img,         # input image
        filter_size=3,     # filter size
        num_filters=32,    # number of filters, equal to the number of output channels
        pool_size=2,       # 2*2 pooling window
        pool_stride=2,     # pooling stride
        conv_padding=1,
        act="relu")        # activation type
    conv_pool_1 = fluid.layers.batch_norm(conv_pool_1)
    # Second conv-pool layer
    conv_pool_2 = fluid.nets.simple_img_conv_pool(
        input=conv_pool_1,
        filter_size=3,
        num_filters=64,
        pool_size=2,
        pool_stride=2,
        conv_padding=1,
        act="relu")
    conv_pool_2 = fluid.layers.batch_norm(conv_pool_2)
    # Third conv-pool layer
    conv_pool_3 = fluid.nets.simple_img_conv_pool(
        input=conv_pool_2,
        filter_size=3,
        num_filters=128,
        pool_size=2,
        conv_padding=1,
        pool_stride=2,
        act="relu")
    conv_pool_3 = fluid.layers.batch_norm(conv_pool_3)
    # Fully connected output layer with softmax activation: 12 classes, 12 outputs
    prediction = fluid.layers.fc(input=conv_pool_3, size=12, act='softmax')
    return prediction
```

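Images are preprocessed to 100*100, training runs for 20 epochs, the Adam learning rate is 0.00001, and the model is still a plain CNN.

![Image 3.png](http://www.clzly.xyz:8080/2020/05/python/87370e1f/%E5%9B%BE%E7%89%873.png)

At this point the test-set accuracy is 0.36.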

Second experiment:
Set the padding of the third conv-pool layer to 0; nothing else is changed.
```python
def convolutional_neural_network(img):
    # First conv-pool layer
    conv_pool_1 = fluid.nets.simple_img_conv_pool(
        input=img,         # input image
        filter_size=5,     # filter size
        num_filters=20,    # number of filters, equal to the number of output channels
        pool_size=2,       # 2*2 pooling window
        pool_stride=2,     # pooling stride
        act="relu")        # activation type
    conv_pool_1 = fluid.layers.batch_norm(conv_pool_1)
    # Second conv-pool layer
    conv_pool_2 = fluid.nets.simple_img_conv_pool(
        input=conv_pool_1,
        filter_size=5,
        num_filters=50,
        pool_size=2,
        pool_stride=2,
        act="relu")
    conv_pool_2 = fluid.layers.batch_norm(conv_pool_2)
    # Third conv-pool layer
    conv_pool_3 = fluid.nets.simple_img_conv_pool(
        input=conv_pool_2,
        filter_size=5,
        num_filters=50,
        pool_size=2,
        pool_stride=2,
        act="relu")
    # Fully connected output layer with softmax activation: 12 classes, 12 outputs
    prediction = fluid.layers.fc(input=conv_pool_3, size=12, act='softmax')
    return prediction
```
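
For the 100*100 inputs used in these experiments, the feature-map sizes of the two networks above can be worked out with the standard output-size formula. The sketch assumes stride-1 convolutions and floor rounding in the pooling layers; the helper names are hypothetical.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

def conv_pool_stack(size, kernel, padding, layers=3):
    """Apply `layers` rounds of conv (stride 1) followed by 2x2 pooling (stride 2)."""
    sizes = []
    for _ in range(layers):
        size = conv_out(size, kernel, padding)        # convolution
        size = conv_out(size, kernel=2, stride=2)     # pooling
        sizes.append(size)
    return sizes

first = conv_pool_stack(100, kernel=3, padding=1)    # 3x3 filters, padding 1
second = conv_pool_stack(100, kernel=5, padding=0)   # 5x5 filters, no padding
print(first, 128 * first[-1] ** 2)    # [50, 25, 12] 18432
print(second, 50 * second[-1] ** 2)   # [48, 22, 9] 4050
```

The second number on each line is the length of the flattened feature vector fed to the final fully connected layer.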
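Images are preprocessed to 100*100, training runs for 20 epochs, the Adam learning rate is 0.1, and the model is still a plain CNN.

The second test used the following settings:
Learning_Rate=0.0001
EPOCH_NUM = 50

The model's test accuracy is now 0.43.

![Image 4.png](http://www.clzly.xyz:8080/2020/05/python/87370e1f/%E5%9B%BE%E7%89%874.png)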

4.3 Project results
Our final accuracy was between 94% and 95%, which took first place in the competition.

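### 5 Project summary
Main problem:
How to optimize the model for higher accuracy.
Solutions:
1. Add more data
2. Feature selection: visualization
3. Try multiple algorithms
4. Model ensembling: used several models, including LeNet-5, AlexNet, ResNet, and ResNeXt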
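
The post does not say how the models were combined. One common approach, shown here purely as an assumption, is to average the class probabilities predicted by each model and pick the most probable class. The probabilities below are made up.

```python
def ensemble_average(per_model_probs):
    """Average class probabilities over models and return (argmax, averaged)."""
    n_models = len(per_model_probs)
    n_classes = len(per_model_probs[0])
    avg = [sum(p[c] for p in per_model_probs) / n_models for c in range(n_classes)]
    return avg.index(max(avg)), avg

# Made-up softmax outputs of three hypothetical models for one image (3 classes)
probs = [
    [0.6, 0.3, 0.1],
    [0.2, 0.5, 0.3],
    [0.1, 0.7, 0.2],
]
label, avg = ensemble_average(probs)
print(label)  # 1
```

Here the first model alone would predict class 0, but the averaged vote settles on class 1.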

Main problem:
Later on there was not enough data.
I added image augmentation: random cropping, color adjustment, and rotation.

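Hmily posted on 2020-5-7 14:15

ID: 君匡
Email: 1533859388@qq.com

Application approved. Welcome to the 吾爱破解 forum; we look forward to what you will bring to it. Set the password for your ID yourself through the email password-recovery function, then log in promptly and change it!
After logging in, please check in on this thread within one week, otherwise the ID will be deleted.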

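Posted on 2020-5-30 20:50

I can't receive the verification code by email. Can I switch to a different address?
The email is not in my spam or promotions folders.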

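Hmily posted on 2020-6-1 16:09

Guest 223.116.17.x posted on 2020-5-30 20:50
I can't receive the verification code by email. Can I switch to a different address?
The email is not in my spam or promotions folders.

I've confirmed that the email was sent. If you didn't mistype your email address, troubleshoot it yourself by following this thread: https://www.52pojie.cn/thread-98585-1-1.html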

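Posted on 2020-6-4 21:52

Hmily posted on 2020-6-1 16:09
I've confirmed that the email was sent. If you didn't mistype your email address, troubleshoot it yourself by following this thread: https://www.52pojie.cn/thread-98 ...

Sorry, that didn't help.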

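Hmily posted on 2020-6-5 10:55

Guest 103.3.137.x posted on 2020-6-4 21:52
Sorry, that didn't help.

Then I guess it just wasn't meant to be.

Password recovery was requested on May 8 and May 25, and both emails were delivered. There are only two possible causes: 1. the email address was entered incorrectly; 2. some setting or feature of your own mailbox is blocking the message. The link I posted above is the way to troubleshoot this.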

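Hmily posted on 2020-6-18 15:49

Never checked in, so the account has been cancelled. (The account has apparently logged in and even posted, yet still never came to check in.)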


吾爱破解 - 52pojie.cn's Archiver
