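Background: see the thread "An alternative way to download 悦读 PDFs": https://www.52pojie.cn/forum.php?mod=viewthread&tid=1117859&extra=page%3D1%26filter%3Dtypeid%26typeid%3D202
Analysis: in the cropped long image, every pair of adjacent book pages is divided by a separator line, and the separator's color is fixed. The code therefore walks down a single pixel column (column 10 here), collects the y-coordinates of the pixels that match the separator color (a B value of 186 in RGB), and crops the pages automatically at those coordinates.
The code is as follows: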
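import os
import cv2


def cut(img, name, start_y, end_y, width, number):
    # Save rows start_y..end_y of the long image as one page.
    save_path = "./save/" + name + "_" + str(number) + ".png"
    page = img[start_y:end_y, 0:width]
    cv2.imwrite(save_path, page)


os.makedirs("./save", exist_ok=True)  # cv2.imwrite does not create missing folders
for image in os.listdir("./Book"):
    img_path = "./Book/" + image
    img = cv2.imread(img_path)
    if img is None:  # skip files OpenCV cannot read
        continue
    name = os.path.splitext(image)[0]  # file name without extension
    height = img.shape[0]
    width = img.shape[1]
    # Rows in column 10 whose blue channel (OpenCV loads BGR) equals 186.
    point = [y for y in range(0, height) if img.item(y, 10, 0) == 186]
    if not point:  # no separator found, nothing to split
        continue
    # Assumes every separator contributes exactly two matching rows.
    page_number = len(point) // 2 + 1
    cut(img, name, 0, point[0], width, 1)
    for p in range(2, page_number):
        cut(img, name, point[p * 2 - 3], point[p * 2 - 2], width, p)
    cut(img, name, point[-1], height, width, page_number)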
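
The index arithmetic above (page_number = len(point) // 2 + 1) only works when each separator contributes exactly two matching rows. If the separator thickness differs, one possible variation is to group consecutive matching rows into runs and treat the gaps between runs as pages. The following is a minimal sketch of that idea, not the method from the original thread; the helper names separator_runs and page_spans are made up for illustration, and it works on a plain list of channel values so it can be tried without an image.

```python
def separator_runs(column, target=186):
    """Group consecutive rows whose value equals `target` into (start, end) runs.

    `end` is exclusive, so a separator covering rows 5 and 6 becomes (5, 7).
    """
    runs = []
    start = None
    for y, value in enumerate(column):
        if value == target:
            if start is None:
                start = y  # first row of a new separator run
        elif start is not None:
            runs.append((start, y))  # run ended on the previous row
            start = None
    if start is not None:
        runs.append((start, len(column)))  # run reaches the bottom edge
    return runs


def page_spans(column, target=186):
    """Return (top, bottom) row ranges of the pages lying between separator runs."""
    spans = []
    top = 0
    for start, end in separator_runs(column, target):
        if start > top:  # skip empty spans, e.g. a separator at the very top
            spans.append((top, start))
        top = end
    if top < len(column):
        spans.append((top, len(column)))
    return spans


# Synthetic column: 4 page rows, a 3-row separator, 5 page rows,
# a 1-row separator, 2 page rows.
column = [255] * 4 + [186] * 3 + [255] * 5 + [186] + [255] * 2
print(page_spans(column))  # [(0, 4), (7, 12), (13, 15)]
```

With a real image, the column could be taken as img[:, 10, 0].tolist() and each span cropped with img[top:bottom, 0:width]. Like the original code, this sketch would still mistake a page pixel for a separator if its B value in that column happens to be 186.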