Help: why does Image.open(imgbuffer) cause a hang here?

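858983646 posted on 2024-9-20 17:24

When execution reaches this point and the PNG is indexed-color (palette-based), one CPU core is always pegged at 100% and it never finishes. With an ordinary PNG everything works. Where is the problem? I have spent ages on this and cannot solve it.
pngquant download: https://pngquant.org/pngquant-windows.zip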

import os
from pypdf import PdfReader, PdfWriter
from tqdm import tqdm
from PIL import Image
import numpy as np
from sklearn.cluster import KMeans
import subprocess

# Reduce the number of colors in an image
def reduce_colors(im, k=4):
    # Convert the image to an array reshaped to (-1, 3): one RGB triple per pixel
    # (this assumes an RGB image)
    image_data = np.array(im).reshape(-1, 3)
    # Cluster the colors with KMeans; k is the number of clusters
    kmeans = KMeans(n_clusters=k, random_state=0).fit(image_data)
    # The cluster centers are the new colors
    centers = kmeans.cluster_centers_
    # Quantize: map every pixel to the center of its cluster
    quantized_data = centers[kmeans.labels_]
    # Reshape the quantized data back to the original image shape
    quantized_image = quantized_data.reshape(im.height, im.width, 3)
    # Convert the quantized array back to an image
    quantized_image = Image.fromarray(quantized_image.astype('uint8'))
    # Save the quantized image to a temporary file
    imgbuffer = 'temp_image.png'
    quantized_image.save(imgbuffer, format="PNG", optimize=True)
    # Call the external program pngquant to compress the image further
    path = 'pngquant.exe'
    cmd = f'"{path}" -o temp_image.png --force --quality=0-1 --verbose {k} temp_image.png'
    try:
        subprocess.run(cmd, shell=True, check=True)
    except subprocess.CalledProcessError as e:
        print(f"Error executing pngquant: {e}")
        return None
    # Reopen the compressed image
    imagexx = Image.open(imgbuffer)
    print("line just before return")
    return imagexx

# Process the images in a PDF file
def process_pdf(pdf_file):
    try:
        # Read the PDF file
        reader = PdfReader(pdf_file)
        # Create the PDF writer
        writer = PdfWriter()
        # Add every page to the writer
        for page in tqdm(reader.pages, desc=f"Processing {pdf_file}"):
            writer.add_page(page)
        # Walk the pages again and process their images
        for page in tqdm(writer.pages, desc=f"Binarizing images {pdf_file}"):
            for img in page.images:
                # Replace the image in the page with the color-reduced version
                img.replace(reduce_colors(img.image))
                print("after reduce_colors")
        # Output file name: input name without its extension
        output_file = f"reduced_{os.path.splitext(pdf_file)[0]}.pdf"
        # Write the processed PDF to disk
        with open(output_file, "wb") as f:
            writer.write(f)
        # Report completion
        print(f"Processed file saved as {output_file}")
    except Exception as e:
        # Print the error if anything goes wrong during processing
        print(f"Error while processing file {pdf_file}: {e}")

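# Main function: process every PDF file in the current directory
def main():
    # List all PDF files in the current directory
    pdf_files = [f for f in os.listdir('.') if f.endswith('.pdf')]
    # If none were found, say so and return
    if not pdf_files:
        print("No PDF files found in the current directory.")
        return
    # Process the PDF files one by one
    for pdf_file in pdf_files:
        process_pdf(pdf_file)

# Run main() when this script is executed as the main program
if __name__ == "__main__":
    main()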
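
A side note on the pngquant call in the script above: passing the command as an argument list avoids shell quoting, and a timeout keeps a stuck external process from blocking the script forever. The following is only a sketch. The helper names `build_pngquant_cmd` and `run_pngquant` and the 60-second timeout are assumptions made for illustration; the pngquant flags are the ones already used in the script.

```python
import subprocess


def build_pngquant_cmd(exe, src, dst, colors, quality="0-1"):
    """Return the pngquant argument list (same flags as the original command)."""
    return [
        exe,
        "-o", dst,
        "--force",
        f"--quality={quality}",
        "--verbose",
        str(colors),
        src,
    ]


def run_pngquant(exe, src, dst, colors, timeout=60):
    """Run pngquant; return True on success, False on failure or timeout."""
    cmd = build_pngquant_cmd(exe, src, dst, colors)
    try:
        # No shell=True: the list is passed to the program as-is.
        subprocess.run(cmd, check=True, timeout=timeout)
    except (subprocess.CalledProcessError, subprocess.TimeoutExpired, OSError) as e:
        print(f"pngquant failed: {e}")
        return False
    return True


# Build (but do not run) the command used in the original script.
cmd = build_pngquant_cmd("pngquant.exe", "temp_image.png", "temp_image.png", 4)
print(cmd)
```

This does not address the hang itself; it only rules out the external call as the step that never returns.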

chenzhigang posted on 2024-9-20 18:37

So you mean that an indexed-color PNG, after being processed by pngquant-windows, hangs when it is opened with Image.open, right?

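858983646 posted on 2024-9-20 18:46

chenzhigang posted on 2024-9-20 18:37
So you mean that an indexed-color PNG, after being processed by pngquant-windows, hangs when it is opened with Image.open, right?

No. pngquant outputs an indexed-color image, and that is what hangs. If I skip pngquant or convert back to RGB, there is no problem.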

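858983646 posted on 2024-9-20 18:50

Last edited by 858983646 on 2024-9-20 18:52

chenzhigang posted on 2024-9-20 18:37
So you mean that an indexed-color PNG, after being processed by pngquant-windows, hangs when it is opened with Image.open, right?
It seems it is not Image.open that hangs, but the return.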
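
One way to see which call is actually stuck, instead of guessing from print statements, is Python's standard `faulthandler` module, which can dump the traceback of every thread after a timeout. The sketch below is an illustration only: the `hang_watchdog` helper and the `suspect_step` stand-in are hypothetical names, not part of the original script.

```python
import faulthandler
import sys
from contextlib import contextmanager


@contextmanager
def hang_watchdog(seconds=10.0):
    """While the block runs, dump every thread's traceback to stderr each
    `seconds` seconds. A step that finishes in time prints nothing."""
    faulthandler.dump_traceback_later(seconds, repeat=True, file=sys.stderr)
    try:
        yield
    finally:
        # Always disarm the watchdog, even if the block raised.
        faulthandler.cancel_dump_traceback_later()


def suspect_step():
    # Hypothetical stand-in for the call under suspicion,
    # e.g. img.replace(reduce_colors(img.image)) in the original script.
    return sum(range(1000))


with hang_watchdog(5.0):
    result = suspect_step()
```

Wrapping the `img.replace(...)` line this way would show, in the dumped traceback, the exact Python frame that is spinning when the CPU core is pegged.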

helian147 posted on 2024-9-20 18:56

Last edited by helian147 on 2024-9-20 18:57

I have never used pngquant.exe.

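helian147 posted on 2024-9-20 18:58

858983646 posted on 2024-9-20 18:46
No. pngquant outputs an indexed-color image, and that is what hangs. If I skip pngquant or convert back to RGB, there is no problem.

Then why not just convert back to RGB?
Use PIL's convert function.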

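858983646 posted on 2024-9-20 19:02

helian147 posted on 2024-9-20 18:56
I have never used pngquant.exe.

Thanks, but pngquant is not the problem; it correctly outputs an indexed-color image.
Even if I remove the pngquant part and simply open an indexed-color PNG with Image.open, the return still hangs.
I tried converting the color mode as you suggest and it works, but the file becomes much larger.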
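
To check the two observations above outside of the PDF script, a small standalone test can help. `Image.open` is lazy (it reads only the header), so calling `load()` forces the full decode; if that returns promptly for an indexed-color PNG, Pillow's decoder is unlikely to be the step that hangs. The sketch below also prints the encoded size of the palette image next to its RGB conversion. The gradient test image and the `png_bytes` helper are assumptions made for illustration, and Pillow's own `quantize` stands in for pngquant here.

```python
import io

from PIL import Image


def png_bytes(im):
    """Encode a Pillow image as PNG and return the raw bytes."""
    buf = io.BytesIO()
    im.save(buf, format="PNG", optimize=True)
    return buf.getvalue()


# Hypothetical test image: a 64x64 gradient reduced to a 4-color palette.
rgb = Image.new("RGB", (64, 64))
rgb.putdata([(x * 4, y * 4, 128) for y in range(64) for x in range(64)])
indexed = rgb.quantize(colors=4)  # mode "P" (indexed color)

# Reopen from memory and force the decode instead of relying on lazy loading.
reopened = Image.open(io.BytesIO(png_bytes(indexed)))
reopened.load()
print(reopened.mode, reopened.size)

# Compare the encoded size of the palette image with its RGB conversion.
size_indexed = len(png_bytes(reopened))
size_rgb = len(png_bytes(reopened.convert("RGB")))
print(f"indexed: {size_indexed} bytes, RGB: {size_rgb} bytes")
```

Running the same check on the real temp_image.png would separate "Pillow cannot decode this file" from "something later in the pipeline loops on a mode P image".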

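858983646 posted on 2024-9-20 19:03

helian147 posted on 2024-9-20 18:58
Then why not just convert back to RGB?
Use PIL's convert function.

Yes, but my goal is to reduce the image size, so converting defeats the purpose.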

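helian147 posted on 2024-9-20 19:10

858983646 posted on 2024-9-20 19:02
Thanks, but pngquant is not the problem; it correctly outputs an indexed-color image.
Even if I remove the pngquant part and simply open an indexed-colo ...
Try this:

img.replace(reduce_colors(img.image), quality=75)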

helian147 posted on 2024-9-20 19:15

858983646 posted on 2024-9-20 19:03
Yes, but my goal is to reduce the image size, so converting defeats the purpose.

def reduce_colors(im, k=4):
    # Remember the original mode and convert to RGBA before processing
    mode = im.mode
    if mode != 'RGBA':
        im = im.convert('RGBA')

    .......

    # Reopen the compressed image
    imagexx = Image.open(imgbuffer)
    # Convert back to the original mode before returning
    imagexx = imagexx.convert(mode)
    print("line just before return")
    return imagexx

def process_pdf(pdf_file):
    .......
    img.replace(reduce_colors(img.image), quality=75)


吾爱破解 - 52pojie.cn's Archiver
