Enhanced version: rename invoice files using invoice amount + invoice date + seller name + invoice number

lzmomo · posted 2024-8-28 11:04

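I previously wrote a tool that renames PDF invoice files by invoice amount:
https://www.52pojie.cn/thread-1952842-1-1.html
Some people asked in the comments for the invoice date, seller name, and invoice number to be included as well, so I modified it.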

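Enhanced version of renaming PDF invoice files by invoice amount
It renames invoice files using invoice amount + invoice date + seller name + invoice number. A new folder named '重命名文件' ("renamed files") is created in the invoice file's directory to hold the renamed files.
The code is still crude and unpolished, but at least it runs.
Steps: run the program, click "选择PDF文档" (Select PDF document), then choose the invoice.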
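
As a rough illustration of the naming scheme described above, the sketch below builds the target path from the four fields plus the run counter. The helper name `build_target_path`, its parameters, and the sample values are invented for this example; the field order (amount, invoice number, date, seller name, counter in parentheses) follows the order used in the script in this post.

```python
from pathlib import Path


def build_target_path(source, amount, number, date, seller, count):
    """Return the path the renamed invoice would get (nothing is moved here)."""
    # Hypothetical helper: the renamed file goes into a '重命名文件' subfolder
    # next to the original file.
    target_dir = Path(source).parent / '重命名文件'
    name = f'{amount}_{number}_{date}_{seller}({count}).pdf'
    return target_dir / name


# Sample values are invented for illustration only.
target = build_target_path('invoices/a.pdf', '100.00',
                           '12345678901234567890', '2024年08月28日',
                           '示例公司', 1)
print(target.name)
```

Keeping the path construction in a pure function like this makes it possible to check the resulting name before any file is actually moved.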

The code is as follows:

import pdfplumber
import re
import tkinter as tk
from tkinter import filedialog
import threading
import shutil
import os
import configparser
 
# Worker function for the new thread
def worker(arg):
    # arg is required by the Thread call but unused; all values are read back from config.ini

    # Create the ConfigParser object
    config = configparser.ConfigParser()
    # Read the INI file with the same encoding main() uses to write it
    # ('ANSI' is a Windows-only alias of the 'mbcs' codec)
    config.read('config.ini', encoding='ANSI')
    xuhao = config.get('RunCount1', 'count1')
    oldfile = config.get('RunCount2', 'count2')
    oldfile_data = os.path.dirname(oldfile)

    money = config.get('RunCount2', '金额')
    fapiaoriqi = config.get('RunCount2', '发票日期')
    xiaomingcheng_string = config.get('RunCount2', '销名称')
    fapiaohaoma = config.get('RunCount2', '发票号码')

    # Target folder '重命名文件' next to the original invoice
    newdir = os.path.join(oldfile_data, '重命名文件')
    newfile = os.path.join(newdir, money + '_' + fapiaohaoma + '_' + fapiaoriqi + '_' + xiaomingcheng_string + '(' + xuhao + ').pdf')
    # Create the folder if it does not exist yet
    if not os.path.exists(newdir):
        os.makedirs(newdir)
        print('重命名文件夹创建成功')

    shutil.move(oldfile, newfile)   # move relocates the file; switch to shutil.copy to keep the original
 
 
def main():
 
 
    filetypes = (
        ('pdf文件', '*.pdf'),
        ('所有文件', '*.*')
    )
 
    filename = filedialog.askopenfilename(
        title='选择要打开的文件',
        initialdir='/',
        filetypes=filetypes)
 
    if filename:
        # Fields taken from the first page that contains an invoice amount
        invoice = None

        # Read the PDF document
        with pdfplumber.open(filename) as pdf:
            # Go through every page
            for page in pdf.pages:
                # Extract the text content (empty for pages without a text layer)
                text = page.extract_text() or ''

                # Amount in figures, e.g. "（小写）¥100.00"; both yen signs are accepted
                match = re.search(r'（小写）\s*[¥￥]\s*(\d+\.\d+)', text)
                if not match:
                    continue
                money = match.group(1)

                # Invoice number: every standalone 20-digit run, joined together
                result_fapiaohaoma = ''.join(re.findall(r'\b\d{20}\b', text))

                # Invoice date, e.g. "2024年08月28日"; empty if not found
                matches_fapiaoriqi = re.search(r'\d{4}年\d{2}月\d{2}日', text)
                fapiaoriqi = matches_fapiaoriqi.group() if matches_fapiaoriqi else ''

                # Seller name line with all whitespace removed; empty if not found
                match_xiaomingcheng = re.search(r'销 名称：(.+)', text)
                xiaomingcheng = match_xiaomingcheng.group() if match_xiaomingcheng else ''
                xiaomingcheng_string = re.sub(r'\s+', '', xiaomingcheng)

                invoice = (money, result_fapiaohaoma, fapiaoriqi, xiaomingcheng_string)
                break

        # The PDF is closed at this point, so the file can be moved safely
        if invoice:
            money, result_fapiaohaoma, fapiaoriqi, xiaomingcheng_string = invoice

            # Create the configuration object
            config = configparser.ConfigParser()

            # Section and key for the run counter
            run_count_section = 'RunCount1'
            run_count_key = 'count1'

            # Section and key for the current invoice
            run_count_section_1 = 'RunCount2'
            run_count_key_1 = 'count2'

            # Create config.ini with a zero counter if it does not exist yet
            # ('ANSI' is a Windows-only alias of the 'mbcs' codec)
            config_file = 'config.ini'
            if not config.read(config_file, encoding='ANSI'):
                config[run_count_section] = {run_count_key: '0'}

                with open(config_file, 'w', encoding='ANSI') as configfile:
                    config.write(configfile)

            config[run_count_section_1] = {run_count_key_1: filename,
                                           '金额': money,
                                           '发票号码': result_fapiaohaoma,
                                           '发票日期': fapiaoriqi,
                                           '销名称': xiaomingcheng_string
                                           }

            # Read the run counter
            current_count = config.getint(run_count_section, run_count_key)

            # Increment the run counter
            new_count = current_count + 1
            config.set(run_count_section, run_count_key, str(new_count))

            # Save the configuration file
            with open(config_file, 'w', encoding='ANSI') as configfile:
                config.write(configfile)

            print(f"重命名的第: {new_count}张发票", '金额：', money)

            section1_values_1 = config['RunCount1']

            section1_values_2 = config['RunCount2']

            list_data = ['发票信息',
                         section1_values_2['count2'],
                         os.path.dirname(section1_values_2['count2']),
                         '/',
                         section1_values_2['金额'],
                         '(',
                         section1_values_1['count1'],
                         ').pdf']

            # Create a new thread from the main thread and pass the argument
            t = threading.Thread(target=worker, args=(list_data,))
            t.start()
        else:
            print("未找到匹配项")
 
 
# Program entry point
if __name__ == "__main__":
    root = tk.Tk()
    root.title("文件上传")
    root.geometry('300x150')
 
    upload_button = tk.Button(root, text='选择PDF文档', command=main)
    upload_button.pack(expand=True)
 
    root.mainloop()
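
The field extraction can also be tried on plain text without opening a PDF. The following is a minimal sketch, assuming the text layout that the patterns in the script expect. The helper `extract_fields` and the sample text are invented for illustration, accepting both `¥` and `￥` is an assumption about how the yen sign comes out of text extraction, and a field that is not found is returned as `None` rather than raising an error.

```python
import re


def extract_fields(text):
    """Pull amount, invoice number, date and seller name out of page text."""
    # Hypothetical helper; the patterns loosely follow the ones in the script.
    patterns = {
        'amount': r'（小写）\s*[¥￥]\s*(\d+\.\d+)',
        'number': r'\b(\d{20})\b',
        'date': r'(\d{4}年\d{2}月\d{2}日)',
        'seller': r'销 名称：(.+)',
    }
    fields = {}
    for key, pattern in patterns.items():
        match = re.search(pattern, text)
        # Remove whitespace inside the value; None marks a field not found
        fields[key] = re.sub(r'\s+', '', match.group(1)) if match else None
    return fields


# Invented sample text, not taken from a real invoice.
sample = (
    '发票号码：12345678901234567890\n'
    '开票日期：2024年08月28日\n'
    '购 名称：示例买方 销 名称：示例 卖方\n'
    '价税合计（大写） 壹佰圆整 （小写）¥100.00\n'
)
print(extract_fields(sample))
print(extract_fields('not an invoice'))
```

Checking for `None` before building a file name gives a clear "field not found" case for invoice layouts the patterns do not cover.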

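justfly99 · posted 2024-8-29 14:20

Yathon posted on 2024-8-28 11:42
Very practical, thumbs up!

I hope you can add renaming by traversing a directory path; even better if it could detect whether a file is an invoice.

Here you go: https://www.52pojie.cn/thread-1959322-1-1.html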

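Yathon · posted 2024-8-28 11:42

Very practical, thumbs up!

I hope you can add renaming by traversing a directory path; even better if it could detect whether a file is an invoice.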

Ditto · posted 2024-8-28 11:16

Very practical, thanks for sharing.

xiaoxi9826 · posted 2024-8-28 11:17

Could you release a ready-to-run build? Thanks a lot!

chenfengxiangyu · posted 2024-8-28 11:39

Good stuff, bookmarking it for now.

额微粒波地 · posted 2024-8-28 11:39

I got an error. Is this type of invoice not supported? Could you fix it?

桃白白123 · posted 2024-8-28 11:53

A very practical tool. I hope you can release a ready-to-run build.

zylz9941 · posted 2024-8-28 13:00

A beginner here, looking forward to a ready-to-run build.

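zylz9941 · posted 2024-8-28 13:01

Yathon posted on 2024-8-28 11:42
Very practical, thumbs up!

I hope you can add renaming by traversing a directory path; even better if it could detect whether a file is an invoice.

Good suggestion. I hope the original poster can update the tool.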

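于生 · posted 2024-8-28 13:59

This is really practical; I hope the original poster can share a ready-to-run build. Also, is there a tool that can recognize certain content in a PDF and rename the file automatically?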
