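I'm not smart enough to figure this out. Can any of the experts here think of a way to optimize it?
I haven't used this format, but take, for example, the pile of code near the bottom of your script: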
with open(xml_path10, "r", encoding="utf-8") as f:
    _text10 = f.read()
    tree10 = xmltodict.parse(_text10)
    for row10 in tree10['ofd:Page']['ofd:Content']['ofd:Layer']['ofd:TextObject']:
        data_dict10['ofd:TextCode'] = row10['ofd:TextCode'].get('#text')
    # return data_dict
    for i10 in range(0, 29939):
        eachword10 = tree10['ofd:Page']['ofd:Content']['ofd:Layer']['ofd:TextObject'][
            'ofd:TextCode'].get(
            '#text')
        with open(os.getcwd() + '/ofdtxt/09.txt', "a") as f:
            f.write(' ' + eachword10)
Can't code like this be wrapped in a loop? For example:
for num in range(1, 36):
    xml_path = f"{file_path}/Doc_0/Pages/Page_{num}/Content.xml"
    try:
        with open(xml_path, "r", encoding="utf-8") as f:
            _text = f.read()
            tree = xmltodict.parse(_text)
            for row10 in tree['ofd:Page']['ofd:Content']['ofd:Layer']['ofd:TextObject']:
                data_dict10['ofd:TextCode'] = row10['ofd:TextCode'].get('#text')
            # return data_dict
            for i10 in range(0, 29939):
                eachword = tree['ofd:Page']['ofd:Content']['ofd:Layer']['ofd:TextObject'][
                    'ofd:TextCode'].get(
                    '#text')
                with open(os.getcwd() + f'/ofdtxt/{num}.txt', "a") as f:
                    f.write(' ' + eachword)
    except:
        pass
Please put the code in a code block.
Actually, an OFD file is just a zip archive; open it and what's inside is XML.
Can't the part below be wrapped in a function? It's all repeated code.
The earlier post was deleted, so I never saw it.
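For what it's worth, here is a rough sketch of the two suggestions above: put the repeated per-page code in a function, and read the pages straight out of the OFD archive instead of unpacking it first. It is not the original poster's code. It swaps xmltodict for the standard library's xml.etree.ElementTree, the names page_text, extract_ofd_text and PAGE_RE are made up for illustration, and it assumes the Doc_0/Pages/Page_N/Content.xml layout shown in the thread. It also overwrites each page's text file rather than appending to it.

```python
import re
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumed layout, taken from the paths in the thread:
# Doc_0/Pages/Page_<N>/Content.xml
PAGE_RE = re.compile(r"^Doc_0/Pages/Page_(\d+)/Content\.xml$")


def page_text(xml_bytes):
    """Return the text of every TextCode element on one page, space-separated."""
    root = ET.fromstring(xml_bytes)
    words = []
    for elem in root.iter():
        # Compare local names only, so the namespace URI does not matter here.
        if elem.tag.rsplit("}", 1)[-1] == "TextCode" and elem.text:
            words.append(elem.text)
    return " ".join(words)


def extract_ofd_text(ofd_path, out_dir):
    """Write one <page number>.txt per page found in the OFD archive."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    with zipfile.ZipFile(ofd_path) as archive:
        for name in archive.namelist():
            match = PAGE_RE.match(name)
            if match is None:
                continue  # not a page content file
            target = out_dir / f"{match.group(1)}.txt"
            target.write_text(page_text(archive.read(name)), encoding="utf-8")
            written.append(target)
    return written
```

Calling extract_ofd_text("some.ofd", "ofdtxt") would then replace all of the copy-pasted blocks. Because it walks whatever pages the archive actually contains, neither the page count nor the 29939 upper bound has to be hard-coded, and no bare except is needed to stop the loop.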