liulan posted on 2020-12-19 14:38

Python: batch-extract data for a specified time period from Excel files

import pandas as pd
import os
import datetime


path = r'C:\Users\plm\Desktop\text1'
files = os.listdir(path)



for file in files:
    df =pd.read_excel(file)


    index = (df.iloc[(df['datetime2'] > 2020-11-2)].index)
    ndf = df.drop(index)
    ndf.to_excel(file)               
   
   

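liulan posted on 2020-12-19 14:48

I adapted the code above from an existing template, but it keeps throwing errors. I'm a beginner and don't really know how to fix it. Could anyone tell me what to change?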

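DevenVan posted on 2020-12-19 15:38

It would help to post a screenshot of the error. Going by what is shown above, (df['datetime2'] > 2020-11-2) looks like the problem. My guess is that it should be (df['datetime2'] > '2020-11-2') or (df['datetime2'] > datetime.date(2020,11,2))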
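To make the point about the comparison concrete, here is a minimal sketch. It assumes the datetime2 column holds real datetime values; the sample DataFrame is made up for illustration and stands in for one workbook. Written without quotes, 2020-11-2 is just integer subtraction, so the column would be compared against a number rather than a date. Also, .iloc is positional, so a boolean mask is normally applied with plain [] or .loc instead.

```python
import pandas as pd

# Hypothetical sample data standing in for the contents of one workbook.
df = pd.DataFrame({
    "datetime2": pd.to_datetime(["2020-10-30", "2020-11-02", "2020-11-05"]),
    "value": [1, 2, 3],
})

# Without quotes this is arithmetic, not a date.
print(2020 - 11 - 2)  # prints 2007

# Compare against an explicit timestamp instead.
cutoff = pd.Timestamp("2020-11-02")

# Keep the rows on or before the cutoff. This has the same effect as
# dropping the rows whose date is later than the cutoff.
kept = df[df["datetime2"] <= cutoff]
print(kept)
```

Filtering with a boolean mask avoids the separate index/drop step from the original post, although df.drop(df[df["datetime2"] > cutoff].index) should give the same rows.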

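liulan posted on 2020-12-19 15:50

DevenVan posted on 2020-12-19 15:38
It would help to post a screenshot of the error. Going by what is shown above, (df['datetime2'] > 2020-11-2) looks like the problem. My guess is that it should be (df['date ...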


import pandas as pd
import os
import datetime


path = r'C:\Users\plm\Desktop\text1'
files = os.listdir(path)



for file in files:
    df =pd.read_excel(file)
Traceback (most recent call last):

File "<ipython-input-44-eaa09722c597>", line 12, in <module>
    df =pd.read_excel(file)

File "D:\ProgramData\Anaconda3\lib\site-packages\pandas\util\_decorators.py", line 296, in wrapper
    return func(*args, **kwargs)

File "D:\ProgramData\Anaconda3\lib\site-packages\pandas\io\excel\_base.py", line 304, in read_excel
    io = ExcelFile(io, engine=engine)

File "D:\ProgramData\Anaconda3\lib\site-packages\pandas\io\excel\_base.py", line 867, in __init__
    self._reader = self._engines(self._io)

File "D:\ProgramData\Anaconda3\lib\site-packages\pandas\io\excel\_xlrd.py", line 22, in __init__
    super().__init__(filepath_or_buffer)

File "D:\ProgramData\Anaconda3\lib\site-packages\pandas\io\excel\_base.py", line 353, in __init__
    self.book = self.load_workbook(filepath_or_buffer)

File "D:\ProgramData\Anaconda3\lib\site-packages\pandas\io\excel\_xlrd.py", line 37, in load_workbook
    return open_workbook(filepath_or_buffer)

File "D:\ProgramData\Anaconda3\lib\site-packages\xlrd\__init__.py", line 111, in open_workbook
    with open(filename, "rb") as f:

FileNotFoundError: No such file or directory: 'a.xlsx'


index = (df.iloc[(df['datetime2'] > 2020-11-2)].index)
Traceback (most recent call last):

File "<ipython-input-45-f3d918f4f26d>", line 1, in <module>
    index = (df.iloc[(df['datetime2'] > 2020-11-2)].index)

NameError: name 'df' is not defined


ndf = df.drop(index)
Traceback (most recent call last):

File "<ipython-input-46-411ce523224e>", line 1, in <module>
    ndf = df.drop(index)

NameError: name 'df' is not defined


ndf.to_excel(file)               
Traceback (most recent call last):

File "<ipython-input-47-5c94f9f205b4>", line 1, in <module>
    ndf.to_excel(file)

NameError: name 'ndf' is not defined


ndf = df.drop(index)
Traceback (most recent call last):

File "<ipython-input-48-411ce523224e>", line 1, in <module>
    ndf = df.drop(index)

NameError: name 'df' is not defined

liulan posted on 2020-12-19 15:51

liulan posted on 2020-12-19 15:50

import pandas as pd
import os


There are still quite a lot of errors.

liulan posted on 2020-12-19 15:58

These two screenshots make it easier to see.

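wanwfy posted on 2020-12-19 16:18

Read the error messages.
One says that the file a.xlsx was not found; the other says that df is not defined.

If you print the result at each step, you can check what every step actually produces.

files = os.listdir(path)
print(files)

for file in files:
    # The real path to the file: folder plus file name
    file_path = f"{path}\\{file}"
    df = pd.read_excel(file_path)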
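The FileNotFoundError comes from the fact that os.listdir returns bare file names, not full paths. The sketch below shows one way to build full paths with os.path.join. The helper name list_excel_paths and the temporary folder are made up for illustration; skipping names that start with ~$ is an extra assumption, meant to ignore the lock files Excel creates while a workbook is open.

```python
import os
import tempfile


def list_excel_paths(folder):
    """Return the full paths of the Excel files in `folder`, sorted by name."""
    paths = []
    for name in sorted(os.listdir(folder)):
        # os.listdir yields names only, so join each one with the folder.
        is_excel = name.lower().endswith((".xlsx", ".xls"))
        if is_excel and not name.startswith("~$"):
            paths.append(os.path.join(folder, name))
    return paths


# Demo with empty placeholder files in a throwaway folder.
with tempfile.TemporaryDirectory() as folder:
    for name in ("a.xlsx", "b.xlsx", "notes.txt"):
        open(os.path.join(folder, name), "w").close()
    found = list_excel_paths(folder)
    print([os.path.basename(p) for p in found])
```

Each returned path can be passed straight to pd.read_excel, regardless of the current working directory.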

twm74110 posted on 2020-12-19 16:19

df = pd.read_excel(path + "/" + file)
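Combining the path fix with the date comparison suggested earlier in the thread, the whole loop might look roughly like the sketch below. The function names keep_until and process_folder, the output folder, and the choice to write results to a separate folder rather than overwrite the source files are all assumptions made for illustration. The column name datetime2 and the cutoff date come from the original post.

```python
import os

import pandas as pd


def keep_until(df, column, cutoff):
    """Return the rows whose `column` value is on or before `cutoff`."""
    dates = pd.to_datetime(df[column])
    return df[dates <= pd.Timestamp(cutoff)]


def process_folder(folder, out_folder, column="datetime2", cutoff="2020-11-02"):
    """Filter every .xlsx workbook in `folder` and save it to `out_folder`."""
    os.makedirs(out_folder, exist_ok=True)
    for name in os.listdir(folder):
        # Skip non-Excel files and Excel's temporary lock files.
        if not name.lower().endswith(".xlsx") or name.startswith("~$"):
            continue
        df = pd.read_excel(os.path.join(folder, name))
        result = keep_until(df, column, cutoff)
        result.to_excel(os.path.join(out_folder, name), index=False)


# Example call (the output folder is a placeholder):
# process_folder(r"C:\Users\plm\Desktop\text1", r"C:\Users\plm\Desktop\text1_filtered")
```

Writing to a second folder keeps the original workbooks intact while the filter is still being debugged.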

ciker_li posted on 2020-12-19 16:21

2020-11-2 isn't a normal Excel date format, is it?
Shouldn't it be 2020/11/2?
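On the format question: once a value has been parsed into a timestamp, the separator used to write it should no longer matter. The sketch below assumes the dates might arrive as text. Whether pandas reads the real datetime2 column as datetimes or as strings depends on how the cells are stored in the workbook, so the sample values here are made up.

```python
import pandas as pd

# Both spellings parse to the same timestamp.
a = pd.to_datetime("2020-11-2")
b = pd.to_datetime("2020/11/2")
print(a == b)

# If a column was read in as text, convert it before comparing.
raw = pd.Series(["2020/10/30", "2020/11/02", "2020/11/05"])
dates = pd.to_datetime(raw)
mask = (dates > pd.Timestamp("2020-11-02")).tolist()
print(mask)
```

Printing df.dtypes after reading a workbook is a quick way to check which case applies.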

twm74110 posted on 2020-12-19 16:22

It looks like you picked this up in a hurry, and your programming fundamentals aren't very solid yet.