Python notes: performance analysis in Python

天域至尊 posted on 2022-1-4 18:33

Last edited by 天域至尊 on 2022-1-4 18:33

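These are my study notes on the book 《python高性能编程》 (High Performance Python); for the full details, please refer to the book.
All code in this post and in later posts targets Python 3.x, whereas the code in the book only works on Python 2.x. Please keep this in mind!

Chapter 2: Profiling code
To know how to optimize code, you first need to know how to judge whether it is good or bad.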

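I. Using a decorator to monitor a function's execution time
Use a decorator to measure execution time.
A decorator wraps the target function inside itself, which makes it possible to add things like run-time monitoring, permission checks, and so on.
Code: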
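            import time
            import random
            from functools import wraps
            # Define the decorator
            def monitor_time(a_func):
                # Keep the wrapped function's name, docstring, etc.
                @wraps(a_func)
                def wrapTheFunction(*args, **kwargs):
                    # Record the start time
                    start_time=time.time()
                    # Run the function
                    f=a_func(*args, **kwargs)
                    # Record the end time
                    end_time=time.time()
                    # Print the elapsed time (the message reads "function ran for %f seconds")
                    print("函数运行时间为%f秒"%(end_time-start_time,))
                    # Return the wrapped function's result
                    return f
                return wrapTheFunction
            # Apply the decorator
            @monitor_time
            def run(word:str)->str:
                """
                Sleep for a random amount of time.
                The argument is only there to simulate input and output.
                """
                # Pick a random whole number of seconds below ten to simulate a slow program
                sleep_time=int(random.random()*10)
                # Print a status message (it reads "sleeping for %d seconds")
                print("开始休眠%d秒。"%(sleep_time,))
                # Sleep
                time.sleep(sleep_time)
                # Return the input unchanged
                return word
            # Test run (the argument means "hello there")
            word=run("你好呀")
            print(word)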

Output:
>python3 test.py
开始休眠3秒。
函数运行时间为3.007004秒
你好呀
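
A variation on the decorator above, as a minimal sketch: it uses time.perf_counter(), a monotonic high-resolution clock from the standard library, instead of time.time(), and it stores the measurement on the wrapper instead of only printing it. The names timed, last_elapsed and slow_echo are made up for this illustration and do not come from the book.

```python
import time
from functools import wraps

def timed(func):
    """Hypothetical decorator: record how long each call to func takes."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()   # monotonic clock, unaffected by system clock changes
        try:
            return func(*args, **kwargs)
        finally:
            # Runs even if func raises, so the timing is never lost
            wrapper.last_elapsed = time.perf_counter() - start
    wrapper.last_elapsed = None       # no call has been timed yet
    return wrapper

@timed
def slow_echo(word: str) -> str:
    """Sleep briefly, then return the input unchanged."""
    time.sleep(0.05)
    return word

result = slow_echo("hello")
print(result, "took %.3f seconds" % slow_echo.last_elapsed)
```

Keeping the elapsed time as an attribute makes it possible to log or compare it later, rather than reading it off the console.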

II. Measuring run time with the Unix time command
Usage: /usr/bin/time -p python3 script_name
> /usr/bin/time -p python3 -c "import time; time.sleep(1)"
   real 1.01
   user 0.00
   sys 0.00
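real is the total elapsed (wall-clock) time
user is the CPU time spent in user mode
sys is the CPU time spent in kernel functions
For a single-threaded program, real - user - sys = time spent waiting for a CPU, waiting on I/O, and so on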
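
The same distinction can be observed from inside Python. The sketch below assumes that time.perf_counter() is a fair stand-in for real and time.process_time() for user + sys of the current process; measure is a hypothetical helper name.

```python
import time

def measure(func):
    """Return (wall_seconds, cpu_seconds) for one call to func."""
    wall_start = time.perf_counter()   # wall-clock time, comparable to "real"
    cpu_start = time.process_time()    # user + sys CPU time of this process
    func()
    wall = time.perf_counter() - wall_start
    cpu = time.process_time() - cpu_start
    return wall, cpu

# Sleeping costs wall-clock time but almost no CPU time
wall, cpu = measure(lambda: time.sleep(0.2))
print("wall %.2fs, cpu %.2fs, waiting %.2fs" % (wall, cpu, wall - cpu))
```

A large gap between the two numbers points at waiting (I/O, sleeping, scheduling) rather than computation.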

Add the --verbose flag to print detailed information
/usr/bin/time --verbose python3 script_name
>/usr/bin/time --verbose python3 -c "import time;time.sleep(3)"
            Command being timed: "python3 -c import time;time.sleep(3)"
            User time (seconds): 0.01
            System time (seconds): 0.00
            Percent of CPU this job got: 0%
            Elapsed (wall clock) time (h:mm:ss or m:ss): 0:03.01
            Average shared text size (kbytes): 0
            Average unshared data size (kbytes): 0
            Average stack size (kbytes): 0
            Average total size (kbytes): 0
            Maximum resident set size (kbytes): 5352
            Average resident set size (kbytes): 0
            Major (requiring I/O) page faults: 0
            Minor (reclaiming a frame) page faults: 1486
            Voluntary context switches: 2
            Involuntary context switches: 1
            Swaps: 0
            File system inputs: 0
            File system outputs: 0
            Socket messages sent: 0
            Socket messages received: 0
            Signals delivered: 0
            Page size (bytes): 4096
            Exit status: 0
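The most important figure here is Major (requiring I/O) page faults, which reflects the cost of page faults that had to go to disk. (A major page fault occurs when the requested data is not in memory and has to be loaded from disk; if this happens frequently, it slows the program down.)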
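
On Unix-like systems the page-fault counters reported by /usr/bin/time --verbose can also be read from inside a running script with the standard resource module. This is only a rough sketch (fault_counts is a made-up helper), and the numbers depend heavily on the machine and on what else is in memory.

```python
import resource  # available on Unix-like systems only

def fault_counts():
    """Return (major_faults, minor_faults, max_rss) for the current process."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    return usage.ru_majflt, usage.ru_minflt, usage.ru_maxrss

before = fault_counts()
data = [0] * 1_000_000   # allocate some memory so that pages get touched
after = fault_counts()

print("major page faults:", after[0] - before[0])
print("minor page faults:", after[1] - before[1])
print("max resident set size:", after[2])  # kilobytes on Linux, bytes on macOS
```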

III. Analyzing run time with the cProfile module
Usage: python3 -m cProfile script_name
Common option: -s cumulative sorts the output by cumulative time
Example:
   Test script:
            import random
            def get_random_str(num:int)->str:
                """
                Build a random string; num is the number of loop iterations.
                """
                end_word=""
                for i in range(num):
                    end_word=end_word+str(random.random()*100)+str(i)
                return end_word
            def check_num(num:int,str_data:str)->int:
                """
                Count how many times the digit num occurs in the string.
                """
                num=str(num)
                all_num=0
                for i in str_data:
                    if i==num:
                        all_num=all_num+1
                return all_num
            # Loop 100000 times to build the random string
            num=100000
            # Get the generated random string
            string=get_random_str(num=num)
            # The digit to look for
            check_num_data=3
            # Walk the string character by character and count the occurrences of that digit
            all_num=check_num(num=check_num_data,str_data=string)
            # Print the result (the message reads "occurred %d times in total")
            print("共出现%d次"%(all_num,))
   Sample output:
            >python3 -m cProfile -s cumulative test3.py
            共出现213993次
                     101349 function calls (101322 primitive calls) in 18.307 seconds

               Ordered by: cumulative time

               ncalls  tottime  percall  cumtime  percall filename:lineno(function)
                  3/1 0.000 0.000 18.307 18.307 {built-in method builtins.exec}
                     1 0.000 0.000 18.307 18.307 test3.py:1(<module>)
                     1 18.215 18.215 18.233 18.233 test3.py:3(get_random_str)
                     1 0.072 0.072 0.072 0.072 test3.py:12(check_num)
               100000 0.018 0.000 0.018 0.000 {method 'random' of '_random.Random' objects}
                  6/1 0.000 0.000 0.002 0.002 <frozen importlib._bootstrap>:986(_find_and_load)
                  6/1 0.000 0.000 0.002 0.002 <frozen importlib._bootstrap>:956(_find_and_load_unlocked)
                  6/1 0.000 0.000 0.001 0.001 <frozen importlib._bootstrap>:650(_load_unlocked)
                  2/1 0.000 0.000 0.001 0.001 <frozen importlib._bootstrap_external>:837(exec_module)
                  10/1 0.000 0.000 0.001 0.001 <frozen importlib._bootstrap>:211(_call_with_frames_removed)
                     1 0.000 0.000 0.001 0.001 random.py:1(<module>)
                     6 0.000 0.000 0.001 0.000 <frozen importlib._bootstrap>:890(_find_spec)
                     6 0.000 0.000 0.001 0.000 <frozen importlib._bootstrap_external>:1399(find_spec)
                     6 0.000 0.000 0.001 0.000 <frozen importlib._bootstrap_external>:1367(_get_spec)
                  22 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:1498(find_spec)
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:549(module_from_spec)
                     2 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:909(get_code)
                     4 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:1164(create_module)
                     4 0.000 0.000 0.000 0.000 {built-in method _imp.create_dynamic}
                     1 0.000 0.000 0.000 0.000 bisect.py:1(<module>)
                     1 0.000 0.000 0.000 0.000 random.py:94(__init__)
                     1 0.000 0.000 0.000 0.000 random.py:123(seed)
                     1 0.000 0.000 0.000 0.000 {function Random.seed at 0x7fca33107790}
                  96 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:121(_path_join)
                     2 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:638(_compile_bytecode)
                     2 0.000 0.000 0.000 0.000 {built-in method marshal.loads}
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:477(_init_module_attrs)
                  30 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:135(_path_stat)
                  96 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:123(<listcomp>)
                  30 0.000 0.000 0.000 0.000 {built-in method posix.stat}
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:147(__enter__)
                     1 0.000 0.000 0.000 0.000 {built-in method builtins.print}
                     2 0.000 0.000 0.000 0.000 {built-in method builtins.__build_class__}
                     2 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:1029(get_data)
                     8 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:376(cached)
                     4 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:354(cache_from_source)
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:1493(_get_spec)
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:484(_get_cached)
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:157(_get_module_lock)
                  28 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:1330(_path_importer_cache)
                  110 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:222(_verbose_message)
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:154(_path_isfile)
                  36 0.000 0.000 0.000 0.000 {built-in method builtins.getattr}
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:145(_path_is_mode_type)
                     2 0.000 0.000 0.000 0.000 {built-in method io.open_code}
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:151(__exit__)
                  196 0.000 0.000 0.000 0.000 {method 'rstrip' of 'str' objects}
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:689(spec_from_file_location)
                  100 0.000 0.000 0.000 0.000 {method 'join' of 'str' objects}
                     4 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:127(_path_split)
                     6 0.000 0.000 0.000 0.000 {built-in method posix.getcwd}
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:103(release)
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:78(acquire)
                     6 0.000 0.000 0.000 0.000 {method 'pop' of 'dict' objects}
                     2 0.000 0.000 0.000 0.000 {method 'read' of '_io.BufferedReader' objects}
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:58(__init__)
                     4 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:1172(exec_module)
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:176(cb)
                     2 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:553(_classify_pyc)
                     4 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:1148(__init__)
                     4 0.000 0.000 0.000 0.000 {built-in method builtins.max}
                  18 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:867(__exit__)
                  18 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:863(__enter__)
                     1 0.000 0.000 0.000 0.000 random.py:78(Random)
                  35 0.000 0.000 0.000 0.000 {built-in method builtins.hasattr}
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:725(find_spec)
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:79(_unpack_uint32)
                     2 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:1070(path_stats)
                  38 0.000 0.000 0.000 0.000 {method 'rpartition' of 'str' objects}
                     2 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:586(_validate_timestamp_pyc)
                     4 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:175(_path_isabs)
                     1 0.000 0.000 0.000 0.000 {built-in method math.exp}
                     8 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:129(<genexpr>)
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:389(parent)
                  33 0.000 0.000 0.000 0.000 {built-in method builtins.isinstance}
                     1 0.000 0.000 0.000 0.000 {built-in method posix.register_at_fork}
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:800(find_spec)
                  10 0.000 0.000 0.000 0.000 {method 'endswith' of 'str' objects}
                  30 0.000 0.000 0.000 0.000 {built-in method _imp.acquire_lock}
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:342(__init__)
                  30 0.000 0.000 0.000 0.000 {built-in method _imp.release_lock}
                  22 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:68(_relax_case)
                     6 0.000 0.000 0.000 0.000 {built-in method _imp.is_builtin}
                     4 0.000 0.000 0.000 0.000 {method 'startswith' of 'str' objects}
                  12 0.000 0.000 0.000 0.000 {built-in method _thread.allocate_lock}
                     2 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:516(_check_name_wrapper)
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:143(__init__)
                     2 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:35(_new_module)
                     2 0.000 0.000 0.000 0.000 {built-in method math.log}
                     1 0.000 0.000 0.000 0.000 random.py:709(SystemRandom)
                  12 0.000 0.000 0.000 0.000 {method 'get' of 'dict' objects}
                  12 0.000 0.000 0.000 0.000 {built-in method _thread.get_ident}
                     1 0.000 0.000 0.000 0.000 random.py:103(__init_subclass__)
                     4 0.000 0.000 0.000 0.000 {method 'rfind' of 'str' objects}
                     6 0.000 0.000 0.000 0.000 {built-in method from_bytes}
                     2 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:999(__init__)
                     6 0.000 0.000 0.000 0.000 {built-in method _imp.is_frozen}
                     4 0.000 0.000 0.000 0.000 {built-in method _imp.exec_dynamic}
                     6 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap>:397(has_location)
                     8 0.000 0.000 0.000 0.000 {built-in method builtins.len}
                  10 0.000 0.000 0.000 0.000 {built-in method posix.fspath}
                     1 0.000 0.000 0.000 0.000 {method 'disable' of '_lsprof.Profiler' objects}
                     2 0.000 0.000 0.000 0.000 {built-in method _imp._fix_co_filename}
                     1 0.000 0.000 0.000 0.000 {built-in method math.sqrt}
                     2 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:1024(get_filename)
                     2 0.000 0.000 0.000 0.000 <frozen importlib._bootstrap_external>:834(create_module)
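   Explanation:
         ncalls: the number of times the function was called. Two values in this column (for example 3/1) indicate recursion: the first is the total number of calls and the second is the number of primitive (non-recursive) calls.
         tottime: the total time spent inside the function itself, excluding calls to sub-functions. (Useful for deciding what to optimize.)
         percall: tottime divided by ncalls, i.e. the average time per call.
         cumtime: the cumulative time spent in the function and in all the sub-functions it calls.
         percall (second column): cumtime divided by the number of primitive calls.
         filename:lineno(function): the file name, line number, and name of the profiled function.
   This module can only narrow things down to the function level; it cannot show how long each line takes.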
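
cProfile can also be driven from code instead of the command line, which makes it easy to print only the top few rows rather than the full listing. A minimal sketch using the standard cProfile and pstats modules; build_string is a stand-in workload invented for this example.

```python
import cProfile
import io
import pstats

def build_string(n):
    """Stand-in workload so that the profile has something to show."""
    out = ""
    for i in range(n):
        out = out + str(i)
    return out

profiler = cProfile.Profile()
profiler.enable()
build_string(20000)
profiler.disable()

# Sort by cumulative time and print only the five most expensive entries
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

Limiting the report this way hides most of the import machinery rows that dominate the listing above by count but not by time.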

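IV. Line-by-line analysis of a function with line_profiler
Install: pip3 install line_profiler
Characteristics: it can analyze CPU time, but not memory usage
Usage: add the @profile decorator to the function,
         then run the Python script with kernprof
      kernprof -l -v script_name
         Example:
            Code: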
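import random
# Add the profiling decorator
@profile
def get_random_str(num:int)->str:
    """
    Build a random string; num is the number of loop iterations.
    """
    end_word=""
    for i in range(num):
        end_word=end_word+str(random.random()*100)+str(i)
    return end_word
def check_num(num:int,str_data:str)->int:
    """
    Count how many times the digit num occurs in the string.
    """
    num=str(num)
    all_num=0
    for i in str_data:
        if i==num:
            all_num=all_num+1
    return all_num
# Loop 100000 times to build the random string
num=100000
# Get the generated random string
string=get_random_str(num=num)
# The digit to look for
check_num_data=3
# Walk the string character by character and count the occurrences of that digit
all_num=check_num(num=check_num_data,str_data=string)
# Print the result (the message reads "occurred %d times in total")
print("共出现%d次"%(all_num,))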
         Output:
> kernprof -l -v test3.py
   共出现213355次
   Wrote profile results to test3.py.lprof
   Timer unit: 1e-06 s

   Total time: 17.5578 s
   File: test3.py
   Function: get_random_str at line 3

   Line #      Hits         Time  Per Hit   % Time  Line Contents
   ==============================================================
            3                                        @profile
            4                                        def get_random_str(num:int)->str:
            5                                           """
            6                                           该函数用于生成一个随机字符串，num为循环次数
            7                                           """
            8       1       2.0    2.0    0.0    end_word=""
            9 100001    63935.0    0.6    0.4    for i in range(num):
            10 100000 17493876.0 174.9 99.6       end_word=end_word+str(random.random()*100)+str(i)
            11       1       1.0    1.0    0.0    return end_word
            As you can see, line 10 takes the most time
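
One common way to relieve a hot line like this is to collect the pieces in a list and join them once at the end. The sketch below is a hypothetical variant, not code from the book; it only checks that both approaches build the same string, and whether it is actually faster on a given interpreter should be confirmed by profiling again.

```python
import random

def get_random_str_concat(num: int) -> str:
    """Same approach as get_random_str above: repeated string concatenation."""
    end_word = ""
    for i in range(num):
        end_word = end_word + str(random.random() * 100) + str(i)
    return end_word

def get_random_str_join(num: int) -> str:
    """Hypothetical variant: collect the pieces in a list and join them once."""
    parts = []
    for i in range(num):
        parts.append(str(random.random() * 100))
        parts.append(str(i))
    return "".join(parts)

# With the same seed, both variants draw the same random numbers
random.seed(42)
a = get_random_str_concat(1000)
random.seed(42)
b = get_random_str_join(1000)
print("same result:", a == b)
```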

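V. Analyzing memory usage with memory_profiler
Install: pip3 install memory_profiler psutil
Usage: add the @profile decorator to the function
         1. Print the results directly to the screen
         python3 -m memory_profiler script_name
         2. Write the results to a file, then plot them
            Requires pip3 install matplotlib
            Run the script: mprof run script_name
                     This creates a .dat file in the working directory that stores the memory usage sampled during the run
                     mprof plot
                     Generates and opens the plot; there is no need to give the .dat file name, as it is found automatically
                     mprof clean
                     Deletes all the .dat files
         1. Example:
            Program:
               Same script as in section IV
            Output: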
> python3 -m memory_profiler test3.py
共出现213753次
Filename: test3.py

Line #    Mem usage    Increment  Occurrences   Line Contents
=============================================================
      3 34.938 MiB 34.938 MiB       1 @profile
      4                                     def get_random_str(num:int)->str:
      5                                           """
      6                                           该函数用于生成一个随机字符串，num为循环次数
      7                                           """
      8 34.938 MiB 0.000 MiB       1    end_word=""
      9 41.273 MiB -96312.609 MiB    100001    for i in range(num):
   10 41.273 MiB -96306.273 MiB    100000       end_word=end_word+str(random.random()*100)+str(i)
   11 39.086 MiB -2.188 MiB       1    return end_word
               Explanation:
                     Mem usage: the memory in use after the line has run
                     Increment: the change in memory caused by running that line
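
For a quick check without third-party packages, the standard tracemalloc module gives a comparable before/after view. Note that this is a different tool from memory_profiler: it tracks memory allocated for Python objects rather than the memory of the whole process. A minimal sketch, with build_string as an invented workload:

```python
import tracemalloc

def build_string(n: int) -> str:
    """Build a large string so that there is something to measure."""
    return "".join(str(i) for i in range(n))

tracemalloc.start()
before, _ = tracemalloc.get_traced_memory()
text = build_string(100000)
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# "current - before" plays a role similar to the Increment column
print("increment: %.3f MiB" % ((current - before) / 2**20))
print("peak:      %.3f MiB" % (peak / 2**20))
```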
         2. Example
            Program:
               Same script as in section IV
            Output:
>mprof run test3.py
   mprof: Sampling memory every 0.1s
   running new process
   running as a Python program...
   共出现213120次
>ls
   mprofile_20211231162824.dat  test3.py

>mprof plot
   Using last profile data.

               You can now view the plot

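VI. Measuring heap usage with guppy
Install: pip3 install guppy3
Usage:   1. Import the module: from guppy import hpy
         2. Create an instance: check_ob=hpy()
         3. Set a reference point (baseline) (optional): check_ob.setrelheap()
            Note: once a reference point is set, the objects that exist at that moment become the baseline, and later heap() results are reported as differences from it. If no baseline is set, later heap() results are measured from zero.
         4. Wherever you want to inspect usage, print the data: print(check_ob.heap())
Example:
   1. Code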
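import random
# Import hpy from guppy
from guppy import hpy
def get_random_str(num:int)->str:
    """
    Build a random string; num is the number of loop iterations.
    """
    end_word=""
    for i in range(num):
        end_word=end_word+str(random.random()*100)+str(i)
    return end_word
def check_num(num:int,str_data:str)->int:
    """
    Count how many times the digit num occurs in the string.
    """
    num=str(num)
    all_num=0
    for i in str_data:
        if i==num:
            all_num=all_num+1
    return all_num
# Create the hpy instance
check_ob=hpy()
# Set a reference point: the current objects become the baseline, and later heap() results are differences from it
# If no baseline is set, later heap() results are measured from zero
check_ob.setrelheap()
# Loop 100000 times to build the random string
num=100000
# Print the current heap usage
print(check_ob.heap())
# Get the generated random string
string=get_random_str(num=num)
# Print the current heap usage
print(check_ob.heap())
# The digit to look for
check_num_data=3
# Walk the string character by character and count the occurrences of that digit
all_num=check_num(num=check_num_data,str_data=string)
# Print the result (the message reads "occurred %d times in total")
print("共出现%d次"%(all_num,))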

         2. Output
> python3 test3.py
   Partition of a set of 14 objects. Total size = 2048 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
        0      3  21      824  40       824  40 dict (no owner)
        1      1   7      408  20      1232  60 types.FrameType
        2      4  29      264  13      1496  73 tuple
        3      1   7      200  10      1696  83 bytes
        4      1   7      104   5      1800  88 dict of _pydevd_bundle.pydevd_net_command.NetCommand
        5      1   7       88   4      1888  92 list
        6      1   7       72   4      1960  96 types.BuiltinMethodType
        7      1   7       48   2      2008  98 _pydevd_bundle.pydevd_net_command.NetCommand
        8      1   7       40   2      2048 100 _pydevd_bundle.pydevd_cython.SafeCallWrapper
   Partition of a set of 42 objects. Total size = 2209807 bytes.
    Index  Count   %     Size   % Cumulative  % Kind (class / dict of class)
        0      1   2  2207311 100   2207311 100 str
        1     23  55     1520   0   2208831 100 tuple
        2      1   2      408   0   2209239 100 types.FrameType
        3     12  29      336   0   2209575 100 int
        4      3   7      120   0   2209695 100 _thread.lock
        5      1   2       72   0   2209767 100 types.BuiltinMethodType
        6      1   2       40   0   2209807 100 _pydevd_bundle.pydevd_cython.SafeCallWrapper
   共出现213555次
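
The str row in the second table accounts for almost all of the growth. For a single object, a similar figure can be cross-checked with sys.getsizeof from the standard library. This sketch uses an invented two-million-character ASCII string rather than the script's random string, and it assumes CPython, where ASCII text takes one byte per character plus a small fixed header.

```python
import sys

# Build a string of known length; the content does not matter for the size check
text = "x" * 2_000_000
size = sys.getsizeof(text)   # size of this one str object, in bytes

print("characters:", len(text))
print("bytes:     ", size)
print("overhead:  ", size - len(text))
```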

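VII. Measuring heap usage with dowser
Install: pip install cherrypy dowser-py3
   Note: the library has not kept up with recent Python 3 releases and cannot be used on Python 3; this section is left as a placeholder for now.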

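VIII. Inspecting CPython bytecode with dis
Why: dis is a module in Python's standard library, so nothing needs to be installed
   Bytecode shows directly which variables a Python program creates and which steps it executes. As a rule of thumb, the fewer steps and variables there are, the faster the program tends to run and the less time and memory it uses.
Example 1:
   We implement a function that adds up the integers from 0 to a given value, and look at its bytecode
   Program: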
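# Import the dis module
import dis
def run(num=10000):
    """
    Compute the sum of the integers from 0 up to the target number.
    For an input of 5, the result is 0+1+2+3+4+5=15
    For an input of 6, the result is 0+1+2+3+4+5+6=21
    """
    # Initialize the running total
    answer=0
    # Iterate from zero; the stop value needs +1 so that the last number is included
    for i in range(0,num+1):
        # Add the current number
        answer=answer+i
    # Return the result
    return answer
# Print the bytecode (dis.dis() prints by itself and returns None, so print() adds a final "None" line)
print(dis.dis(run))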
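   Result:
>python3 test2.py
 11           0 LOAD_CONST               1 (0)
              2 STORE_FAST               1 (answer)

 13           4 LOAD_GLOBAL              0 (range)
              6 LOAD_CONST               1 (0)
              8 LOAD_FAST                0 (num)
             10 LOAD_CONST               2 (1)
             12 BINARY_ADD
             14 CALL_FUNCTION            2
             16 GET_ITER
        >>   18 FOR_ITER                12 (to 32)
             20 STORE_FAST               2 (i)

 15          22 LOAD_FAST                1 (answer)
             24 LOAD_FAST                2 (i)
             26 BINARY_ADD
             28 STORE_FAST               1 (answer)
             30 JUMP_ABSOLUTE           18

 17     >>   32 LOAD_FAST                1 (answer)
             34 RETURN_VALUE
   None
         Column 1 is the line number in the source file
         Column 2, when it shows >>, marks an instruction that is the target of a jump from elsewhere in the code
         Column 3 is the instruction's offset (address) followed by the operation name
         Column 4 is the operation's argument
         Column 5 (in parentheses) shows the Python name or value that the argument refers to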
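
The same columns are available programmatically through dis.get_instructions(), which can be handy when comparing functions in a script. A minimal sketch; the attribute names are those of the standard library's Instruction objects, and the exact opcodes differ between Python versions.

```python
import dis

def run(num=10000):
    answer = 0
    for i in range(0, num + 1):
        answer = answer + i
    return answer

rows = []
for ins in dis.get_instructions(run):
    # is_jump_target: True where dis.dis() would print ">>"
    # offset / opname: the instruction's address and operation name
    # arg / argrepr: the raw argument and its human-readable form
    rows.append((ins.is_jump_target, ins.offset, ins.opname, ins.arg, ins.argrepr))

for row in rows:
    print(row)
```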

   The listing above shows the summation step by step. Now let's simplify it by writing a run2 function:
         The simplified function:
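def run2(num=10000):
    """
    Compute the sum of the integers from 0 up to the target number.
    For an input of 5, the result is 0+1+2+3+4+5=15
    For an input of 6, the result is 0+1+2+3+4+5+6=21
    """
    # range(0,num+1) returns an iterable range object; sum() adds up its items
    return sum(range(0,num+1))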

         Let's run both to check whether the two functions do the same thing
            Program:
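# Import the dis module
import dis
def run(num=10000):
    """
    Compute the sum of the integers from 0 up to the target number.
    For an input of 5, the result is 0+1+2+3+4+5=15
    For an input of 6, the result is 0+1+2+3+4+5+6=21
    """
    # Initialize the running total
    answer=0
    # Iterate from zero; the stop value needs +1 so that the last number is included
    for i in range(0,num+1):
        # Add the current number
        answer=answer+i
    # Return the result
    return answer
def run2(num=10000):
    """
    Compute the sum of the integers from 0 up to the target number.
    For an input of 5, the result is 0+1+2+3+4+5=15
    For an input of 6, the result is 0+1+2+3+4+5+6=21
    """
    # range(0,num+1) returns an iterable range object; sum() adds up its items
    return sum(range(0,num+1))
# Sum 0 to 500 with both functions and compare the results to check that they behave the same
num=500
print(run(num))
print(run2(num))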
            Result:
>python3 test2.py
   125250
   125250
         Both functions return the same result for this input, which suggests that they are functionally equivalent.
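
A single input is a fairly weak check. As a sketch of a slightly stronger one, the two functions can be compared over many inputs and against the closed form n(n+1)/2 (the closed form is added here for illustration and is not part of the original notes):

```python
def run(num=10000):
    answer = 0
    for i in range(0, num + 1):
        answer = answer + i
    return answer

def run2(num=10000):
    return sum(range(0, num + 1))

# Compare both functions, and the closed form n*(n+1)/2, over a range of inputs
mismatches = []
for n in range(0, 1000):
    expected = n * (n + 1) // 2
    if not (run(n) == run2(n) == expected):
        mismatches.append(n)

print("inputs checked: 1000, mismatches:", len(mismatches))
```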
         Now let's compare the bytecode of the two functions to judge which one is likely to use less time and space.
            Code:
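# Import the dis module
import dis
def run(num=10000):
    """
    Compute the sum of the integers from 0 up to the target number.
    For an input of 5, the result is 0+1+2+3+4+5=15
    For an input of 6, the result is 0+1+2+3+4+5+6=21
    """
    # Initialize the running total
    answer=0
    # Iterate from zero; the stop value needs +1 so that the last number is included
    for i in range(0,num+1):
        # Add the current number
        answer=answer+i
    # Return the result
    return answer
def run2(num=10000):
    """
    Compute the sum of the integers from 0 up to the target number.
    For an input of 5, the result is 0+1+2+3+4+5=15
    For an input of 6, the result is 0+1+2+3+4+5+6=21
    """
    # range(0,num+1) returns an iterable range object; sum() adds up its items
    return sum(range(0,num+1))
# Print the bytecode of both functions (dis.dis() prints by itself and returns None, hence the "None" lines in the output)
print(dis.dis(run))
print("-"*20)
print(dis.dis(run2))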

            Result:
>python3 test2.py
 11           0 LOAD_CONST               1 (0)
              2 STORE_FAST               1 (answer)

 13           4 LOAD_GLOBAL              0 (range)
              6 LOAD_CONST               1 (0)
              8 LOAD_FAST                0 (num)
             10 LOAD_CONST               2 (1)
             12 BINARY_ADD
             14 CALL_FUNCTION            2
             16 GET_ITER
        >>   18 FOR_ITER                12 (to 32)
             20 STORE_FAST               2 (i)

 15          22 LOAD_FAST                1 (answer)
             24 LOAD_FAST                2 (i)
             26 BINARY_ADD
             28 STORE_FAST               1 (answer)
             30 JUMP_ABSOLUTE           18

 17     >>   32 LOAD_FAST                1 (answer)
             34 RETURN_VALUE
   None
   --------------------
 26           0 LOAD_GLOBAL              0 (sum)
              2 LOAD_GLOBAL              1 (range)
              4 LOAD_CONST               1 (0)
              6 LOAD_FAST                0 (num)
              8 LOAD_CONST               2 (1)
             10 BINARY_ADD
             12 CALL_FUNCTION            2
             14 CALL_FUNCTION            1
             16 RETURN_VALUE
   None
   As you can see, run2 uses fewer variables and fewer bytecode steps than run, which suggests it is the cheaper of the two.
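
Instruction counts are only a proxy for speed, since individual instructions differ in cost (a single call to sum() runs a whole loop in C). The sketch below counts the instructions with dis.get_instructions() and also times both functions with timeit; count_instructions is a made-up helper, and the timings vary from machine to machine.

```python
import dis
import timeit

def run(num=10000):
    answer = 0
    for i in range(0, num + 1):
        answer = answer + i
    return answer

def run2(num=10000):
    return sum(range(0, num + 1))

def count_instructions(func):
    """Hypothetical helper: number of bytecode instructions in func."""
    return sum(1 for _ in dis.get_instructions(func))

count_run = count_instructions(run)
count_run2 = count_instructions(run2)
print("instructions: run=%d run2=%d" % (count_run, count_run2))

# Time 200 calls of each function with the default argument
time_run = timeit.timeit(run, number=200)
time_run2 = timeit.timeit(run2, number=200)
print("seconds: run=%.4f run2=%.4f" % (time_run, time_run2))
```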

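IX. Using a no-op decorator to make debugging easier
Purpose: let the code run normally in production without removing the @profile decorators, so that you don't have to make lots of changes to the test version when releasing
Usage: add a check at the start of the code
   Code snippet: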
import builtins
# Check whether profile exists; if it does not, define a decorator that does nothing
if not hasattr(builtins,'profile'):
    def profile(func):
        def inner(*args,**kwargs):
            return func(*args,**kwargs)
        return inner
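
A shorter variant of the same idea, offered as a sketch rather than as the book's method: look the name up directly and, if it is missing, define a decorator that returns the function unchanged, so no wrapper is added at all. The add function is only there for demonstration.

```python
try:
    profile            # defined by the profiling tool when the script runs under one
except NameError:
    def profile(func):
        """No-op stand-in: hand the function back untouched."""
        return func

@profile
def add(a, b):
    return a + b

print(add(1, 2))
```

Returning func itself keeps the function's name and docstring intact and avoids the cost of an extra call on every invocation.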

   Full code:
import random
import builtins
# Check whether profile exists; if it does not, define a decorator that does nothing
if not hasattr(builtins,'profile'):
    def profile(func):
        def inner(*args,**kwargs):
            return func(*args,**kwargs)
        return inner
# Add the profiling decorator
@profile
def get_random_str(num:int)->str:
    """
    Build a random string; num is the number of loop iterations.
    """
    end_word=""
    for i in range(num):
        end_word=end_word+str(random.random()*100)+str(i)
    return end_word
def check_num(num:int,str_data:str)->int:
    """
    Count how many times the digit num occurs in the string.
    """
    num=str(num)
    all_num=0
    for i in str_data:
        if i==num:
            all_num=all_num+1
    return all_num
# Loop 100000 times to build the random string
num=100000
# Get the generated random string
string=get_random_str(num=num)
# The digit to look for
check_num_data=3
# Walk the string character by character and count the occurrences of that digit
all_num=check_num(num=check_num_data,str_data=string)
# Print the result (the message reads "occurred %d times in total")
print("共出现%d次"%(all_num,))

         Output:
>kernprof -l -v test4.py
   共出现214384次
   Wrote profile results to test4.py.lprof
   Timer unit: 1e-06 s

   Total time: 16.5752 s
   File: test4.py
   Function: get_random_str at line 10

   Line #      Hits         Time  Per Hit   % Time  Line Contents
   ==============================================================
            10                                        @profile
            11                                        def get_random_str(num:int)->str:
            12                                           """
            13                                           该函数用于生成一个随机字符串，num为循环次数
            14                                           """
            15       1       1.0    1.0    0.0    end_word=""
            16 100001    60320.0    0.6    0.4    for i in range(num):
            17 100000 16514837.0 165.1 99.6       end_word=end_word+str(random.random()*100)+str(i)
            18       1       1.0    1.0    0.0    return end_word

            The CPU-time analysis works as expected
>python3 -m memory_profiler test4.py
   共出现213947次
   Filename: test4.py

   Line #    Mem usage    Increment  Occurrences   Line Contents
   =============================================================
            10 34.992 MiB 34.992 MiB       1 @profile
            11                                     def get_random_str(num:int)->str:
            12                                           """
            13                                           该函数用于生成一个随机字符串，num为循环次数
            14                                           """
            15 34.992 MiB 0.000 MiB       1    end_word=""
            16 41.289 MiB -96355.234 MiB    100001    for i in range(num):
            17 41.289 MiB -96348.938 MiB    100000       end_word=end_word+str(random.random()*100)+str(i)
            18 41.160 MiB -0.129 MiB       1    return end_word

            The memory analysis works as expected
> python3 test4.py
   共出现213095次

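            Running it directly works as expected
            So, once this step is added at the top of the script, CPU and memory profiling work with no further changes, and when profiling is finished the script can be released without removing the decorators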

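X. Things to watch out for when profiling
1. Pay attention to temperature: as the CPU heats up, it runs more slowly.
2. When profiling on a laptop, make sure it stays plugged in; on battery power, performance drops considerably, which skews the results.
3. Run the experiment several times and compare the results.
4. Reboot the computer where appropriate before profiling.
5. Don't profile while other resource-hungry programs are running.
6. For other considerations, see the book.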
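
Point 3 can be automated with the standard timeit module. A minimal sketch, where work is a placeholder for the code being measured; reporting the minimum and the median of several runs makes one-off disturbances easier to spot.

```python
import statistics
import timeit

def work():
    """Stand-in workload; replace with the code under test."""
    return sum(range(10000))

# Five independent measurements of 100 calls each
samples = timeit.repeat(work, repeat=5, number=100)

print("min    %.5f s" % min(samples))
print("median %.5f s" % statistics.median(samples))
print("max    %.5f s" % max(samples))
```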

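qiaoyu posted on 2022-1-4 22:15

Excellent, though I haven't finished reading it yet

jnez112358 posted on 2022-1-5 08:49

Studying this right now, thanks

xinyangtuina posted on 2022-1-5 10:30

This should be useful in large-scale production environments
Bookmarking it for now

ZHANchenggu posted on 2022-1-5 11:23

Nicely organized notes, thumbs up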


吾爱破解 - 52pojie.cn's Archiver
