Bored on an all-night overtime shift, so I scraped a novel to read

carole1102 · posted 2019-11-30 22:48

I heard that Doupo Cangqiong (Battle Through the Heavens) is "terrifying indeed", so I scraped it to take a look...

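import re
import time

import requests

# Browser-like User-Agent so the site serves the normal page
hds = {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.120 Safari/537.36'
}

# Output file; 'a+' appends, so re-running the script adds to the existing text
f = open(r'e:\book.txt', 'a+', encoding='utf-8')


def get_txt(url):
    """Fetch one chapter page and append its <p> paragraphs to the file."""
    res = requests.get(url, headers=hds)
    if res.status_code == 200:
        # re.S lets '.' match newlines inside a paragraph
        contents = re.findall('<p>(.*?)</p>', res.content.decode('utf-8'), re.S)
        for content in contents:
            f.write(content + '\n')
    # Any non-200 response is skipped silently


if __name__ == '__main__':
    urls = ['http://www.doupoxs.com/doupocangqiong/{}.html'.format(i) for i in range(2, 1647)]
    for url in urls:
        get_txt(url)
        time.sleep(1)  # pause between requests to go easy on the server

    f.close()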
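
For anyone who wants to see what the regex in the script actually captures without hitting the site, here is a minimal offline sketch. The function name `extract_paragraphs`, the tag-stripping step, and the `sample` HTML are my own assumptions for illustration, not part of the original script or the real page markup. It uses the same `<p>(.*?)</p>` pattern, so it only matches bare `<p>` tags; a tag with attributes such as `<p class="x">` would be missed.

```python
import re

# Same pattern as in the script above; re.S lets "." match newlines,
# so a <p> that spans several lines is still captured.
PARAGRAPH_RE = re.compile(r'<p>(.*?)</p>', re.S)


def extract_paragraphs(html):
    """Return the text of each <p>...</p> in html (hypothetical helper)."""
    paragraphs = []
    for raw in PARAGRAPH_RE.findall(html):
        # Assumption: nested tags such as <br/> are unwanted in the saved text
        text = re.sub(r'<[^>]+>', '', raw).strip()
        if text:  # skip paragraphs that are empty after cleaning
            paragraphs.append(text)
    return paragraphs


# Made-up HTML standing in for a chapter page
sample = '<div><p>First line.</p>\n<p>Second\nline.<br/></p><p> </p></div>'
print(extract_paragraphs(sample))
```

In the scraper, something like this could replace the bare `re.findall` call inside `get_txt` if leftover tags or blank paragraphs show up in the output file.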

autist · posted 2019-11-30 23:48

A newbie who only started two days ago, staring wide-eyed

zxshouxian · posted 2019-12-1 17:52

Nice, impressive

