A Detailed Guide to Python's requests Module

l2430478 · posted on 2021-1-14 19:32

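The more I run other people's code, the more I notice the gaps in my own knowledge, especially with modules: I know how to use them without understanding why they work. So today I am taking a closer look at the module I use most often, requests.
About the module: requests is an HTTP library written in Python and released under the Apache 2.0 license. Its API is more concise than that of the urllib2 module.
requests supports HTTP keep-alive and connection pooling, session persistence through cookies, file uploads, automatic detection of the response encoding, and automatic encoding of internationalized URLs and POST data. It is a high-level wrapper around Python's built-in modules that makes network requests much more pleasant to write, and it covers most of what a browser does at the HTTP level. It is modern, internationalized, and friendly. Within a session, requests keeps connections alive automatically.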
1) Importing the module

import requests

This one should be self-explanatory.

2) Sending requests

Example: fetching a web page (a personal GitHub profile)

import requests
 
r = requests.get('https://github.com/Ranxf')       # the most basic GET request, with no parameters
r1 = requests.get(url='http://dict.baidu.com/s', params={'wd': 'python'})      # GET request with query parameters

The other HTTP methods follow the same pattern:

requests.get('https://github.com/timeline.json')      # GET request
requests.post('http://httpbin.org/post')              # POST request
requests.put('http://httpbin.org/put')                # PUT request
requests.delete('http://httpbin.org/delete')          # DELETE request
requests.head('http://httpbin.org/get')               # HEAD request
requests.options('http://httpbin.org/get')            # OPTIONS request

3) Passing parameters in the URL

>>> url_params = {'key': 'value'}       # pass parameters as a dict; keys whose value is None are not added to the URL
>>> r = requests.get('your url', params=url_params)
>>> print(r.url)
your url?key=value
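
To see how a params dict ends up in the URL without sending anything, the request can be prepared locally. This is only a sketch: the httpbin.org URL and the key names are placeholders I picked for illustration.

```python
import requests

# Hypothetical parameters: one plain value, one None, one list.
url_params = {'key1': 'value1', 'skipped': None, 'key2': ['a', 'b']}

# Request(...).prepare() builds the final URL without opening a connection.
prepared = requests.Request('GET', 'http://httpbin.org/get', params=url_params).prepare()

print(prepared.url)
```

Keys whose value is None are left out, and a list value repeats its key once per item.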

4) Response content

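r.encoding                       # the encoding currently used to decode r.text
r.encoding = 'utf-8'             # set the encoding
r.text                           # response body as a string, decoded with r.encoding (taken from the response headers by default)
r.content                        # response body as bytes; gzip and deflate compression is decoded automatically

r.headers                        # response headers as a dict-like object with case-insensitive keys; r.headers.get(name) returns None for a missing header

r.status_code                    # response status code
r.raw                            # the raw response, i.e. the underlying urllib3 response object; pass stream=True to the request, then call r.raw.read()
r.ok                             # True when the status code is below 400; it does not by itself tell you whether a login succeeded
# Methods
r.json()                         # built-in JSON decoder; raises an exception if the body is not valid JSON
r.raise_for_status()             # raises requests.exceptions.HTTPError for a failed request (4xx or 5xx status)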
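
The case-insensitive behaviour of r.headers can be tried offline with the dictionary class that requests uses for headers. A small sketch; the header name and value are made up for illustration.

```python
from requests.structures import CaseInsensitiveDict

# Stand-in for r.headers; the header value is invented for this example.
headers = CaseInsensitiveDict({'Content-Type': 'application/json'})

content_type = headers['content-type']   # lookup ignores case
missing = headers.get('X-Missing')       # .get() returns None for absent keys

try:
    headers['X-Missing']                 # indexing an absent key raises KeyError
    raised = False
except KeyError:
    raised = True

print(content_type, missing, raised)
```

So .get() is the safe way to read a header that may not be present.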

Sending JSON with POST:

import requests
import json

r = requests.post('https://api.github.com/some/endpoint', data=json.dumps({'some': 'data'}))
print(r.json())
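
As a rough comparison, here is what data=json.dumps(...) and the json= keyword each produce. Nothing is sent; the requests are only prepared, and the endpoint is the same placeholder URL as above.

```python
import json
import requests

url = 'https://api.github.com/some/endpoint'   # placeholder endpoint
payload = {'some': 'data'}

# Neither request is sent; prepare() only builds the headers and body.
with_data = requests.Request('POST', url, data=json.dumps(payload)).prepare()
with_json = requests.Request('POST', url, json=payload).prepare()

print(with_data.headers.get('Content-Type'))   # a plain string body gets no Content-Type
print(with_json.headers.get('Content-Type'))
print(json.loads(with_json.body) == payload)
```

With json= the Content-Type header is filled in for you, while with data=json.dumps(...) you would normally add it yourself.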

5) Custom headers and cookies

header = {'user-agent': 'my-app/0.0.1'}
cookie = {'key': 'value'}
r = requests.get('your url', headers=header, cookies=cookie)   # requests.post accepts the same keyword arguments

import json

data = {'some': 'data'}
headers = {'content-type': 'application/json',
           'User-Agent': 'Mozilla/5.0 (X11; Ubuntu; Linux x86_64; rv:22.0) Gecko/20100101 Firefox/22.0'}

# The body is serialized to JSON so that it matches the declared content-type
r = requests.post('https://api.github.com/some/endpoint', data=json.dumps(data), headers=headers)
print(r.text)
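
To check what actually goes on the wire, the headers and cookies can be inspected on a prepared request. This is only a sketch; httpbin.org is a placeholder host and nothing is sent.

```python
import requests

header = {'user-agent': 'my-app/0.0.1'}
cookie = {'key': 'value'}

# Build the request without sending it.
prepared = requests.Request('GET', 'http://httpbin.org/get',
                            headers=header, cookies=cookie).prepare()

print(prepared.headers['User-Agent'])
print(prepared.headers['Cookie'])
```

The cookies dict is turned into a single Cookie header alongside the custom headers.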

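6) Response status codes
Calling a requests method returns a Response object that holds the server's reply, for example r.text and r.status_code, as seen in the examples above.
Getting the response body as text: when you access r.text, requests decodes the body using the response's text encoding. You can change r.encoding to make r.text decode with an encoding of your choice.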

r = requests.get('http://www.itwhy.org')
print(r.text, '\n{}\n'.format('*'*79), r.encoding)
r.encoding = 'GBK'
print(r.text, '\n{}\n'.format('*'*79), r.encoding)

Example:

import requests
 
r = requests.get('https://github.com/Ranxf')       # the most basic GET request, with no parameters
print(r.status_code)                               # the response status code
r1 = requests.get(url='http://dict.baidu.com/s', params={'wd': 'python'})      # GET request with query parameters
print(r1.url)
print(r1.text)        # print the decoded response body

Output:

/usr/bin/python3.5 /home/rxf/python3_1000/1000/python3_server/python3_requests/demo1.py
200
http://dict.baidu.com/s?wd=python
…………
 
Process finished with exit code 0

r.status_code   # for a 4xx or 5xx status, r.raise_for_status() raises an exception
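
The behaviour of r.ok and r.raise_for_status() can be tried without a server by setting the status code on a hand-built Response. This is just an illustration; in real code the Response comes from requests.get and friends.

```python
import requests

# A hand-built Response stands in for a real server reply (illustrative only).
response = requests.Response()
response.status_code = 404

print(response.ok)   # False, because 404 is an error status

try:
    response.raise_for_status()
    error = None
except requests.exceptions.HTTPError as exc:
    error = exc

print(type(error).__name__)
```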

7) Response

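r.headers                                  # response headers, as a dict-like object
r.request.headers                          # headers that were sent to the server
r.cookies                                  # cookies returned by the server
r.history                                  # the redirect responses that were followed; pass allow_redirects=False to the request to disable redirects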

8) Timeouts

r = requests.get('url', timeout=1)   # timeout in seconds; a single number applies to both the connect and the read phase
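
One way to handle a timeout, sketched under a few assumptions of mine: the function name fetch_status, the (3.05, 10) default, and the injectable get argument are all invented here so the snippet can be exercised without network access.

```python
import requests


def fetch_status(url, timeout=(3.05, 10), get=requests.get):
    """Return the HTTP status code, or None if the request timed out.

    `timeout` is a (connect, read) tuple in seconds; the defaults are
    arbitrary illustration values. `get` is injectable so the function
    can be tried without touching the network.
    """
    try:
        response = get(url, timeout=timeout)
    except requests.exceptions.Timeout:
        # Covers both ConnectTimeout and ReadTimeout.
        return None
    return response.status_code
```

Calling fetch_status('http://m.ctrip.com') would then return a status code or None instead of raising on a slow server.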

9) Session objects, which persist certain parameters across requests

s = requests.Session()
s.auth = ('auth', 'passwd')
s.headers.update({'key': 'value'})   # update() keeps the default headers; assigning a new dict would replace them
r = s.get('url')
r1 = s.get('url1')
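
A quick offline check of what a session adds to each request, using prepare_request so that nothing is sent. The credentials, the X-Demo header, and the httpbin.org URL are placeholders for illustration.

```python
import requests

s = requests.Session()
s.auth = ('user', 'passwd')                 # placeholder credentials
s.headers.update({'X-Demo': 'value'})       # hypothetical header, merged into the defaults

# prepare_request() applies the session's settings without sending anything.
prepared = s.prepare_request(requests.Request('GET', 'http://httpbin.org/get'))

print(prepared.headers['X-Demo'])
print(prepared.headers['Authorization'].startswith('Basic '))
print('User-Agent' in prepared.headers)
```

Every request made through the session carries the same auth and headers, which is what makes it convenient for logged-in crawling.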

10) Proxies

proxies = {'http': 'ip1', 'https': 'ip2'}
requests.get('url', proxies=proxies)
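
To see which proxy would be picked for a given URL, the select_proxy helper in requests.utils can be called directly. As far as I know this is the helper requests uses internally rather than a documented public API, so treat this as an illustration only. The addresses are the sample ones used elsewhere in this post.

```python
from requests.utils import select_proxy

proxies = {
    'http': 'http://10.10.1.10:3128',                 # default for all http:// URLs
    'https': 'http://10.10.1.100:4444',               # default for all https:// URLs
    'http://10.20.1.128': 'http://10.10.1.10:5323',   # override for one specific host
}

print(select_proxy('http://example.com/page', proxies))
print(select_proxy('https://example.com/page', proxies))
print(select_proxy('http://10.20.1.128/page', proxies))
```

A key of the form scheme://hostname takes precedence over the plain scheme key.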

Summary:

import json
import requests

# HTTP request types
# GET
r = requests.get('https://github.com/timeline.json')
# POST
r = requests.post("http://m.ctrip.com/post")
# PUT
r = requests.put("http://m.ctrip.com/put")
# DELETE
r = requests.delete("http://m.ctrip.com/delete")
# HEAD
r = requests.head("http://m.ctrip.com/head")
# OPTIONS
r = requests.options("http://m.ctrip.com/get")

# Reading the response body
print(r.content)  # as bytes; Chinese characters appear as escaped byte sequences
print(r.text)  # as text

# Passing URL parameters
payload = {'keyword': '香港', 'salecityid': '2'}
r = requests.get("http://m.ctrip.com/webapp/tourvisa/visa_list", params=payload)
print(r.url)  # e.g. http://m.ctrip.com/webapp/tourvisa/visa_list?keyword=%E9%A6%99%E6%B8%AF&salecityid=2 (香港 is percent-encoded)

# Getting/changing the page encoding
r = requests.get('https://github.com/timeline.json')
print(r.encoding)


# JSON handling
r = requests.get('https://github.com/timeline.json')
print(r.json())  # uses the decoder built into requests; no `import json` needed for this call

# Custom request headers
url = 'http://m.ctrip.com'
headers = {'User-Agent' : 'Mozilla/5.0 (Linux; Android 4.2.1; en-us; Nexus 4 Build/JOP40D) AppleWebKit/535.19 (KHTML, like Gecko) Chrome/18.0.1025.166 Mobile Safari/535.19'}
r = requests.post(url, headers=headers)
print(r.request.headers)

# More complex POST request
url = 'http://m.ctrip.com'
payload = {'some': 'data'}
r = requests.post(url, data=json.dumps(payload))  # to send a JSON string instead of form fields, serialize the dict with json.dumps first

# POST a multipart-encoded file
url = 'http://m.ctrip.com'
files = {'file': open('report.xls', 'rb')}
r = requests.post(url, files=files)

# Response status code
r = requests.get('http://m.ctrip.com')
print(r.status_code)

# Response headers
r = requests.get('http://m.ctrip.com')
print(r.headers)
print(r.headers['Content-Type'])
print(r.headers.get('content-type'))  # two ways to read a single response header

# Cookies
url = 'http://example.com/some/cookie/setting/url'
r = requests.get(url)
r.cookies['example_cookie_name']    # read a cookie

url = 'http://m.ctrip.com/cookies'
cookies = dict(cookies_are='working')
r = requests.get(url, cookies=cookies)  # send cookies

# Setting a timeout
r = requests.get('http://m.ctrip.com', timeout=0.001)

# Setting a proxy
proxies = {
           "http": "http://10.10.1.10:3128",
           "https": "http://10.10.1.100:4444",
          }
r = requests.get('http://m.ctrip.com', proxies=proxies)


# If the proxy requires a username and password, write it like this:
proxies = {
    "http": "http://user:pass@10.10.1.10:3128/",
}

11) GET request examples

# 1. Without parameters
 
import requests
 
ret = requests.get('https://github.com/timeline.json')
 
print(ret.url)
print(ret.text)
 
# 2. With parameters
 
import requests
 
payload = {'key1': 'value1', 'key2': 'value2'}
ret = requests.get("http://httpbin.org/get", params=payload)
 
print(ret.url)
print(ret.text)

12) POST request examples

# 1. Basic POST
   
import requests
   
payload = {'key1': 'value1', 'key2': 'value2'}
ret = requests.post("http://httpbin.org/post", data=payload)
   
print(ret.text)
   
   
# 2. Sending headers and data
   
import requests
import json
   
url = 'https://api.github.com/some/endpoint'
payload = {'some': 'data'}
headers = {'content-type': 'application/json'}
   
ret = requests.post(url, data=json.dumps(payload), headers=headers)
   
print(ret.text)
print(ret.cookies)
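
For the basic POST case, here is a small offline sketch of how a data dict is encoded. The request is prepared but never sent, and httpbin.org is just the placeholder host from the example.

```python
import requests

payload = {'key1': 'value1', 'key2': 'value2'}

# Prepared but never sent, so no network access is needed.
prepared = requests.Request('POST', 'http://httpbin.org/post', data=payload).prepare()

print(prepared.headers['Content-Type'])
print(prepared.body)
```

A dict passed as data is form-encoded, which is why the second example serializes the payload with json.dumps and sets the content-type header by hand.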

13) Request parameters

def request(method, url, **kwargs):
    """Constructs and sends a :class:`Request <Request>`.
 
    :param method: method for the new :class:`Request` object.
    :param url: URL for the new :class:`Request` object.
    :param params: (optional) Dictionary or bytes to be sent in the query string for the :class:`Request`.
    :param data: (optional) Dictionary, bytes, or file-like object to send in the body of the :class:`Request`.
    :param json: (optional) json data to send in the body of the :class:`Request`.
    :param headers: (optional) Dictionary of HTTP Headers to send with the :class:`Request`.
    :param cookies: (optional) Dict or CookieJar object to send with the :class:`Request`.
    :param files: (optional) Dictionary of ``'name': file-like-objects`` (or ``{'name': file-tuple}``) for multipart encoding upload.
        ``file-tuple`` can be a 2-tuple ``('filename', fileobj)``, 3-tuple ``('filename', fileobj, 'content_type')``
        or a 4-tuple ``('filename', fileobj, 'content_type', custom_headers)``, where ``'content-type'`` is a string
        defining the content type of the given file and ``custom_headers`` a dict-like object containing additional headers
        to add for the file.
    :param auth: (optional) Auth tuple to enable Basic/Digest/Custom HTTP Auth.
    :param timeout: (optional) How long to wait for the server to send data
        before giving up, as a float, or a :ref:`(connect timeout, read
        timeout) <timeouts>` tuple.
    :type timeout: float or tuple
    :param allow_redirects: (optional) Boolean. Set to True if POST/PUT/DELETE redirect following is allowed.
    :type allow_redirects: bool
    :param proxies: (optional) Dictionary mapping protocol to the URL of the proxy.
    :param verify: (optional) whether the SSL cert will be verified. A CA_BUNDLE path can also be provided. Defaults to ``True``.
    :param stream: (optional) if ``False``, the response content will be immediately downloaded.
    :param cert: (optional) if String, path to ssl client cert file (.pem). If Tuple, ('cert', 'key') pair.
    :return: :class:`Response <Response>` object
    :rtype: requests.Response
 
    Usage::
 
      >>> import requests
      >>> req = requests.request('GET', 'http://httpbin.org/get')
      <Response [200]>
    """

Parameter examples

import requests


def param_method_url():
    # requests.request(method='get', url='http://127.0.0.1:8000/test/')
    # requests.request(method='post', url='http://127.0.0.1:8000/test/')
    pass
 
 
def param_param():
    # - can be a dict
    # - can be a string
    # - can be bytes (ASCII only)
 
    # requests.request(method='get',
    # url='http://127.0.0.1:8000/test/',
    # params={'k1': 'v1', 'k2': '水电费'})
 
    # requests.request(method='get',
    # url='http://127.0.0.1:8000/test/',
    # params="k1=v1&k2=水电费&k3=v3&k3=vv3")
 
    # requests.request(method='get',
    # url='http://127.0.0.1:8000/test/',
    # params=bytes("k1=v1&k2=k2&k3=v3&k3=vv3", encoding='utf8'))
 
    # Wrong: bytes passed as params must be ASCII-only, so this one fails
    # requests.request(method='get',
    # url='http://127.0.0.1:8000/test/',
    # params=bytes("k1=v1&k2=水电费&k3=v3&k3=vv3", encoding='utf8'))
    pass
 
 
def param_data():
    # can be a dict
    # can be a string
    # can be bytes
    # can be a file object
 
    # requests.request(method='POST',
    # url='http://127.0.0.1:8000/test/',
    # data={'k1': 'v1', 'k2': '水电费'})
 
    # requests.request(method='POST',
    # url='http://127.0.0.1:8000/test/',
    # data="k1=v1; k2=v2; k3=v3; k3=v4"
    # )
 
    # requests.request(method='POST',
    # url='http://127.0.0.1:8000/test/',
    # data="k1=v1;k2=v2;k3=v3;k3=v4",
    # headers={'Content-Type': 'application/x-www-form-urlencoded'}
    # )
 
    # requests.request(method='POST',
    # url='http://127.0.0.1:8000/test/',
    # data=open('data_file.py', mode='r', encoding='utf-8'), # file contents: k1=v1;k2=v2;k3=v3;k3=v4
    # headers={'Content-Type': 'application/x-www-form-urlencoded'}
    # )
    pass
 
 
def param_json():
    # Serializes the value passed as json into a string, as json.dumps(...) would,
    # then sends it in the request body with the header Content-Type: application/json
    requests.request(method='POST',
                     url='http://127.0.0.1:8000/test/',
                     json={'k1': 'v1', 'k2': '水电费'})
 
 
def param_headers():
    # Send request headers to the server
    requests.request(method='POST',
                     url='http://127.0.0.1:8000/test/',
                     json={'k1': 'v1', 'k2': '水电费'},
                     headers={'Content-Type': 'application/x-www-form-urlencoded'}
                     )
 
 
def param_cookies():
    # Send cookies to the server
    requests.request(method='POST',
                     url='http://127.0.0.1:8000/test/',
                     data={'k1': 'v1', 'k2': 'v2'},
                     cookies={'cook1': 'value1'},
                     )
    # A CookieJar also works (a dict is wrapped into one internally)
    from http.cookiejar import CookieJar
    from http.cookiejar import Cookie
 
    obj = CookieJar()
    obj.set_cookie(Cookie(version=0, name='c1', value='v1', port=None, domain='', path='/', secure=False, expires=None,
                          discard=True, comment=None, comment_url=None, rest={'HttpOnly': None}, rfc2109=False,
                          port_specified=False, domain_specified=False, domain_initial_dot=False, path_specified=False)
                   )
    requests.request(method='POST',
                     url='http://127.0.0.1:8000/test/',
                     data={'k1': 'v1', 'k2': 'v2'},
                     cookies=obj)
 
 
def param_files():
    # Send a file
    # file_dict = {
    # 'f1': open('readme', 'rb')
    # }
    # requests.request(method='POST',
    # url='http://127.0.0.1:8000/test/',
    # files=file_dict)
 
    # Send a file with a custom filename
    # file_dict = {
    # 'f1': ('test.txt', open('readme', 'rb'))
    # }
    # requests.request(method='POST',
    # url='http://127.0.0.1:8000/test/',
    # files=file_dict)
 
    # Send a file with a custom filename, giving the content as a string
    # file_dict = {
    # 'f1': ('test.txt', "hahsfaksfa9kasdjflaksdjf")
    # }
    # requests.request(method='POST',
    # url='http://127.0.0.1:8000/test/',
    # files=file_dict)
 
    # Send a file with a custom filename, content type, and extra headers
    # file_dict = {
    #     'f1': ('test.txt', "hahsfaksfa9kasdjflaksdjf", 'application/text', {'k1': '0'})
    # }
    # requests.request(method='POST',
    #                  url='http://127.0.0.1:8000/test/',
    #                  files=file_dict)
 
    pass
 
 
def param_auth():
    from requests.auth import HTTPBasicAuth, HTTPDigestAuth
 
    ret = requests.get('https://api.github.com/user', auth=HTTPBasicAuth('wupeiqi', 'sdfasdfasdf'))
    print(ret.text)
 
    # ret = requests.get('http://192.168.1.1',
    # auth=HTTPBasicAuth('admin', 'admin'))
    # ret.encoding = 'gbk'
    # print(ret.text)
 
    # ret = requests.get('http://httpbin.org/digest-auth/auth/user/pass', auth=HTTPDigestAuth('user', 'pass'))
    # print(ret)
    #
 
 
def param_timeout():
    # ret = requests.get('http://google.com/', timeout=1)
    # print(ret)
 
    # ret = requests.get('http://google.com/', timeout=(5, 1))
    # print(ret)
    pass
 
 
def param_allow_redirects():
    ret = requests.get('http://127.0.0.1:8000/test/', allow_redirects=False)
    print(ret.text)
 
 
def param_proxies():
    # proxies = {
    # "http": "61.172.249.96:80",
    # "https": "http://61.185.219.126:3128",
    # }
 
    # proxies = {'http://10.20.1.128': 'http://10.10.1.10:5323'}
 
    # ret = requests.get("http://www.proxy360.cn/Proxy", proxies=proxies)
    # print(ret.headers)
 
 
    # from requests.auth import HTTPProxyAuth
    #
    # proxyDict = {
    # 'http': '77.75.105.165',
    # 'https': '77.75.105.165'
    # }
    # auth = HTTPProxyAuth('username', 'mypassword')
    #
    # r = requests.get("http://www.google.com", proxies=proxyDict, auth=auth)
    # print(r.text)
 
    pass
 
 
def param_stream():
    ret = requests.get('http://127.0.0.1:8000/test/', stream=True)
    print(ret.content)
    ret.close()
 
    # from contextlib import closing
    # with closing(requests.get('http://httpbin.org/get', stream=True)) as r:
    #     # Process the response here.
    #     for i in r.iter_content():
    #         print(i)
 
 
def requests_session():
    import requests
 
    session = requests.Session()
 
    ### 1. Visit any page first to obtain a cookie
 
    i1 = session.get(url="http://dig.chouti.com/help/service")
 
    ### 2. Log in, carrying the cookie from the previous request; the backend authorizes the gpsd value in that cookie
    i2 = session.post(
        url="http://dig.chouti.com/login",
        data={
            'phone': "8615131255089",
            'password': "xxxxxx",
            'oneMonth': ""
        }
    )
 
    i3 = session.post(
        url="http://dig.chouti.com/link/vote?linksId=8589623",
    )
    print(i3.text)
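
For the files parameter, a multipart body can be inspected without a running server by preparing the request. A minimal sketch: the field name f1 and the filename test.txt follow the examples above, while the file content and the text/plain type are invented here.

```python
import requests

# File content given inline as a string, with an explicit content type.
file_dict = {'f1': ('test.txt', 'hello requests', 'text/plain')}

prepared = requests.Request('POST', 'http://127.0.0.1:8000/test/', files=file_dict).prepare()

content_type = prepared.headers['Content-Type']
print(content_type.startswith('multipart/form-data; boundary='))
print(b'filename="test.txt"' in prepared.body)
print(b'hello requests' in prepared.body)
```

The boundary string is generated per request, so only the prefix of the Content-Type header is stable.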

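吾爱支持 · posted on 2021-1-14 19:56

Thanks for sharing. Bookmarked, and sending hearts!!

VIP007 · posted on 2021-1-14 20:02

Learned from this. A bookmark is the best gift I can give.

ixeliap · posted on 2021-1-14 22:45

Nice, have a like.

a3ddr · posted on 2021-1-14 22:58

Bookmarked. Turns out what I had learned before was not nearly detailed enough.

Working through this gave me a very thorough picture of requests.

Thanks.

fisk9r · posted on 2021-1-14 23:04

Very detailed write-up, OP. Marking this; I am learning web scraping too.

Dll30 · posted on 2021-1-16 07:00

Well written, with one small flaw: the headings are too large and the layout is not very tidy.

Showing my support.

Xingyemao · posted on 2021-1-16 13:39

Very detailed, thank you!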

xjshuaishuai · posted on 2021-1-16 14:56

Bookmarked for study, thanks!
