吾爱破解 - 52pojie.cn

[Discussion] How can Python's BeautifulSoup library do the same thing? (Reading a hierarchical list from HTML)

guyinqian posted on 2022-4-23 21:04
How can the same result be achieved in Python with the BeautifulSoup library? (Reading the hierarchical category list from the HTML)
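
For reference, here is a minimal sketch of one way BeautifulSoup could read the three-level list. It assumes the nesting seen in the posted markup: each `li.item` under `ul.all-sort-list` carries the top-level category in `h2 > a`, and each `dl` inside it holds a second-level `dt` (whose link sits in the `onclick` attribute) followed by third-level `dd > a` links. The names `SAMPLE_HTML` and `parse_categories` are made up for illustration, and the sample is a trimmed stand-in for the real page.

```python
import re

from bs4 import BeautifulSoup

# Trimmed stand-in for the page markup (assumption: the real page nests the same way).
SAMPLE_HTML = '''
<ul class="all-sort-list tit">
  <li class="item mod_cate">
    <h2><i class="arrow_dot fr"></i><a href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6093">文学[165]</a>
    </h2>
    <div class="item-list clearfix mod_subcate">
      <div class="subitem mod_subcate_main">
        <dl>
          <dt onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6094'">文学理论[11]
          </dt>
          <dd>
            <a href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6107">文学创作论[5]</a>
            <a href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6114">各体文学理论和创作方...[3]</a>
          </dd>
        </dl>
      </div>
    </div>
  </li>
</ul>
'''


def parse_categories(html):
    """Hypothetical helper: return the category tree as a list of nested dicts."""
    soup = BeautifulSoup(html, "html.parser")
    tree = []
    # Level 1: one <li class="item"> per top-level category.
    for item in soup.select("ul.all-sort-list > li.item"):
        top_link = item.select_one("h2 > a")
        top = {
            "name": top_link.get_text(strip=True),
            "href": top_link.get("href"),
            "children": [],
        }
        # Level 2: each <dl> has one <dt>; its URL is embedded in the onclick handler.
        for dl in item.select("dl"):
            dt = dl.find("dt")
            match = re.search(r"href='([^']+)'", dt.get("onclick", ""))
            second = {
                "name": dt.get_text(strip=True),
                "href": match.group(1) if match else None,
                # Level 3: the plain links inside <dd>.
                "children": [
                    {"name": a.get_text(strip=True), "href": a.get("href")}
                    for a in dl.select("dd > a")
                ],
            }
            top["children"].append(second)
        tree.append(top)
    return tree


if __name__ == "__main__":
    # Print the tree with one extra indent per level.
    for top in parse_categories(SAMPLE_HTML):
        print(top["name"], top["href"])
        for second in top["children"]:
            print("    " + second["name"], second["href"])
            for third in second["children"]:
                print("        " + third["name"], third["href"])
```

Passing the full page string instead of the sample (for example `parse_categories(resp)` with the `resp` variable from the code in this post) should work the same way as long as the page keeps this nesting. `get_text(strip=True)` already drops the newlines and tabs around each label, so a separate regex clean-up pass is probably unnecessary for the names.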

[Python]
import re

def fixdesc(fileDesc):
    pattern = r'[\r\n\t]+'  # strip the line breaks, tabs, etc. produced by the formatting of the Science Library (科学文库) page
    fileDesc = re.sub(pattern, "", fileDesc)
    return fileDesc.replace('<br/>', '').replace('<br/', '').replace('<br', '').replace('<b', '')

resp='''

<div style="background-color: #313d4f">
	<div class="com_width1200">
		<div class="container">
			<div class="header_top row">
				<div class="col-md-4 col-xs-12 head_top_div">
					<div class="row">
						<div class="col-md-3 col-xs-12">
							<span id="userIp">82.157.123.54</span>
							<input type="hidden" class="user_ip_hidden" value="82.157.123.54" />
						</div>
						<div class="col-md-9  col-xs-12">
							<span id="org_name"><span></span></span>
						</div>
					</div>
				</div>
				<div class="col-md-3  col-xs-12 head_top_div" style="text-align:center;">
					<!--<span class="online" style="display:none;"><a href="/shop/book/News/detail.do" style="color:#ffff00;font-size:16px;"><i class="fa fa-bell-o"></i>科学文库有奖问答活动</a></span>-->
					<span><a href="https://mp.weixin.qq.com/s/QGkXz3bYp_nFK6y2SdolEA" style="color:#ffff00;"><i class="fa fa-bell-o"></i>&nbsp;校外访问科学出版社系列数据库的方法</a></span>
					<!--<span style="display:none;"><a href="javascript:;"><i class="fa fa-bell-o"></i>2020科学文库有奖问答活动</a></span>-->
				</div>
				<div class="col-md-5 text-right  col-xs-12 head_top_div">			
				<!-- 	<img src="/kxwk5_style/images/avatar.png" class="img-circle" style="width: 30px"> -->
					
 					
					<a id="register" href="/shop/member/Member/create.do"><span class="">注册</span></a>
					<a id="example" href="/shop/main/Login/ssoLogin.do"><span class="margin_left_side index_about_us">登录</span></a>
					<!--<span id="example" data-toggle="modal" data-target="#loginModal" class="margin_left_side index_about_us pointer_link login_btn_modal">登录</span>-->
					<a id="userName" href="/shop/member/Member/show.do" title="个人中心"><span class="margin_left_side"></span></a>
					<img title="退出" id="logoutShow" class="margin_left_side index_about_us pointer_link" src="/kxwk5_style/images/sign_out.png">
											<!--<span class="index_about_us pointer_link"><a href="/shop/announcement1.html">重要公告 </a></span>-->
						<span class="index_about_us pointer_link"><a href="/shop/helpCenter1.html">关于我们 </a></span>
					<!--<span class=""><a href="http://159.226.29.161/shop/main/Login/shopFrame.do">旧版入口 </a></span>-->
				</div>
			</div>
		</div>
	</div>
</div>
<div style="background-color: #ffffff">
	<div class="com_width1200">
		<div class="container">
			<div class="header_bottom row">
				<div class="col-md-4 head_bottom_div">
					<a href="/" target="_parent"><img src="/kxwk5_style/images/index_logo.png"></a>
				</div>
		
				<div class="col-md-6 col-md-offset-2 text-right head_bottom_div">			
					<div class="row">
						<div class="col-lg-8 col-lg-offset-2 col-md-7 col-md-offset-2">
							<div class="input-group">
					      		<div class="input-group-btn">
						        	<button type="button" class="btn btn-default dropdown-toggle search_btn_checkbox" data-toggle="dropdown" aria-haspopup="true" aria-expanded="false"><span class="index_serch_type">关键字</span><span class="caret"></span></button>
						       <ul class="dropdown-menu">
						          	<li><a class="search_type" data-type="关键字" href="#">关键字</a></li>
						          	<li><a class="search_type" data-type="全文" href="#">全文</a></li>
							        </ul>
					      		</div>
					      		<form method="post" id="searchform" action="/shop/book/Booksimple/list.do">
							      	<input name="showQueryModel.nameIsbnAuthor" type="text" class="form-control search_btn_serachbox" aria-label="..." placeholder="在全库检索">
							      	<span class="input-group-btn">
							        	<button class="btn btn-default search_btn" type="button">搜索</button>
							      	</span><span class="input-group-btn"></span>
						      	</form>
						    </div>
					    </div>
					    <div class="col-lg-2 col-md-3 bottom_pro_search">
					    	<a href="/shop/main/Login/advancedSearch.do" target="_parent"><span>高级搜索</span></a>
					    </div>
					</div>
				</div>
			</div>
		</div>
	</div>
</div>
<div class="block_for_scroll"></div>
<div style="background-color: #163273" id="nav_container">
	<div class="com_width1200">
		<div class="container">
		
			<div class="row nav_container" id="nav_container">
				
				<div id="xialamenu" class="col-md-2 col-sm-4 col-xs-5" style="padding-top: 14px;">
					<ul class="sub">
						<li id="showdiv">
							<img src="/kxwk5_style/images/index_dropdown.png" style="cursor: pointer;margin-top: -1px">
							<span class="showdiv">中图分类</span>
							<div id="nav" style="display:none;" class="wrap">
								<ul class="all-sort-list tit" style="margin:0; padding:0">

									<li class="item mod_cate">
										<h2><i class="arrow_dot fr"></i><a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6093">文学[165]</a>
										</h2>
										<div class="item-list clearfix mod_subcate">
											<div class="subitem mod_subcate_main">
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6094'">文学理论[11]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6107">文学创作论[5]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6114">各体文学理论和创作方...[3]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6125">文学评论、文学欣赏[1]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6126'">世界文学[14]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6127">作品评论和研究[9]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6145">作品集[5]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6156'">中国文学[97]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6158">文学评论和研究[11]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6164">各体文学评论和研究[23]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6206">文学史、文学思想史[9]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6208">作品集[6]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6228">诗歌、韵文[3]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6310">小说[5]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6328">报告文学[14]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6340">散文[18]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6371">儿童文学[7]</a>
													</dd>
												</dl>
											</div>
										</div>
									</li>
									<li class="item mod_cate">
										<h2><i class="arrow_dot fr"></i><a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6397">艺术[308]</a>
										</h2>
										<div class="item-list clearfix mod_subcate">
											<div class="subitem mod_subcate_main">
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6398'">艺术理论[35]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6401">艺术与其他科学的关系[3]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6402">艺术美学[1]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6416">造型艺术理论[31]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6422'">世界各国艺术概况[6]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6432">中国艺术[5]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6440">宗教艺术[1]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6441'">绘画[77]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6444">绘画理论[9]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6459">绘画技法[46]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6500">中国绘画作品[4]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6516">各国绘画作品[18]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6529'">书法、篆刻[17]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6530">中国书法、篆刻[15]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6559">外文书法[2]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6560'">雕塑[4]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6572">雕塑技法[2]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6598">中国雕塑作品[2]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6618'">摄影艺术[58]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6619">摄影艺术理论[4]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6629">各种摄影艺术[54]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6649'">工艺美术[42]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6661">图案学[2]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6662">中国工艺美术[33]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6711">各国工艺美术[7]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6713'">音乐[31]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6714">音乐理论[8]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6730">音乐技术理论与方法[8]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6784">器乐理论与演奏法[5]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6830">民族器乐理论和演奏法[3]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6867">中国音乐作品[4]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7053'">舞蹈[4]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7054">舞蹈理论[1]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7147'">戏剧艺术[10]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7148">戏剧艺术理论[1]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7164">舞台艺术[1]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7190">中国戏剧艺术[8]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7233'">电影、电视艺术[16]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7234">电影、电视艺术理论[4]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7246">电影、电视艺术与技术[2]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7264">电影、电视拍摄艺术与...[2]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7272">电影、电视企业组织与...[1]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7286">各种电影、电视:按内...[4]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7338">电影、电视事业[1]</a>
													</dd>
												</dl>
											</div>
										</div>
									</li>
									<li class="item mod_cate">
										<h2><i class="arrow_dot fr"></i><a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7354">历史、地理[2,024]</a>
										</h2>
										<div class="item-list clearfix mod_subcate">
											<div class="subitem mod_subcate_main">
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7355'">史学理论[6]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7357">社会发展理论[3]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7361">历史研究[1]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7365">史学史[1]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7369'">世界史[8]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7370">通史[4]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7381">古代史(公元前40世...[1]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7393">近代史(1640~1...[1]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7397">现代史(1917年~...[2]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7402'">中国史[196]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7403">通史[34]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7422">原始社会(约60万年...[4]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7423">奴隶社会(约公元前2...[13]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7432">封建社会(公元前47...[31]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7508">半殖民地、半封建社会...[9]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7651">民族史志[29]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7656">地方史志[76]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7660'">亚洲史[9]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7661">通史[1]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7667">民族史志[1]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7668">东亚[3]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7720">东南亚[1]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=7879">西亚(西南亚)[3]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=8428'">欧洲史[3]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=8429">通史[1]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=8436">东欧、中欧[2]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=8868'">美洲史[2]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=8905">拉丁美洲[2]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9122'">传记[281]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9126">世界人物传记[3]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9135">中国人物传记[244]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9185'">文物考古[1,166]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9187">纹章学[1]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9188">考古方法[9]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9192">世界文物考古[3]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9193">中国文物考古[1,134]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9286'">风俗习惯[28]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9287">民俗学[1]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9288">世界风俗习惯[9]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9289">中国风俗习惯[17]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9309'">地理[255]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9310">地理学[51]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9325">世界地理[8]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9332">中国地理[138]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=9359">地图[2]</a>
													</dd>
												</dl>
											</div>
										</div>
									</li>
									<li class="item mod_cate">
										<h2><i class="arrow_dot fr"></i><a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=51127">综合性图书[94]</a>
										</h2>
										<div class="item-list clearfix mod_subcate">
											<div class="subitem mod_subcate_main">
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=51128'">丛书[4]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=51129">中国丛书[4]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=51151'">百科全书、类书[9]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=51152">中国百科全书、类书[9]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=51168'">辞典[2]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=51169">中国辞典[1]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=51170">各国辞典[1]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=51172'">论文集、全集、选集、...[19]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=51173">中国论文集、全集、选...[18]</a>
													</dd>
												</dl>
												<dl>
													<dt style="cursor:pointer;" onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=51184'">图书目录、文摘、索引[28]
													</dt>
													<dd>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=51204">各类型目录[4]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=51227">专科目录[16]</a>
														 <a target="_parent" href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=51228">文摘、索引[8]</a>
													</dd>
												</dl>
											</div>
										</div>
									</li>
								</ul>
							</div>
						</li>
					</ul>
				</div>
				<div class="scroll_nav_logo col-md-2 col-sm-4 col-xs-5">
					<img class="scroll_nav_img" src="/kxwk5_style/images/nav_logo.png" />
				</div>
				
				<div class="col-md-10 col-sm-8 col-xs-7">
					<nav class="navbar navbar-default">
	  					<div class="navbar-header">
				      		<button type="button" class="navbar-toggle collapsed" data-toggle="collapse" data-target="#bs-example-navbar-collapse-1" aria-expanded="false">
						        <span class="sr-only">Toggle navigation</span>
						        <span class="icon-bar"></span>
						        <span class="icon-bar"></span>
						        <span class="icon-bar"></span>
			     		 	</button>
				      		<!-- <a style="font-weight: bold;" class="navbar-brand" href="#">数理</a> -->
					    </div>
	 						<div class="collapse navbar-collapse" id="bs-example-navbar-collapse-1">
	     						<ul class="nav navbar-nav nav_link_container">
     									<li class="">
	     									<a class="nav_link" data-name="75e48243889111e7a2df00163e2ed6f9" href="/shop/book/Booksimple/list.do?showQueryModel.dp1Value=75e48243889111e7a2df00163e2ed6f9">
	     										数理
	     									</a>
     									</li>
     									<li class="">
	     									<a class="nav_link" data-name="57ed86a0889211e7a2df00163e2ed6f9" href="/shop/book/Booksimple/list.do?showQueryModel.dp1Value=57ed86a0889211e7a2df00163e2ed6f9">
	     										化学材料
	     									</a>
     									</li>
     									<li class="">
	     									<a class="nav_link" data-name="6a4dcb6a889211e7a2df00163e2ed6f9" href="/shop/book/Booksimple/list.do?showQueryModel.dp1Value=6a4dcb6a889211e7a2df00163e2ed6f9">
	     										生命
	     									</a>
     									</li>
     									<li class="">
	     									<a class="nav_link" data-name="7dee8d8a889211e7a2df00163e2ed6f9" href="/shop/book/Booksimple/list.do?showQueryModel.dp1Value=7dee8d8a889211e7a2df00163e2ed6f9">
	     										地球
	     									</a>
     									</li>
     									<li class="">
	     									<a class="nav_link" data-name="8ab2fb16889211e7a2df00163e2ed6f9" href="/shop/book/Booksimple/list.do?showQueryModel.dp1Value=8ab2fb16889211e7a2df00163e2ed6f9">
	     										资源环境
	     									</a>
     									</li>
     									<li class="">
	     									<a class="nav_link" data-name="99603fad889211e7a2df00163e2ed6f9" href="/shop/book/Booksimple/list.do?showQueryModel.dp1Value=99603fad889211e7a2df00163e2ed6f9">
	     										农林
	     									</a>
     									</li>
     									<li class="">
	     									<a class="nav_link" data-name="a95a1a87889211e7a2df00163e2ed6f9" href="/shop/book/Booksimple/list.do?showQueryModel.dp1Value=a95a1a87889211e7a2df00163e2ed6f9">
	     										医药
	     									</a>
     									</li>
     									<li class="">
	     									<a class="nav_link" data-name="c57f833d889211e7a2df00163e2ed6f9" href="/shop/book/Booksimple/list.do?showQueryModel.dp1Value=c57f833d889211e7a2df00163e2ed6f9">
	     										信息
	     									</a>
     									</li>
     									<li class="">
	     									<a class="nav_link" data-name="d97783da889211e7a2df00163e2ed6f9" href="/shop/book/Booksimple/list.do?showQueryModel.dp1Value=d97783da889211e7a2df00163e2ed6f9">
	     										工程
	     									</a>
     									</li>
     									<li class="">
	     									<a class="nav_link" data-name="e689625e889211e7a2df00163e2ed6f9" href="/shop/book/Booksimple/list.do?showQueryModel.dp1Value=e689625e889211e7a2df00163e2ed6f9">
	     										管理
	     									</a>
     									</li>
     									<li class="">
	     									<a class="nav_link" data-name="f4fe63b3889211e7a2df00163e2ed6f9" href="/shop/book/Booksimple/list.do?showQueryModel.dp1Value=f4fe63b3889211e7a2df00163e2ed6f9">
	     										历史考古
	     									</a>
     									</li>
     									<li class="">
	     									<a class="nav_link" data-name="04f8a72b889311e7a2df00163e2ed6f9" href="/shop/book/Booksimple/list.do?showQueryModel.dp1Value=04f8a72b889311e7a2df00163e2ed6f9">
	     										经济
	     									</a>
     									</li>
     									<li class="">
	     									<a class="nav_link" data-name="1208204e889311e7a2df00163e2ed6f9" href="/shop/book/Booksimple/list.do?showQueryModel.dp1Value=1208204e889311e7a2df00163e2ed6f9">
	     										教育传播
	     									</a>
     									</li>
     									<li class="">
	     									<a class="nav_link" data-name="219dfba2889311e7a2df00163e2ed6f9" href="/shop/book/Booksimple/list.do?showQueryModel.dp1Value=219dfba2889311e7a2df00163e2ed6f9">
	     										法哲社会
	     									</a>
     									</li>
     									<li class="">
	     									<a class="nav_link" data-name="2dd606e6889311e7a2df00163e2ed6f9" href="/shop/book/Booksimple/list.do?showQueryModel.dp1Value=2dd606e6889311e7a2df00163e2ed6f9">
	     										公共阅读
	     									</a>
     									</li>
	     						<!--<li id="zhuanti_flag" class="" style="display:none"><a class="nav_link" data-name="subjectNavLink" href="#">专题</a></li> -->
						 	</ul>
			 			</div>
		 			</div>
				</div>
			</div>
		</div>
	</div>
</div>

<!-- 模态框 登录 -->
<div class="modal fade" id="loginModal" tabindex="-1" role="dialog" aria-labelledby="exampleModalLabel">
	<div class="modal-dialog" role="document">	    
		<div class="modal-content">
  			<div class="modal-header" style="text-align: center;">
    			<button type="button" class="close" data-dismiss="modal" aria-label="Close"><span aria-hidden="true">&times;</span>
    			</button>
    			<h3 class="modal-title" id="exampleModalLabel"><b>欢迎登录系统</b></h3>
  			</div>
     			<div class="modal-body">
       			<form class="form-horizontal">
	          		<div class="form-group">
	          			<div class="col-xs-12">
	            			<input type="text" class="form-control" id="txtName" name="username" placeholder="邮箱">
            			</div>
	          		</div>
          			<div class="form-group" style="margin-bottom: 4px">
	          			<div class="col-xs-12">
		            		<input type="password" class="form-control" id="txtPwd" name="password" placeholder="密码">
	            		</div>
      				</div>
      				<div class="form-group login_ip_container">
      					<div class="col-xs-12">
      						<span>本机IP:82.157.123.54</span>
     						</div>
      				</div>
      				<div class="form-group">
      					<div class="col-xs-6">
      						<input type="text" class="form-control" id="checkcode" name="checkcode" placeholder="验证码">
      					</div>
      					<div class="col-xs-6">
      						<img src="/kaptcha.jpg" id="kaptchaImage"  alt="验证码" title="点击换图片"  onclick="changeImage();"/>
      					</div>
      				</div>

      				<div class="form-group">
      					<div class="col-xs-12">
      						<button id="login_btn" type="button" class="btn">登录系统</button>
      					</div>
      				</div>
       			</form>
     			</div>
     			<div class="modal-footer">
				<div class="form-group">
					<div class="col-xs-12 third_login_title">
						<span>通过第三方账号登录</span>
					</div>
				</div>
				<div class="form-group">
					<div class="col-xs-4">
						<div class="kejiyun_container">
							<a href="/shop/main/Login/kjyTxzLogin.do" id="kjyTxz" target="_parent">
							<img src="/kxwk5_style/images/kejiyun.png">
							</a>
						</div>
					</div>
					<div class="col-xs-4">
						<!--<a href="https://graph.qq.com/oauth2.0/authorize?response_type=code&client_id=101253044&redirect_uri=http%3A%2F%2Fbook.sciencereading.cn%2Fpublic_auth.jspx&scope=all">-->
						<a href="javascript:;" onclick="qqlogin()" >
							<div class="qq_container" id="qq_container">
								<i class="fa fa-qq"></i>
							</div>
						</a>
					</div>
					<div class="col-xs-4">
						<a href="https://open.weixin.qq.com/connect/qrconnect?appid=wxbbe72137831b14a5&redirect_uri=http%3A%2F%2Fbook.sciencereading.cn%2Fshop%2Fmain%2FLogin%2FweixinLogin.do&response_type=code&scope=snsapi_login&state=STATE#wechat_redirect">
						<!--<a href="javascript:;">-->
							<div class=" weixin_container">
								<i class="fa fa-weixin"></i>
							</div>
						</a>
					</div>
				</div>
     			</div>
		</div>
		<div class="login_bottom_container">
			<div class="row">
				<div class="register_now col-xs-7">
					<a href="/shop/member/Member/create.do"><span>还没有账号?立即注册</span></a>
				</div>
				<div class="forget_password col-xs-5">
					<a href="/shop/member/Member/openFindPwd.do"><span>忘记密码?</span></a>
				</div>
			</div>
		</div>
	</div>
</div>
	<script type="text/javascript">
		jQuery('.all-sort-list > .item').hover(function(){
			var eq = jQuery('.all-sort-list > .item').index(this),				//获取当前滑过是第几个元素
				h = jQuery('.all-sort-list').offset().top,						//获取当前下拉菜单距离窗口多少像素
				s = jQuery(window).scrollTop(),									//获取游览器滚动了多少高度
				i = jQuery(this).offset().top,									//当前元素滑过距离窗口多少像素
				item = jQuery(this).children('.item-list').height(),				//下拉菜单子类内容容器的高度
				sort = jQuery('.all-sort-list').height();						//父类分类列表容器的高度
			
			if ( item < sort ){												//如果子类的高度小于父类的高度
				if ( eq == 0 ){
					jQuery(this).children('.item-list').css('top', (i-h));
				} else {
					jQuery(this).children('.item-list').css('top', (i-h)+1);
				}
			} else {
				if ( s > h ) {												//判断子类的显示位置,如果滚动的高度大于所有分类列表容器的高度
					if ( i-s > 0 ){											//则 继续判断当前滑过容器的位置 是否有一半超出窗口一半在窗口内显示的Bug,
						jQuery(this).children('.item-list').css('top', (s-h)+2 );
					} else {
						jQuery(this).children('.item-list').css('top', (s-h)-(-(i-s))+2 );
					}
				} else {
					jQuery(this).children('.item-list').css('top', 3 );
				}
			}	

			jQuery(this).addClass('hover');
			jQuery(this).children('.item-list').css('display','block');
		},function(){
			jQuery(this).removeClass('hover');
			jQuery(this).children('.item-list').css('display','none');
		});

		jQuery('.item > .item-list > .close').click(function(){
			jQuery(this).parent().parent().removeClass('hover');
			jQuery(this).parent().hide();
		});
	</script>
	<style>
		@-webkit-keyframes blink {
			0% { opacity: 1; }
			50% { opacity: 1; }
			50.01% { opacity: 0; }
			100% { opacity: 0; }
		}
		@-moz-keyframes blink {
			0% { opacity: 1; }
			50% { opacity: 1; }
			50.01% { opacity: 0; }
			100% { opacity: 0; }
		}
		@-ms-keyframes blink {
			0% { opacity: 1; }
			50% { opacity: 1; }
			50.01% { opacity: 0; }
			100% { opacity: 0; }
		}
		@-o-keyframes blink {
			0% { opacity: 1; }
			50% { opacity: 1; }
			50.01% { opacity: 0; }
			100% { opacity: 0; }
		}
		.online .fa-bell-o {
			animation: blink .75s linear infinite; 
			-webkit-animation: blink .75s linear infinite;
			-moz-animation: blink .75s linear infinite;
			-ms-animation: blink .75s linear infinite;
			-o-animation: blink .75s linear infinite;
		}
	</style>

'''

# Level 1: every <li class="item mod_cate"> yields (category id, name, book count, inner HTML).
menuList1 = re.findall(r'<li class="item mod_cate">.*?<h2><i class="arrow_dot fr"></i><a target="_parent" href="/shop/book/Booksimple/list.do\?showQueryModel.bookclcId=(.*?)">(.*?)\[(.*?)\]</a>(.*?)</li>', resp, re.S)

for item1 in menuList1:
    print(f'・{fixdesc(item1[1])}(共有图书{fixdesc(item1[2])}本):{fixdesc(item1[0])}')
    # Level 2: each <dt> carries the id in its onclick handler; the following <dd> holds the level-3 links.
    menuList2 = re.findall(r'<dt style=".*?" onclick="parent.location.href=\'/shop/book/Booksimple/list.do\?showQueryModel.bookclcId=(.*?)\'">(.*?)\[(.*?)\].*?</dt>.*?<dd>(.*?)</dd>', item1[3], re.S)
    for item2 in menuList2:
        print(f'  ・{fixdesc(item2[1])}(共有图书{fixdesc(item2[2])}本):{fixdesc(item2[0])}')
        # Level 3: plain <a> links inside the <dd>.
        menuList3 = re.findall(r'<a target="_parent" href="/shop/book/Booksimple/list.do\?showQueryModel.bookclcId=(.*?)">(.*?)\[(.*?)\].*?</a>', item2[3], re.S)
        for item3 in menuList3:
            print(f'    ・{fixdesc(item3[1])}(共有图书{fixdesc(item3[2])}本):{fixdesc(item3[0])}')
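
To make the three capture groups concrete, here is a minimal sketch that runs the third-level pattern on a tiny fragment. The fragment is made up for illustration (it is shaped like the pattern expects, not copied from the real page), the dots in the URL are escaped here although the original pattern leaves them unescaped, and `to_int` is a hypothetical helper added because some counts contain a thousands separator.

```python
import re

# Hypothetical fragment shaped like the body of one <dd>; not copied from the real page.
sample = (
    '<a target="_parent" href="/shop/book/Booksimple/list.do'
    '?showQueryModel.bookclcId=6107">文学创作论[5]</a>'
    '<a target="_parent" href="/shop/book/Booksimple/list.do'
    '?showQueryModel.bookclcId=9193">中国文物考古[1,134]</a>'
)

# Same structure as the third-level pattern, with the literal dots escaped.
pattern = (r'<a target="_parent" href="/shop/book/Booksimple/list\.do'
           r'\?showQueryModel\.bookclcId=(.*?)">(.*?)\[(.*?)\].*?</a>')


def to_int(count):
    """Hypothetical helper: turn a count such as '1,134' into an int."""
    return int(count.replace(',', ''))


# Group order in each match is (id, name, count).
rows = [(name, to_int(count), clc_id)
        for clc_id, name, count in re.findall(pattern, sample, re.S)]
for name, count, clc_id in rows:
    print(name, count, clc_id)
```

Each match is a tuple in the order the groups appear in the pattern, which is why the script above prints index 1 (name) and index 2 (count) before index 0 (id).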


 OP | guyinqian posted on 2022-4-23 21:06
The output of this program is:

・文学(共有图书165本):6093
  ・文学理论(共有图书11本):6094
    ・文学创作论(共有图书5本):6107
    ・各体文学理论和创作方...(共有图书3本):6114
    ・文学评论、文学欣赏(共有图书1本):6125
  ・世界文学(共有图书14本):6126
    ・作品评论和研究(共有图书9本):6127
    ・作品集(共有图书5本):6145
  ・中国文学(共有图书97本):6156
    ・文学评论和研究(共有图书11本):6158
    ・各体文学评论和研究(共有图书23本):6164
    ・文学史、文学思想史(共有图书9本):6206
    ・作品集(共有图书6本):6208
    ・诗歌、韵文(共有图书3本):6228
    ・小说(共有图书5本):6310
    ・报告文学(共有图书14本):6328
    ・散文(共有图书18本):6340
    ・儿童文学(共有图书7本):6371
・艺术(共有图书308本):6397
  ・艺术理论(共有图书35本):6398
    ・艺术与其他科学的关系(共有图书3本):6401
    ・艺术美学(共有图书1本):6402
    ・造型艺术理论(共有图书31本):6416
  ・世界各国艺术概况(共有图书6本):6422
    ・中国艺术(共有图书5本):6432
    ・宗教艺术(共有图书1本):6440
  ・绘画(共有图书77本):6441
    ・绘画理论(共有图书9本):6444
    ・绘画技法(共有图书46本):6459
    ・中国绘画作品(共有图书4本):6500
    ・各国绘画作品(共有图书18本):6516
  ・书法、篆刻(共有图书17本):6529
    ・中国书法、篆刻(共有图书15本):6530
    ・外文书法(共有图书2本):6559
  ・雕塑(共有图书4本):6560
    ・雕塑技法(共有图书2本):6572
    ・中国雕塑作品(共有图书2本):6598
  ・摄影艺术(共有图书58本):6618
    ・摄影艺术理论(共有图书4本):6619
    ・各种摄影艺术(共有图书54本):6629
  ・工艺美术(共有图书42本):6649
    ・图案学(共有图书2本):6661
    ・中国工艺美术(共有图书33本):6662
    ・各国工艺美术(共有图书7本):6711
  ・音乐(共有图书31本):6713
    ・音乐理论(共有图书8本):6714
    ・音乐技术理论与方法(共有图书8本):6730
    ・器乐理论与演奏法(共有图书5本):6784
    ・民族器乐理论和演奏法(共有图书3本):6830
    ・中国音乐作品(共有图书4本):6867
  ・舞蹈(共有图书4本):7053
    ・舞蹈理论(共有图书1本):7054
  ・戏剧艺术(共有图书10本):7147
    ・戏剧艺术理论(共有图书1本):7148
    ・舞台艺术(共有图书1本):7164
    ・中国戏剧艺术(共有图书8本):7190
  ・电影、电视艺术(共有图书16本):7233
    ・电影、电视艺术理论(共有图书4本):7234
    ・电影、电视艺术与技术(共有图书2本):7246
    ・电影、电视拍摄艺术与...(共有图书2本):7264
    ・电影、电视企业组织与...(共有图书1本):7272
    ・各种电影、电视:按内...(共有图书4本):7286
    ・电影、电视事业(共有图书1本):7338
・历史、地理(共有图书2,024本):7354
  ・史学理论(共有图书6本):7355
    ・社会发展理论(共有图书3本):7357
    ・历史研究(共有图书1本):7361
    ・史学史(共有图书1本):7365
  ・世界史(共有图书8本):7369
    ・通史(共有图书4本):7370
    ・古代史(公元前40世...(共有图书1本):7381
    ・近代史(1640~1...(共有图书1本):7393
    ・现代史(1917年~...(共有图书2本):7397
  ・中国史(共有图书196本):7402
    ・通史(共有图书34本):7403
    ・原始社会(约60万年...(共有图书4本):7422
    ・奴隶社会(约公元前2...(共有图书13本):7423
    ・封建社会(公元前47...(共有图书31本):7432
    ・半殖民地、半封建社会...(共有图书9本):7508
    ・民族史志(共有图书29本):7651
    ・地方史志(共有图书76本):7656
  ・亚洲史(共有图书9本):7660
    ・通史(共有图书1本):7661
    ・民族史志(共有图书1本):7667
    ・东亚(共有图书3本):7668
    ・东南亚(共有图书1本):7720
    ・西亚(西南亚)(共有图书3本):7879
  ・欧洲史(共有图书3本):8428
    ・通史(共有图书1本):8429
    ・东欧、中欧(共有图书2本):8436
  ・美洲史(共有图书2本):8868
    ・拉丁美洲(共有图书2本):8905
  ・传记(共有图书281本):9122
    ・世界人物传记(共有图书3本):9126
    ・中国人物传记(共有图书244本):9135
  ・文物考古(共有图书1,166本):9185
    ・纹章学(共有图书1本):9187
    ・考古方法(共有图书9本):9188
    ・世界文物考古(共有图书3本):9192
    ・中国文物考古(共有图书1,134本):9193
  ・风俗习惯(共有图书28本):9286
    ・民俗学(共有图书1本):9287
    ・世界风俗习惯(共有图书9本):9288
    ・中国风俗习惯(共有图书17本):9289
  ・地理(共有图书255本):9309
    ・地理学(共有图书51本):9310
    ・世界地理(共有图书8本):9325
    ・中国地理(共有图书138本):9332
    ・地图(共有图书2本):9359
・综合性图书(共有图书94本):51127
  ・丛书(共有图书4本):51128
    ・中国丛书(共有图书4本):51129
  ・百科全书、类书(共有图书9本):51151
    ・中国百科全书、类书(共有图书9本):51152
  ・辞典(共有图书2本):51168
    ・中国辞典(共有图书1本):51169
    ・各国辞典(共有图书1本):51170
  ・论文集、全集、选集、...(共有图书19本):51172
    ・中国论文集、全集、选...(共有图书18本):51173
  ・图书目录、文摘、索引(共有图书28本):51184
    ・各类型目录(共有图书4本):51204
    ・专科目录(共有图书16本):51227
    ・文摘、索引(共有图书8本):51228
莫失莫忘angle posted on 2022-4-23 21:24
MyModHeaven posted on 2022-4-23 22:39
MyModHeaven posted on 2022-4-23 22:42
Take a look:


from bs4 import BeautifulSoup

with open('d:/html.html', 'r', encoding='utf-8') as f:
    html = f.read()
# Calling a soup or tag object is shorthand for find_all().
node_li = BeautifulSoup(html, 'lxml')('li', class_='item mod_cate')
for li in node_li:
    print('='*110, '\n', li.h2.a.string)                                        # first-level category
    for dl in li.div.div('dl'):
        cate = list(dl.stripped_strings)
        print('{}\n    {}:{}'.format('-'*110, cate[0], ' '.join(cate[1:])))    # second-level category, then its third-level categories
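
The reply above prints category names only. If the goal is the same three-level listing as the regex version, including the book count and the category id, something along these lines might work. It is a sketch under assumptions: the HTML fragment is hypothetical (modeled on the regular expressions in the first post, with one `<dt>` and one `<dd>` per `<dl>`), `parse_label` is a made-up helper name, and the built-in `html.parser` backend is used so that lxml is not required.

```python
import re
from bs4 import BeautifulSoup

# Hypothetical fragment modeled on the regexes in the first post;
# the markup of the real page may differ in detail.
html = '''
<ul>
 <li class="item mod_cate">
  <h2><i class="arrow_dot fr"></i><a target="_parent"
      href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6093">文学[165]</a></h2>
  <div class="item-list"><div><dl>
    <dt style="cursor:pointer"
        onclick="parent.location.href='/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6094'">文学理论[11]</dt>
    <dd><a target="_parent"
        href="/shop/book/Booksimple/list.do?showQueryModel.bookclcId=6107">文学创作论[5]</a></dd>
  </dl></div></div>
 </li>
</ul>
'''

ID_RE = re.compile(r'bookclcId=(\d+)')
LABEL_RE = re.compile(r'(.*)\[(.*?)\]')


def parse_label(tag, attr):
    """Hypothetical helper: return (name, count, id) for a tag whose text
    looks like 'name[count]' and whose attribute `attr` holds the id."""
    name, count = LABEL_RE.search(tag.get_text(strip=True)).groups()
    clc_id = ID_RE.search(tag[attr]).group(1)
    return name, count, clc_id


lines = []
soup = BeautifulSoup(html, 'html.parser')
for li in soup.find_all('li', class_='item mod_cate'):
    # Level 1: the link inside <h2>.
    lines.append('・{}(共有图书{}本):{}'.format(*parse_label(li.h2.a, 'href')))
    for dl in li.find_all('dl'):
        # Level 2: the <dt>, whose id sits in the onclick handler.
        lines.append('  ・{}(共有图书{}本):{}'.format(*parse_label(dl.dt, 'onclick')))
        # Level 3: every link inside the <dd>.
        for a in dl.dd.find_all('a'):
            lines.append('    ・{}(共有图书{}本):{}'.format(*parse_label(a, 'href')))
print('\n'.join(lines))
```

BeautifulSoup handles the tree navigation here, while two small regular expressions still pull the id and the `name[count]` label apart, since those live inside attribute values and text rather than in the tag structure.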


hackerbob posted on 2022-4-23 22:44
It should be possible, but XPath and re are a bit easier; I haven't studied BeautifulSoup in depth.
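
For comparison, the XPath route mentioned in this reply could look roughly like the following. This is only a sketch: it assumes lxml is installed, the fragment is hypothetical (simplified from what the regexes in the first post imply), and `BASE` and `tree_rows` are illustrative names.

```python
from lxml import html as lxml_html

# Hypothetical, simplified fragment; the real page has more attributes.
BASE = '/shop/book/Booksimple/list.do?showQueryModel.bookclcId='
fragment = (
    '<ul><li class="item mod_cate">'
    f'<h2><a href="{BASE}6093">文学[165]</a></h2>'
    '<div><div><dl>'
    '<dt>文学理论[11]</dt>'
    f'<dd><a href="{BASE}6107">文学创作论[5]</a>'
    f'<a href="{BASE}6125">文学评论、文学欣赏[1]</a></dd>'
    '</dl></div></div></li></ul>'
)

tree = lxml_html.fromstring(fragment)
tree_rows = []
for li in tree.xpath('//li[@class="item mod_cate"]'):
    level1 = str(li.xpath('string(h2/a)'))          # first-level label
    for dl in li.xpath('.//dl'):
        level2 = str(dl.xpath('string(dt)'))        # second-level label
        level3 = [str(t) for t in dl.xpath('dd/a/text()')]  # third-level labels
        tree_rows.append((level1, level2, level3))
print(tree_rows)
```

The labels still come back as `name[count]` strings, so splitting off the count and reading the id from `href` would need the same small regular expressions as in the other approaches.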