
Python Crawler: Downloading iQiyi VIP Videos

Source: Internet   Collected by: 自由互联   Published: 2022-09-02

I. Third-party libraries

   requests >>> pip install requests   sends HTTP requests to access websites

   tqdm >>> pip install tqdm    progress bar module

II. Development environment

    Version: Python 3.8

    Editor: PyCharm 2021.2

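III. Module installation problems

Press Win + R, type cmd, and run the install command pip install <module name>. (If the installation feels slow, you can switch to a mirror hosted in China.)

Module installation problems:

 - How to install a third-party Python module:

     1. Press Win + R, type cmd, click OK, type the install command pip install <module name> (for example, pip install requests), and press Enter

     2. In PyCharm, click Terminal and type the install command

 - Why an installation fails:

     - Failure 1: pip is not recognized as an internal command

         Fix: set the environment variable (add the Python installation and its Scripts directory to PATH)

     - Failure 2: a large amount of red error text (Read timed out)

         Fix: the network connection timed out, so switch to a mirror (availability of these mirrors may have changed since publication)

             Tsinghua University: https://pypi.tuna.tsinghua.edu.cn/simple

             Alibaba Cloud: https://mirrors.aliyun.com/pypi/simple/

             University of Science and Technology of China: https://pypi.mirrors.ustc.edu.cn/simple/

             Huazhong University of Science and Technology: https://pypi.hustunique.com/

             Shandong University of Technology: https://pypi.sdutlinux.org/

             Douban: https://pypi.douban.com/simple/

             For example: pip3 install -i https://pypi.doubanio.com/simple/ <module name>

     - Failure 3: cmd reports that the module is already installed, or that the installation succeeded, but PyCharm still cannot import it

         Fix: several Python versions may be installed (keep either Anaconda or standalone Python, not both); uninstalling one resolves it

                 Alternatively, the Python interpreter in PyCharm is not configured correctly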
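
For Failure 3, it can help to check which interpreter is actually running and whether it can see the modules. The following is a minimal sketch using only the standard library; the helper names `missing_modules` and `pip_install_command` are made up for illustration, and the Tsinghua mirror is just one of the options listed above. Run it inside PyCharm and compare the printed interpreter path with the one cmd uses.

```python
import importlib.util
import sys


def missing_modules(names):
    """Return the top-level module names the current interpreter cannot find."""
    return [name for name in names if importlib.util.find_spec(name) is None]


def pip_install_command(package, index_url=None):
    """Build a pip command bound to the interpreter running this script."""
    command = [sys.executable, "-m", "pip", "install"]
    if index_url:
        # -i points pip at an alternative package index (a mirror)
        command += ["-i", index_url]
    return command + [package]


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    mirror = "https://pypi.tuna.tsinghua.edu.cn/simple"
    for name in missing_modules(["requests", "tqdm"]):
        print("Missing:", name)
        print("Try:", " ".join(pip_install_command(name, mirror)))
```

Calling pip as `python -m pip` through `sys.executable` installs into the same interpreter that runs the script, which avoids the mismatch between several Python installations.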

IV. Configuring the Python interpreter in PyCharm

1. Choose File >>> Settings >>> Project >>> Python Interpreter

2. Click the gear icon and choose Add

3. Add the path of your Python installation

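V. Installing plugins in PyCharm

1. Choose File >>> Settings >>> Plugins

2. Click Marketplace and type the name of the plugin you want, for example translation for a translation plugin or Chinese for the Chinese language pack

3. Select the plugin and click Install

4. After the installation succeeds, PyCharm offers to restart; click OK, and the plugin takes effect after the restart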

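VI. Basic approach of the crawler

Downloading a video

m3u8: a playlist format for streaming video

   The video is split into ts segments, each with its own URL; the m3u8 URL returns a playlist that gathers the URLs of all ts segments

   This saves bandwidth

   mp4: by contrast, a site that serves mp4 delivers the whole video from a single URL

   Splitting the video into segments relieves pressure on the server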
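
To make the playlist idea concrete, here is a minimal sketch that extracts segment URLs from m3u8 text. The playlist content and the example.com URLs are invented for illustration; a real playlist from a video site will be longer and may carry additional tags.

```python
# Invented sample playlist: lines starting with '#' are tags, the rest are segment URLs
SAMPLE_M3U8 = """#EXTM3U
#EXT-X-TARGETDURATION:10
#EXTINF:10.0,
https://example.com/video/seg-0.ts
#EXTINF:10.0,
https://example.com/video/seg-1.ts
#EXTINF:4.5,
https://example.com/video/seg-2.ts
#EXT-X-ENDLIST
"""


def segment_urls(playlist_text):
    """Return the non-empty, non-tag lines of an m3u8 playlist, in order."""
    urls = []
    for line in playlist_text.splitlines():
        line = line.strip()
        if line and not line.startswith("#"):
            urls.append(line)
    return urls


for segment in segment_urls(SAMPLE_M3U8):
    print(segment)
```

Downloading the segments in this order and concatenating them reproduces the full video stream.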

Implementing a video crawler

   Analyze the data source (the URL whose response contains the m3u8 playlist)

https://cache.video.iqiyi.com/dash?tvid=4789431465776600&bid=600&vid=40468e42c7f89049c1ead0067adbf6bf&src=01010031010000000000&vt=0&rs=1&uid=1637120337&ori=pcw&ps=1&k_uid=fb211523bdc556b600a53cb72de24305&pt=0&d=0&s=&lid=&cf=&ct=&authKey=0c3bcedc75453c44c1e3bcb02e54d0ad&k_tag=1&dfp=a1691ca7d5a6964b49995377607b0302249996fd6c8dca1ebc59539a4f410e402d&locale=zh_cn&prio=%7B%22ff%22%3A%22f4v%22%2C%22code%22%3A2%7D&pck=05fA3TTaBsyaafH2gNMCU7rFlsEK6qA9zeYPH8bDQN9auzFUsVMkYEfSm2Em1CTE4oim3b7&k_err_retries=0&up=&qd_v=2&tm=1659016089991&qdy=a&qds=0&k_ft1=706436220846084&k_ft4=1161084347621380&k_ft5=262145&bop=%7B%22version%22%3A%2210.0%22%2C%22dfp%22%3A%22a1691ca7d5a6964b49995377607b0302249996fd6c8dca1ebc59539a4f410e402d%22%7D&ut=16&vf=56a91d2284dad8c1d32e65e147535c7d
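
When analyzing a request URL like the one above, splitting the query string into its parameters makes it easier to read. The sketch below uses a shortened copy of that URL with only four of its parameters kept; what each parameter means to the server is not documented here, so treat the output as raw material for inspection.

```python
from urllib.parse import parse_qs, urlsplit

# Shortened copy of the request URL above: only four of its parameters are kept
url = (
    "https://cache.video.iqiyi.com/dash"
    "?tvid=4789431465776600&bid=600"
    "&vid=40468e42c7f89049c1ead0067adbf6bf"
    "&prio=%7B%22ff%22%3A%22f4v%22%2C%22code%22%3A2%7D"
)

parts = urlsplit(url)
# parse_qs percent-decodes the values and returns a list per key; keep the first value
params = {key: values[0] for key, values in parse_qs(parts.query).items()}

print(parts.netloc, parts.path)
for key, value in params.items():
    print(f"{key} = {value}")
```

Note that `parse_qs` drops parameters with empty values (such as `s=` in the full URL) unless `keep_blank_values=True` is passed.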

Implementation steps:

   1. Send the request (access the site)

   2. Get the data

   3. Parse the data
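
Step 3 can be tried offline before touching the network. The complete code in the next section reads the playlist from the key path data, program, video, m3u8 at a fixed list index; the sketch below instead searches the video list for the first entry that has an m3u8 value. The sample body is invented and heavily trimmed (the `label` field is hypothetical), and `find_m3u8` is an illustrative helper name.

```python
import json

# Invented, heavily trimmed response body; only the key path mirrors the real code
SAMPLE_BODY = json.dumps({
    "data": {"program": {"video": [
        {"label": "entry without a playlist"},
        {"label": "entry with a playlist",
         "m3u8": "#EXTM3U\n#EXTINF:10.0,\nhttps://example.com/seg-0.ts\n#EXT-X-ENDLIST\n"},
    ]}}
})


def find_m3u8(json_data):
    """Return the first non-empty 'm3u8' text under data.program.video, or None."""
    videos = json_data.get("data", {}).get("program", {}).get("video", [])
    for entry in videos:
        if entry.get("m3u8"):
            return entry["m3u8"]
    return None


playlist = find_m3u8(json.loads(SAMPLE_BODY))
print(playlist)
```

Returning None instead of raising KeyError lets the caller report clearly that the response held no playlist, for example when the cookie has expired.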

VII. Complete code

import re

import requests
from tqdm import tqdm

# Request headers copied from the browser's developer tools.
# The cookie carries the logged-in session, so replace it with your own.
headers = {
    'cookie': 'QC005=fb211523bdc556b600a53cb72de24305; QC006=e0mhjuh843mffyx4kqdsf1po; QP0030=1; TQC030=1; T00404=d229739aacf304df0bbde71c6736c979; QC173=0; QP0034=%7B%22v%22%3A1%2C%22dm%22%3A%7B%22wv%22%3A1%7D%2C%22m%22%3A%7B%22wm-vp9%22%3A1%2C%22wm-av1%22%3A1%7D%7D; QC008=1658151456.1658151456.1659015716.2; nu=0; P00004=.1659015719.b9ba4b25bc; QC160=%7B%22type%22%3A2%2C%22conformLoginType%22%3A0%7D; QY_PUSHMSG_ID=fb211523bdc556b600a53cb72de24305; QYABEX={"mergedAbtest":"4269_B,3075_A,4580_A,1550_B,1707_B","PCW_1_LoginCash":{"value":"1","abtest":"4269_B"},"PCW_1_new_player":{"value":"0","abtest":"3075_A"},"PCW_1_qyhome_recommend_sources":{"value":"0","abtest":"4580_A"},"pcw_home_hover":{"value":"1","abtest":"1550_B"},"PCW-Home-List":{"value":"1","abtest":"1707_B"}}; QP0033=1; T00700=EgcI9L-tIRABEgcI58DtIRABEgcIq8HtIRABEgcIrcHtIRAB; QP0037=60; P00001=05fA3TTaBsyaafH2gNMCU7rFlsEK6qA9zeYPH8bDQN9auzFUsVMkYEfSm2Em1CTE4oim3b7; P00007=05fA3TTaBsyaafH2gNMCU7rFlsEK6qA9zeYPH8bDQN9auzFUsVMkYEfSm2Em1CTE4oim3b7; P00003=1637120337; P00002=%7B%22uid%22%3A1637120337%2C%22pru%22%3A1637120337%2C%22user_name%22%3A%22199****7649%22%2C%22nickname%22%3A%22%5Cu5bcc%5Cu58eb%5Cu5c71%5Cu4e0b2010duo%22%2C%22pnickname%22%3A%22%5Cu5bcc%5Cu58eb%5Cu5c71%5Cu4e0b2010duo%22%2C%22type%22%3A11%2C%22email%22%3A%22%22%7D; P00010=1637120337; P01010=1659024000; P00PRU=1637120337; QC170=1; QC179=%7B%22vipTypes%22%3A%2216%22%2C%22userIcon%22%3A%22%2F%2Fimg7.iqiyipic.com%2Fpassport%2F20200101%2F90%2F90%2Fpassport_1637120337_157780421165796_130_130.jpg%22%2C%22iconPendant%22%3A%22%22%2C%22uid%22%3A1637120337%2C%22bannedVip%22%3Afalse%2C%22allVip%22%3Atrue%7D; QC175=%7B%22upd%22%3Atrue%2C%22ct%22%3A1659016055538%7D; QP0013=16; QC163=1; QP0027=5; __dfp=a1691ca7d5a6964b49995377607b0302249996fd6c8dca1ebc59539a4f410e402d@1659447456189@1658151457189; QY00001=1637120337; QP0025=1; QP0035=5; QP0036=2022728%7C80.672; QC007=https%25252525252525252525253A%25252525252525252525252F%25252525252525252525252Fwww.baidu.com%25252525252525252525252Flink%25252525252525252525253Furl%25252525252525252525253DMI7zNhGN69M3vGsOZZ2uvd0jrUY1WRmmsuYe1yqoYD3%252525252525252525252526wd%25252525252525252525253D%252525252525252525252526eqid%25252525252525252525253Dc8f720430003e4cb0000000662e29222; QC010=160607417; IMS=IggQABj_5IqXBioqCiA4ODgxNzY2YTAyOWZlMzc2ZDBhNDRkMzQzNGZiOTM1NBAAIgAoSjAFciQKIDg4ODE3NjZhMDI5ZmUzNzZkMGE0NGQzNDM0ZmI5MzU0EACCAQCKASQKIgogODg4MTc2NmEwMjlmZTM3NmQwYTQ0ZDM0MzRmYjkzNTQ; QC159=%7B%22color%22%3A%22FFFFFF%22%2C%22channelConfig%22%3A0%2C%22hideRoleTip%22%3A1%2C%22isOpen%22%3A1%2C%22speed%22%3A10%2C%22density%22%3A40%2C%22opacity%22%3A86%2C%22isFilterColorFont%22%3A1%2C%22isOpenMask%22%3A0%2C%22proofShield%22%3A0%2C%22forcedFontSize%22%3A24%2C%22isFilterImage%22%3A1%2C%22defaultSwitch%22%3A0%2C%22hadTip%22%3A1%2C%22clickRole%22%3A0%7D',
    'origin': 'https://www.iqiyi.com',
    'referer': 'https://www.iqiyi.com/v_1ced056f25w.html?vfrm=pcw_home&vfrmblk=712211_dianying&vfrmrst=712211_dianying_float_video_area22',
    'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/101.0.4951.67 Safari/537.36',
}
# Data source found in section VI: its JSON response contains the m3u8 playlist
url = 'https://cache.video.iqiyi.com/dash?tvid=4789431465776600&bid=600&vid=40468e42c7f89049c1ead0067adbf6bf&src=01010031010000000000&vt=0&rs=1&uid=1637120337&ori=pcw&ps=1&k_uid=fb211523bdc556b600a53cb72de24305&pt=0&d=0&s=&lid=&cf=&ct=&authKey=0c3bcedc75453c44c1e3bcb02e54d0ad&k_tag=1&dfp=a1691ca7d5a6964b49995377607b0302249996fd6c8dca1ebc59539a4f410e402d&locale=zh_cn&prio=%7B%22ff%22%3A%22f4v%22%2C%22code%22%3A2%7D&pck=05fA3TTaBsyaafH2gNMCU7rFlsEK6qA9zeYPH8bDQN9auzFUsVMkYEfSm2Em1CTE4oim3b7&k_err_retries=0&up=&qd_v=2&tm=1659016089991&qdy=a&qds=0&k_ft1=706436220846084&k_ft4=1161084347621380&k_ft5=262145&bop=%7B%22version%22%3A%2210.0%22%2C%22dfp%22%3A%22a1691ca7d5a6964b49995377607b0302249996fd6c8dca1ebc59539a4f410e402d%22%7D&ut=16&vf=56a91d2284dad8c1d32e65e147535c7d'
# 1. Send the request
response = requests.get(url=url, headers=headers)
# 2. Get the data: the response body is JSON
json_data = response.json()
# 3. Parse the data: take the m3u8 playlist text from the second video entry
m3u8 = json_data['data']['program']['video'][1]['m3u8']
# Drop the '#E...' tag lines so that only the ts segment URLs remain
ts_list = re.sub('#E.*', '', m3u8).split()
# Download the segments in order and append them to one file
# (the file holds concatenated ts data even though it is named .mp4)
with open('远山淡影.mp4', mode='ab') as f:
    for ts in tqdm(ts_list):
        ts_data = requests.get(ts).content
        f.write(ts_data)
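
The download loop above stops at the first network error and, because it opens the file in append mode, a rerun adds to whatever is already in the file. A more defensive variant might look like the sketch below. It is not the author's method: it uses the standard library's `urllib.request` instead of requests so that it runs without extra packages, and the names `fetch_bytes` and `download_segments`, the retry count, and the delay are all assumptions chosen for illustration.

```python
import time
import urllib.request


def fetch_bytes(url, timeout=10):
    """Download one URL with the standard library and return the body bytes."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read()


def download_segments(urls, out_path, fetch=fetch_bytes, retries=3, delay=1.0):
    """Write every segment to out_path in order, retrying failed downloads."""
    if retries < 1:
        raise ValueError("retries must be at least 1")
    # 'wb' starts from an empty file, so a rerun does not duplicate data
    with open(out_path, "wb") as out_file:
        for url in urls:
            for attempt in range(1, retries + 1):
                try:
                    data = fetch(url)
                    break
                except OSError:
                    # urllib's URLError and socket timeouts are OSError subclasses
                    if attempt == retries:
                        raise
                    time.sleep(delay)
            out_file.write(data)
    return len(urls)
```

Passing the download function in as `fetch` keeps the retry logic testable without a network connection; to keep using requests, a small wrapper such as `lambda u: requests.get(u, timeout=10).content` could be supplied, although requests raises its own exception types that the `except` clause would then need to cover.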
