Doing this:

    from urllib.parse import urljoin
    urljoin('https://site/folder', 'page')

returns https://site/page. That's fine, I can append a / to the base. But when my variable already ends with / and I append another one, I get a double slash:

    >>> urljoin('https://site/folder//', 'page')
    'https://site/folder//page'

urljoin lets this double slash // through when joining URLs without raising any error!
How can I join a list of URL parts like these:

    urljoin('https://site/folder', 'page', 'otherpage')             > https://site/folder/page/otherpage
    urljoin('https://site/folder', 'page', 'otherpage.jsf')         > https://site/folder/page/otherpage.jsf
    urljoin('https://site/folder/', 'page.htm')                     > https://site/folder/page.htm
    urljoin('https://site/folder//', '/page', '///otherpage')       > https://site/folder/page/otherpage
    urljoin('https://site/folder//', '//page/', '//otherpage.php')  > https://site/folder/page/otherpage.php
    urljoin('https://site/folder//', 'page', '/otherpage////')      > https://site/folder/page/otherpage

I'm sure there are different ways to do this.
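    from urllib.parse import urljoin
    from functools import reduce  # python3

    def clean_url(url):
        # Strip surrounding slashes, then add exactly one trailing slash
        # so that urljoin appends instead of replacing the last segment.
        return url.strip('/') + '/'

    def joinurllist(urls):
        # Fold urljoin over the cleaned parts, then drop the final slash
        # so the result matches the outputs asked for in the question.
        return reduce(urljoin, map(clean_url, urls)).rstrip('/')

    joinurllist(['https://site/folder//', 'page', '///otherpage/'])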
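As an alternative to the urljoin-based approach, a plain string join may be enough when the parts are simple path segments. This is only a sketch: `join_url_parts` is a hypothetical helper name, not part of urllib, and it assumes the first part is the full base URL and that no part carries a query string or fragment.

```python
def join_url_parts(*parts):
    """Join URL parts with exactly one slash between them.

    Hypothetical helper: strips leading and trailing slashes from
    every part, skips parts that end up empty, and joins the rest.
    The '//' after the scheme is untouched because str.strip only
    removes characters at the two ends of each string.
    """
    cleaned = (part.strip('/') for part in parts)
    return '/'.join(part for part in cleaned if part)


print(join_url_parts('https://site/folder//', '//page/', '//otherpage.php'))
# https://site/folder/page/otherpage.php
```

Unlike urljoin, this never resolves `..` or treats a leading slash as "start from the root", which is what the examples in the question seem to want. Doubled slashes in the middle of a single part are left as they are.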