Scraping News Site Headlines with a Web Crawler

Code

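from bs4 import BeautifulSoup
import requests

# Browser-like request headers. There is no "Host" entry: requests derives it
# from the URL, and a hard-coded "httpbin.org" would be wrong for any other site.
headers = {
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8,application/signed-exchange;v=b3",
    "Accept-Language": "zh-CN,zh;q=0.9,en;q=0.8,en-US;q=0.7",
    "Upgrade-Insecure-Requests": "1",
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36"
}

def Title(url):
    """Fetch the page at url and print the text of every matching <h4> headline."""
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()  # stop on HTTP errors instead of parsing an error page
    soup = BeautifulSoup(response.text, 'lxml')  # the 'lxml' parser needs the lxml package
    # Headlines are <h4> elements whose class attribute is exactly "news__item-title mt0".
    h4 = soup.find_all('h4', class_='news__item-title mt0')
    for title in h4:
        print(title.get_text())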
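
One detail of the selector is worth illustrating. When the string passed to class_ contains a space, BeautifulSoup compares it against the whole class attribute value, so the same two classes written in a different order are not matched. The sketch below shows this on a made-up HTML fragment (the real site's markup is not shown in this post, so the sample is an assumption) and compares it with a CSS selector, which requires both classes but ignores their order. It uses the built-in "html.parser" rather than lxml so that it runs without extra packages.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; the real site's HTML is not shown in the article.
SAMPLE_HTML = """
<h4 class="news__item-title mt0">First headline</h4>
<h4 class="mt0 news__item-title">Second headline</h4>
<h4 class="news__item-title">Third headline</h4>
"""

# Standard-library parser, so lxml is not needed for this sketch.
soup = BeautifulSoup(SAMPLE_HTML, "html.parser")

# A class_ string containing a space is compared against the whole attribute value.
exact = [h.get_text() for h in soup.find_all("h4", class_="news__item-title mt0")]

# A CSS selector requires both classes but does not care about their order.
by_selector = [h.get_text() for h in soup.select("h4.news__item-title.mt0")]

print(exact)
print(by_selector)
```

If the target page always writes the classes in the same order, the find_all call is enough; the select form is simply more tolerant of markup changes.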

Running it
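
The post does not say which URL the function was run against, so none is filled in here. As a minimal sketch of one way to run the scraper, the script below takes the URL from the command line. The file name titles.py, the helper names extract_titles and fetch_titles, and the reduced header set are all assumptions made for illustration; only the h4 selector and the User-Agent string come from the code above.

```python
import sys

import requests
from bs4 import BeautifulSoup

# Assumed minimal header set; only the User-Agent from the article is kept.
HEADERS = {
    "User-Agent": (
        "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
        "(KHTML, like Gecko) Chrome/77.0.3865.90 Safari/537.36"
    )
}


def extract_titles(html):
    """Return the text of every <h4 class="news__item-title mt0"> in html."""
    soup = BeautifulSoup(html, "html.parser")
    matches = soup.find_all("h4", class_="news__item-title mt0")
    return [h4.get_text(strip=True) for h4 in matches]


def fetch_titles(url):
    """Download url and return its headlines (hypothetical helper)."""
    response = requests.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()  # stop on HTTP errors
    return extract_titles(response.text)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python titles.py <url>
    for title in fetch_titles(sys.argv[1]):
        print(title)
```

Keeping the parsing in its own function means it can be checked against a saved copy of the page without sending a request each time.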

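Article title: "Scraping News Site Headlines with a Web Crawler"

Permalink: https://lula.fun/1037.html

Unless otherwise noted, all articles are original work by Lula(噜啦)

Original article. When reposting, please credit the source and include the article link.
Last modified: October 9, 2019, 08:30 PM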
