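How do you log in to a website with the Scrapy framework? This question often stumps people who are new to Scrapy, so this article covers three ways to do it: sending the cookies of an existing session, posting a login form whose parameters you parse by hand, and posting a login form whose parameters Scrapy parses for you.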
1. Logging in with cookies
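import scrapy


class LoginSpider(scrapy.Spider):
    name = 'login'
    allowed_domains = ['xxx.com']
    start_urls = ['https://www.xxx.com/xx/']
    # Cookies copied from a logged-in browser session, as a dict of name -> value.
    # Left empty here, as in the original; fill in your own values.
    cookies = {}

    def start_requests(self):
        # Attach the cookies to every start request so the site sees a logged-in session.
        for url in self.start_urls:
            yield scrapy.Request(url, cookies=self.cookies, callback=self.parse)

    def parse(self, response):
        # Save the page so you can check whether the login worked.
        with open("01login.html", "wb") as f:
            f.write(response.body)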
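Browser developer tools usually show cookies as a single `Cookie` header string, while `scrapy.Request` expects its `cookies` argument as a dict (or a list of dicts). The following is a minimal sketch of a converter, using only the standard library. The helper name `cookie_header_to_dict` and the sample cookie names are hypothetical and not part of the original article.

```python
def cookie_header_to_dict(header):
    """Turn a raw 'name=value; name2=value2' Cookie header into a dict.

    Hypothetical helper for illustration; pairs without '=' are skipped.
    """
    cookies = {}
    for pair in header.split(";"):
        # Split on the first '=' only, since cookie values may contain '='.
        name, sep, value = pair.strip().partition("=")
        if sep and name:
            cookies[name] = value
    return cookies


# Example with made-up cookie names and values.
raw_header = "sessionid=abc123; token=a=b"
cookies = cookie_header_to_dict(raw_header)
print(cookies)
```

The resulting dict could then be assigned to the spider's `cookies` attribute before the start requests are sent.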
2. Logging in with a POST request, parsing the login parameters from the page by hand
import scrapy


class LoginSpider(scrapy.Spider):
    name = 'login_code'
    allowed_domains = ['xxx.com']
    # 1. The login page
    start_urls = ['https://www.xxx.com/login/']

    def parse(self, response):
        # 2. Build the login form, reading the hidden fields from the page by hand
        login_url = 'https://www.xxx.com/login'
        formdata = {
            "username": "xxx",
            "pwd": "xxx",
            "formhash": response.xpath("//input[@id='formhash']/@value").extract_first(),
            "backurl": response.xpath("//input[@id='backurl']/@value").extract_first(),
        }
        # 3. Send the login request as a POST
        yield scrapy.FormRequest(login_url, formdata=formdata, callback=self.parse_login)

    def parse_login(self, response):
        # 4. Visit the target page; the session cookies set by the login are reused
        member_url = "https://www.xxx.com/member"
        yield scrapy.Request(member_url, callback=self.parse_member)

    def parse_member(self, response):
        # Save the member page so you can check whether the login worked.
        with open("02login.html", 'wb') as f:
            f.write(response.body)
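One thing to watch with hand-parsed parameters: `extract_first()` returns `None` when an XPath matches nothing, for example after the site changes its markup. The sketch below is one possible guard, written with the standard library only. It checks the fields and shows roughly what a URL-encoded form body looks like. The function name `build_form_body` is hypothetical, and this is not Scrapy's own encoding code.

```python
from urllib.parse import urlencode


def build_form_body(fields):
    """Validate login fields and URL-encode them.

    Hypothetical helper: raises ValueError naming every field whose value is None,
    which usually means the XPath for that field did not match.
    """
    missing = [name for name, value in fields.items() if value is None]
    if missing:
        raise ValueError("missing login fields: " + ", ".join(missing))
    return urlencode(fields)


# Example with placeholder values, mirroring the field names used above.
body = build_form_body({"username": "xxx", "pwd": "xxx", "formhash": "abc123"})
print(body)
```

A check like this could run in `parse` before the `FormRequest` is yielded, so a broken selector fails with a clear message instead of a confusing login failure.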
3. Logging in with a POST request, letting Scrapy parse the login parameters automatically
import scrapy


class LoginSpider(scrapy.Spider):
    name = 'login_code2'
    allowed_domains = ['xxx.com']
    # 1. The login page
    start_urls = ['https://www.xxx.com/login/']

    def parse(self, response):
        # 2. Only the credentials are supplied; the other form fields are read from the page
        formdata = {
            "username": "xxx",
            "pwd": "xxx",
        }
        # 3. Send the login request as a POST
        yield scrapy.FormRequest.from_response(
            response,
            formxpath="//*[@id='login_pc']",
            formdata=formdata,
            method="POST",  # force POST even if the form declares GET
            callback=self.parse_login,
        )

    def parse_login(self, response):
        # 4. Visit the target page; the session cookies set by the login are reused
        member_url = "https://www.xxx.com/member"
        yield scrapy.Request(member_url, callback=self.parse_member)

    def parse_member(self, response):
        # Save the member page so you can check whether the login worked.
        with open("03login.html", 'wb') as f:
            f.write(response.body)
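To make the "automatic" part more concrete, here is a rough standard-library sketch of the idea behind `FormRequest.from_response`: collect the fields already present in the form, then overlay the values you supply. It is a simplified illustration, not Scrapy's implementation. The class `FormFieldCollector`, the function `merge_form_fields`, the sample HTML, and the assumption that `login_pc` is the `id` of a `<form>` whose inputs carry `name` attributes are all hypothetical.

```python
from html.parser import HTMLParser


class FormFieldCollector(HTMLParser):
    """Collect name/value pairs of <input> elements inside one <form> (by id)."""

    def __init__(self, form_id):
        super().__init__()
        self.form_id = form_id
        self.in_form = False
        self.fields = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form" and attrs.get("id") == self.form_id:
            self.in_form = True
        elif tag == "input" and self.in_form and attrs.get("name"):
            # Inputs without a value attribute are stored as empty strings.
            self.fields[attrs["name"]] = attrs.get("value") or ""

    def handle_endtag(self, tag):
        if tag == "form":
            self.in_form = False


def merge_form_fields(html, form_id, overrides):
    """Return the form's existing fields with the caller's values laid on top."""
    collector = FormFieldCollector(form_id)
    collector.feed(html)
    return {**collector.fields, **overrides}


# Made-up login page for illustration.
SAMPLE_HTML = """
<form id="login_pc" method="get" action="/login">
  <input type="hidden" name="formhash" value="abc123">
  <input type="hidden" name="backurl" value="/member">
  <input type="text" name="username" value="">
  <input type="password" name="pwd">
</form>
"""

merged = merge_form_fields(SAMPLE_HTML, "login_pc", {"username": "xxx", "pwd": "xxx"})
print(merged)
```

This is why the spider above only needs to pass `username` and `pwd`: hidden fields such as `formhash` and `backurl` are picked up from the page itself.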
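Those are three ways to log in to a website with Scrapy: sending cookies, posting a hand-built form with FormRequest, and letting FormRequest.from_response fill in the form. Thanks for reading!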