Interface | Description |
---|---|
Downloader | The component that downloads web pages and stores them in a Page object. |
Class | Description |
---|---|
AbstractDownloader | Base class for downloaders, providing some common methods. |
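CustomRedirectStrategy | Redirect strategy implementation that supports 302 redirects for POST requests. HttpClient's default redirect setup is httpClientBuilder.setRedirectStrategy(new LaxRedirectStrategy()); in a POST/redirect/POST scenario, that code does not pass along the data of the original request, so this class is modeled on the redirect strategy of the SeimiCrawler project. Original source: https://github.com/zhegexiaohuozi/SeimiCrawler/blob/master/project/src/main/java/cn/wanghaomiao/seimi/http/hc/SeimiRedirectStrategy.java |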
HttpClientDownloader | The HTTP downloader based on HttpClient. |
HttpClientGenerator | |
HttpClientRequestContext | |
HttpUriRequestConverter | |
PhantomJSDownloader | Downloader for pages that require JavaScript rendering. |
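
The POST/redirect/POST behaviour described for CustomRedirectStrategy can be illustrated with a small, library-free sketch. It is not the class's actual implementation and does not use HttpClient: RedirectSketch, Request, and redirect are hypothetical names introduced only for this example, and the model assumes that a redirected POST should be re-sent as a POST with its original body while any other method falls back to GET.

```java
import java.net.URI;
import java.util.Locale;

/** Simplified, library-free model of a redirect decision (all names are hypothetical). */
final class RedirectSketch {

    /** Minimal stand-in for an HTTP request: method, target URI and optional body. */
    static final class Request {
        final String method;
        final URI uri;
        final String body; // null when the request carries no payload

        Request(String method, URI uri, String body) {
            this.method = method;
            this.uri = uri;
            this.body = body;
        }
    }

    /**
     * Builds the follow-up request for a redirect response.
     * A POST is re-sent as a POST with its original body; anything else becomes a GET.
     */
    static Request redirect(Request original, String location) {
        // Resolve a possibly relative Location value against the original request URI.
        URI target = original.uri.resolve(location);
        if ("POST".equals(original.method.toUpperCase(Locale.ROOT))) {
            // Assumption of this sketch: keep the method and carry the payload over.
            return new Request("POST", target, original.body);
        }
        return new Request("GET", target, null);
    }

    public static void main(String[] args) {
        Request login = new Request("POST", URI.create("http://example.com/login"), "user=a&pass=b");
        Request next = redirect(login, "/welcome");
        System.out.println(next.method + " " + next.uri + " body=" + next.body);
    }
}
```

With HttpClient itself, the equivalent logic would live in a RedirectStrategy implementation registered through httpClientBuilder.setRedirectStrategy(...); the sketch only shows the decision that distinguishes it from a redirect that drops the request data.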
Copyright © 2017. All rights reserved.