Package us.codecraft.webmagic.downloader
Downloader is the part that downloads web pages and stores them in Page objects.
Interface Summary
Interface     Description
Downloader    Downloader is the part that downloads web pages and stores them in Page objects.
Class Summary
Class                       Description
AbstractDownloader          Base class of downloader with some common methods.
CustomRedirectStrategy      Redirect strategy implementation that supports 302 redirects of POST requests. HttpClient's default redirect handling is configured with httpClientBuilder.setRedirectStrategy(new LaxRedirectStrategy()); in the post/redirect/post case, that code does not pass along the data of the original request. This class is therefore modeled on the redirect strategy of the SeimiCrawler project. Original source: https://github.com/zhegexiaohuozi/SeimiCrawler/blob/master/project/src/main/java/cn/wanghaomiao/seimi/http/hc/SeimiRedirectStrategy.java
HttpClientDownloader        The HTTP downloader based on HttpClient.
HttpClientGenerator
HttpClientRequestContext
HttpUriRequestConverter
PhantomJSDownloader         This downloader is used to download pages that need JavaScript rendering.
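
The post/redirect/post issue noted for CustomRedirectStrategy can be illustrated with a small, self-contained sketch. It is a toy model only: SimpleRequest, followDroppingData and followKeepingData are hypothetical names invented for this example and are not part of WebMagic, HttpClient or SeimiCrawler. Modeling the problem case as a body-less GET is also an assumption; the description above only states that the original request data is not passed along.

```java
public class RedirectSketch {

    /** Hypothetical, simplified request model; not a WebMagic or HttpClient type. */
    static final class SimpleRequest {
        final String method;
        final String url;
        final String body; // null means "no request body"

        SimpleRequest(String method, String url, String body) {
            this.method = method;
            this.url = url;
            this.body = body;
        }

        @Override
        public String toString() {
            return method + " " + url + (body == null ? "" : " body=" + body);
        }
    }

    /**
     * Models the problem case: the redirect is followed, but the follow-up
     * request does not carry the original data (assumed here: a body-less GET).
     */
    static SimpleRequest followDroppingData(SimpleRequest original, String location) {
        return new SimpleRequest("GET", location, null);
    }

    /**
     * Models the intended behaviour: a redirected POST stays a POST and keeps
     * its body; any other method is re-issued against the new location without a body.
     */
    static SimpleRequest followKeepingData(SimpleRequest original, String location) {
        if ("POST".equalsIgnoreCase(original.method)) {
            return new SimpleRequest("POST", location, original.body);
        }
        return new SimpleRequest(original.method, location, null);
    }

    public static void main(String[] args) {
        // Example URLs and form data are placeholders.
        SimpleRequest login = new SimpleRequest("POST", "http://example.com/login", "user=a&pw=b");
        String location = "http://example.com/next";

        System.out.println(followDroppingData(login, location));
        System.out.println(followKeepingData(login, location));
    }
}
```

In a real HttpClient setup the equivalent decision would live in a RedirectStrategy implementation; the linked SeimiRedirectStrategy source is the reference for how that is actually done.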