Interface Downloader

  • All Known Implementing Classes:
    AbstractDownloader, HttpClientDownloader, PhantomJSDownloader, SeleniumDownloader

    public interface Downloader
    Downloader is the component that downloads web pages and stores them in Page objects.
    Downloader has a setThread(int) method because the downloader is usually the bottleneck of a crawler. Downloaders typically rely on mechanisms such as pooling, and the pool size is related to the number of threads.
    Since:
    0.1.0
    Author:
    code4crafter@gmail.com
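To make the contract concrete, here is a minimal sketch of an implementation. It is not taken from the library: Request, Task, and Page below are simplified stand-ins defined only for this example, the Downloader interface is redeclared with the two methods documented on this page, and InMemoryDownloader with its register and getThreadNum helpers is hypothetical. The sketch serves canned content instead of using the network.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified stand-ins for the library's Request, Task, and Page types (assumptions).
class Request {
    private final String url;
    Request(String url) { this.url = url; }
    String getUrl() { return url; }
}

interface Task {
    String getUUID();
}

class Page {
    private final Request request;
    private final String rawText;
    Page(Request request, String rawText) {
        this.request = request;
        this.rawText = rawText;
    }
    Request getRequest() { return request; }
    String getRawText() { return rawText; }
}

// Redeclared here with the two methods documented on this page.
interface Downloader {
    Page download(Request request, Task task);
    void setThread(int threadNum);
}

// Hypothetical downloader that returns pre-registered content for a URL.
class InMemoryDownloader implements Downloader {
    private final Map<String, String> contentByUrl = new HashMap<>();
    private int threadNum = 1;

    void register(String url, String body) {
        contentByUrl.put(url, body);
    }

    @Override
    public Page download(Request request, Task task) {
        // Unknown URLs yield an empty body rather than an error in this sketch.
        String body = contentByUrl.getOrDefault(request.getUrl(), "");
        return new Page(request, body);
    }

    @Override
    public void setThread(int threadNum) {
        // A real downloader would size its pool here; this sketch only records the value.
        this.threadNum = threadNum;
    }

    int getThreadNum() { return threadNum; }
}

class DownloaderSketch {
    public static void main(String[] args) {
        InMemoryDownloader downloader = new InMemoryDownloader();
        downloader.register("https://example.com/", "<html>hello</html>");
        downloader.setThread(4);

        Task task = () -> "example-task";
        Page page = downloader.download(new Request("https://example.com/"), task);
        System.out.println(page.getRawText());
    }
}
```

An in-memory implementation like this can be handy in unit tests, since the code that consumes Page objects can be exercised without network access.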
    • Method Detail

      • download

        Page download(Request request,
                      Task task)
        Downloads a web page and stores it in a Page object.
        Parameters:
        request - the request to download
        task - the task that issued the request
        Returns:
        the downloaded page
      • setThread

        void setThread(int threadNum)
        Tells the downloader how many threads the spider uses.
        Parameters:
        threadNum - number of threads
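
The relationship between thread count and pool size can be sketched as follows. This is an illustration under stated assumptions, not the behavior of any implementing class: PooledDownloaderSketch, its Connection placeholder, and the borrow, release, and availableConnections helpers are all hypothetical. The sketch keeps one pooled connection per spider thread, so no thread has to wait for a connection and none sit idle.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

// Hypothetical downloader that keeps one pooled connection per spider thread.
class PooledDownloaderSketch {
    // Placeholder for a real connection or HTTP client; not a library type.
    static final class Connection {
        final int id;
        Connection(int id) { this.id = id; }
    }

    private volatile BlockingQueue<Connection> pool = newPool(1);

    private static BlockingQueue<Connection> newPool(int size) {
        BlockingQueue<Connection> queue = new ArrayBlockingQueue<>(size);
        for (int i = 0; i < size; i++) {
            queue.add(new Connection(i));
        }
        return queue;
    }

    // Mirrors setThread(int): rebuild the pool to match the spider's thread count.
    // Intended to be called once, before any download starts.
    public void setThread(int threadNum) {
        if (threadNum < 1) {
            throw new IllegalArgumentException("threadNum must be at least 1");
        }
        pool = newPool(threadNum);
    }

    // Returns a free connection, or null if all of them are in use.
    public Connection borrow() {
        return pool.poll();
    }

    // Returns a connection to the pool; false if the pool is already full.
    public boolean release(Connection connection) {
        return pool.offer(connection);
    }

    // Number of connections currently free.
    public int availableConnections() {
        return pool.size();
    }

    public static void main(String[] args) {
        PooledDownloaderSketch downloader = new PooledDownloaderSketch();
        downloader.setThread(4);

        Connection connection = downloader.borrow();
        System.out.println("free while borrowed: " + downloader.availableConnections());
        downloader.release(connection);
        System.out.println("free after release: " + downloader.availableConnections());
    }
}
```

Sizing the pool from threadNum is a design choice in this sketch. A pool smaller than the thread count makes threads queue for connections, while a larger one holds resources that are never used.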