Class RedisScheduler

  • All Implemented Interfaces:
    us.codecraft.webmagic.scheduler.component.DuplicateRemover, us.codecraft.webmagic.scheduler.MonitorableScheduler, us.codecraft.webmagic.scheduler.Scheduler
    Direct Known Subclasses:
    RedisPriorityScheduler

    public class RedisScheduler
    extends us.codecraft.webmagic.scheduler.DuplicateRemovedScheduler
    implements us.codecraft.webmagic.scheduler.MonitorableScheduler, us.codecraft.webmagic.scheduler.component.DuplicateRemover
    Uses Redis as the URL scheduler for distributed crawlers.
    Since:
    0.2.0
    Author:
    code4crafter@gmail.com
    • Field Summary

      Fields 
      Modifier and Type Field Description
      protected redis.clients.jedis.JedisPool pool  
      • Fields inherited from class us.codecraft.webmagic.scheduler.DuplicateRemovedScheduler

        logger
    • Constructor Summary

      Constructors 
      Constructor Description
      RedisScheduler​(java.lang.String host)  
      RedisScheduler​(redis.clients.jedis.JedisPool pool)  
    • Method Summary

      Modifier and Type Method Description
      protected java.lang.String getItemKey​(us.codecraft.webmagic.Task task)  
      int getLeftRequestsCount​(us.codecraft.webmagic.Task task)  
      protected java.lang.String getQueueKey​(us.codecraft.webmagic.Task task)  
      protected java.lang.String getSetKey​(us.codecraft.webmagic.Task task)  
      int getTotalRequestsCount​(us.codecraft.webmagic.Task task)  
      boolean isDuplicate​(us.codecraft.webmagic.Request request, us.codecraft.webmagic.Task task)  
      us.codecraft.webmagic.Request poll​(us.codecraft.webmagic.Task task)  
      protected void pushWhenNoDuplicate​(us.codecraft.webmagic.Request request, us.codecraft.webmagic.Task task)  
      void resetDuplicateCheck​(us.codecraft.webmagic.Task task)  
      • Methods inherited from class us.codecraft.webmagic.scheduler.DuplicateRemovedScheduler

        getDuplicateRemover, noNeedToRemoveDuplicate, push, setDuplicateRemover, shouldReserved
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
      • Methods inherited from interface us.codecraft.webmagic.scheduler.Scheduler

        push
    • Field Detail

      • pool

        protected redis.clients.jedis.JedisPool pool
    • Constructor Detail

      • RedisScheduler

        public RedisScheduler​(java.lang.String host)
      • RedisScheduler

        public RedisScheduler​(redis.clients.jedis.JedisPool pool)
    • Method Detail

      • resetDuplicateCheck

        public void resetDuplicateCheck​(us.codecraft.webmagic.Task task)
        Specified by:
        resetDuplicateCheck in interface us.codecraft.webmagic.scheduler.component.DuplicateRemover
      • isDuplicate

        public boolean isDuplicate​(us.codecraft.webmagic.Request request,
                                   us.codecraft.webmagic.Task task)
        Specified by:
        isDuplicate in interface us.codecraft.webmagic.scheduler.component.DuplicateRemover
      • pushWhenNoDuplicate

        protected void pushWhenNoDuplicate​(us.codecraft.webmagic.Request request,
                                           us.codecraft.webmagic.Task task)
        Overrides:
        pushWhenNoDuplicate in class us.codecraft.webmagic.scheduler.DuplicateRemovedScheduler
      • poll

        public us.codecraft.webmagic.Request poll​(us.codecraft.webmagic.Task task)
        Specified by:
        poll in interface us.codecraft.webmagic.scheduler.Scheduler
      • getSetKey

        protected java.lang.String getSetKey​(us.codecraft.webmagic.Task task)
      • getQueueKey

        protected java.lang.String getQueueKey​(us.codecraft.webmagic.Task task)
      • getItemKey

        protected java.lang.String getItemKey​(us.codecraft.webmagic.Task task)
      • getLeftRequestsCount

        public int getLeftRequestsCount​(us.codecraft.webmagic.Task task)
        Specified by:
        getLeftRequestsCount in interface us.codecraft.webmagic.scheduler.MonitorableScheduler
      • getTotalRequestsCount

        public int getTotalRequestsCount​(us.codecraft.webmagic.Task task)
        Specified by:
        getTotalRequestsCount in interface us.codecraft.webmagic.scheduler.component.DuplicateRemover
        Specified by:
        getTotalRequestsCount in interface us.codecraft.webmagic.scheduler.MonitorableScheduler
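
The methods above suggest a two-structure design: a set that remembers every URL already seen, and a queue of URLs still waiting to be fetched. The following is a minimal in-memory sketch of that contract, not the WebMagic implementation. It assumes plain String values in place of Task and Request, uses java.util collections where the real class would talk to Redis through the JedisPool, and the class name InMemoryDedupScheduler is hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in for a Redis-backed scheduler.
// Assumption: one "seen" set and one pending queue per task.
final class InMemoryDedupScheduler {
    // Every URL ever offered for a task (the role getSetKey appears to name).
    private final Map<String, Set<String>> seen = new HashMap<>();
    // URLs still waiting to be polled (the role getQueueKey appears to name).
    private final Map<String, Deque<String>> queues = new HashMap<>();

    // Records the URL and reports whether it had already been seen.
    boolean isDuplicate(String taskId, String url) {
        return !seen.computeIfAbsent(taskId, k -> new HashSet<>()).add(url);
    }

    // Queues the URL only if it is new; returns true when it was queued.
    boolean push(String taskId, String url) {
        if (isDuplicate(taskId, url)) {
            return false;
        }
        queues.computeIfAbsent(taskId, k -> new ArrayDeque<>()).addLast(url);
        return true;
    }

    // Returns the oldest pending URL, or null when nothing is left.
    String poll(String taskId) {
        Deque<String> queue = queues.get(taskId);
        return queue == null ? null : queue.pollFirst();
    }

    // Number of URLs queued but not yet polled.
    int getLeftRequestsCount(String taskId) {
        Deque<String> queue = queues.get(taskId);
        return queue == null ? 0 : queue.size();
    }

    // Number of distinct URLs seen so far, polled or not.
    int getTotalRequestsCount(String taskId) {
        Set<String> urls = seen.get(taskId);
        return urls == null ? 0 : urls.size();
    }

    // Forgets the seen URLs so they can be scheduled again.
    void resetDuplicateCheck(String taskId) {
        seen.remove(taskId);
    }

    public static void main(String[] args) {
        InMemoryDedupScheduler scheduler = new InMemoryDedupScheduler();
        System.out.println(scheduler.push("task", "http://example.com/a")); // true
        System.out.println(scheduler.push("task", "http://example.com/a")); // false
        System.out.println(scheduler.poll("task")); // http://example.com/a
        System.out.println(scheduler.getLeftRequestsCount("task"));  // 0
        System.out.println(scheduler.getTotalRequestsCount("task")); // 1
    }
}
```

In this sketch the total count keeps growing while the left count shrinks as URLs are polled, which matches the distinction the two MonitorableScheduler counters draw. Keeping both structures in a shared store rather than in process memory is what would let several crawler instances cooperate on one task.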