Package us.codecraft.webmagic.scheduler
Class RedisScheduler
java.lang.Object
  us.codecraft.webmagic.scheduler.DuplicateRemovedScheduler
    us.codecraft.webmagic.scheduler.RedisScheduler
- All Implemented Interfaces: DuplicateRemover, MonitorableScheduler, Scheduler
- Direct Known Subclasses: RedisPriorityScheduler
public class RedisScheduler extends DuplicateRemovedScheduler implements MonitorableScheduler, DuplicateRemover
Uses Redis as the URL scheduler for distributed crawlers.
- Since: 0.2.0
- Author: code4crafter@gmail.com
-
Field Summary
Modifier and Type                        Field
protected redis.clients.jedis.JedisPool  pool
-
Fields inherited from class us.codecraft.webmagic.scheduler.DuplicateRemovedScheduler
logger
-
-
Constructor Summary
Constructor
RedisScheduler(java.lang.String host)
RedisScheduler(redis.clients.jedis.JedisPool pool)
-
Method Summary
All Methods / Instance Methods / Concrete Methods
Modifier and Type           Method                                           Description
protected java.lang.String  getItemKey(Task task)
int                         getLeftRequestsCount(Task task)
protected java.lang.String  getQueueKey(Task task)
protected java.lang.String  getSetKey(Task task)
int                         getTotalRequestsCount(Task task)                 Get TotalRequestsCount for monitor.
boolean                     isDuplicate(Request request, Task task)          Check whether the request is a duplicate.
Request                     poll(Task task)                                  Get a URL to crawl.
protected void              pushWhenNoDuplicate(Request request, Task task)
void                        resetDuplicateCheck(Task task)                   Reset duplicate check.
Methods inherited from class us.codecraft.webmagic.scheduler.DuplicateRemovedScheduler
getDuplicateRemover, noNeedToRemoveDuplicate, push, setDuplicateRemover, shouldReserved
-
Method Detail
-
resetDuplicateCheck
public void resetDuplicateCheck(Task task)
Description copied from interface: DuplicateRemover
Reset duplicate check.
- Specified by: resetDuplicateCheck in interface DuplicateRemover
- Parameters:
  task - task
-
isDuplicate
public boolean isDuplicate(Request request, Task task)
Description copied from interface: DuplicateRemover
Check whether the request is a duplicate.
- Specified by: isDuplicate in interface DuplicateRemover
- Parameters:
  request - request
  task - task
- Returns: true if the request is a duplicate
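The duplicate check and its reset can be pictured with a small in-memory model. This is only a sketch, not the WebMagic implementation: `DuplicateCheckSketch`, the string task identifier, and the per-task `Set` are hypothetical stand-ins for whatever Redis structure the scheduler keeps. It also assumes that checking a URL records it, so a second check of the same URL reports a duplicate.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory stand-in for a per-task set of seen URLs.
// This is not WebMagic code and does not talk to Redis.
class DuplicateCheckSketch {
    // Seen URLs, grouped by a task identifier (assumed unique per crawl).
    private final Map<String, Set<String>> seenByTask = new ConcurrentHashMap<>();

    // Returns true when the URL was already recorded for the task.
    // Assumption: the check also records the URL, so a second call
    // with the same arguments reports a duplicate.
    boolean isDuplicate(String url, String taskId) {
        Set<String> seen =
                seenByTask.computeIfAbsent(taskId, id -> ConcurrentHashMap.newKeySet());
        return !seen.add(url);
    }

    // Forgets every URL recorded for the task, so all of them count as new again.
    void resetDuplicateCheck(String taskId) {
        seenByTask.remove(taskId);
    }

    public static void main(String[] args) {
        DuplicateCheckSketch remover = new DuplicateCheckSketch();
        System.out.println(remover.isDuplicate("https://example.com/a", "task-1")); // false
        System.out.println(remover.isDuplicate("https://example.com/a", "task-1")); // true
        remover.resetDuplicateCheck("task-1");
        System.out.println(remover.isDuplicate("https://example.com/a", "task-1")); // false
    }
}
```

Under these assumptions a reset affects only the named task. URLs recorded for other tasks stay marked as seen.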
-
pushWhenNoDuplicate
protected void pushWhenNoDuplicate(Request request, Task task)
- Overrides: pushWhenNoDuplicate in class DuplicateRemovedScheduler
-
poll
public Request poll(Task task)
Description copied from interface: Scheduler
Get a URL to crawl.
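How pushWhenNoDuplicate and poll fit together can be sketched with a single-task, in-memory queue. Everything here is illustrative: `QueueSketch` is a hypothetical class, URLs stand in for `Request` objects, and first-in, first-out ordering and a null return on an empty queue are assumptions rather than behavior stated on this page.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.Set;

// Hypothetical single-task model of a duplicate-removing URL queue.
// Not WebMagic code; a Redis-backed scheduler would keep this state remotely.
class QueueSketch {
    private final Set<String> seen = new HashSet<>();
    private final Deque<String> queue = new ArrayDeque<>();

    // Enqueues the URL only the first time it is offered.
    synchronized void push(String url) {
        if (seen.add(url)) {
            queue.addLast(url);
        }
    }

    // Returns the oldest pending URL, or null when nothing is left.
    synchronized String poll() {
        return queue.pollFirst();
    }

    public static void main(String[] args) {
        QueueSketch scheduler = new QueueSketch();
        scheduler.push("https://example.com/1");
        scheduler.push("https://example.com/2");
        scheduler.push("https://example.com/1"); // already seen, ignored
        System.out.println(scheduler.poll()); // https://example.com/1
        System.out.println(scheduler.poll()); // https://example.com/2
        System.out.println(scheduler.poll()); // null
    }
}
```

In a distributed setup the same idea applies with the set and queue held in Redis, so several crawler processes can share one frontier.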
-
getSetKey
protected java.lang.String getSetKey(Task task)
-
getQueueKey
protected java.lang.String getQueueKey(Task task)
-
getItemKey
protected java.lang.String getItemKey(Task task)
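The three protected key methods each return a string for a task, presumably the Redis key under which that task's set, queue, and item data live. Since they are protected, a subclass could override them to change the naming. The sketch below only illustrates one plausible scheme, a fixed prefix plus a task identifier. The class `RedisKeySketch`, the prefixes, and the `taskUuid` parameter are assumptions for illustration and are not taken from this page.

```java
// Hypothetical key scheme: prefix + task identifier.
// The prefixes are assumptions, not values documented on this page.
class RedisKeySketch {
    static final String SET_PREFIX = "set_";
    static final String QUEUE_PREFIX = "queue_";
    static final String ITEM_PREFIX = "item_";

    // Key of the structure that remembers which URLs were already seen.
    static String setKey(String taskUuid) {
        return SET_PREFIX + taskUuid;
    }

    // Key of the structure that holds URLs still waiting to be crawled.
    static String queueKey(String taskUuid) {
        return QUEUE_PREFIX + taskUuid;
    }

    // Key of the structure that holds extra per-request data.
    static String itemKey(String taskUuid) {
        return ITEM_PREFIX + taskUuid;
    }

    public static void main(String[] args) {
        System.out.println(setKey("example.com"));   // set_example.com
        System.out.println(queueKey("example.com")); // queue_example.com
        System.out.println(itemKey("example.com"));  // item_example.com
    }
}
```

Deriving every key from the task identifier keeps crawls with different identifiers from sharing state on the same Redis instance.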
-
getLeftRequestsCount
public int getLeftRequestsCount(Task task)
- Specified by: getLeftRequestsCount in interface MonitorableScheduler
-
getTotalRequestsCount
public int getTotalRequestsCount(Task task)
Description copied from interface: DuplicateRemover
Get TotalRequestsCount for monitor.
- Specified by: getTotalRequestsCount in interface DuplicateRemover
- Specified by: getTotalRequestsCount in interface MonitorableScheduler
- Parameters:
  task - task
- Returns: number of total requests
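A monitor could combine the two counters to estimate progress. The helper below is a hypothetical sketch, not part of WebMagic. It assumes the total counts every distinct request accepted so far and the left count covers requests still waiting, so their difference approximates requests already handed out. The clamp to zero guards against the total dropping below the left count, for example after a duplicate-check reset.

```java
// Hypothetical helper that derives progress figures from the two counters.
class CrawlProgressSketch {
    // Requests already handed out: total accepted minus those still queued.
    // Clamped at zero in case the total was reset while requests were pending.
    static int processed(int total, int left) {
        return Math.max(0, total - left);
    }

    // Share of accepted requests already handed out, as a percentage.
    static double percentDone(int total, int left) {
        if (total <= 0) {
            return 0.0;
        }
        return 100.0 * processed(total, left) / total;
    }

    public static void main(String[] args) {
        System.out.println(processed(10, 4));   // 6
        System.out.println(percentDone(10, 4)); // 60.0
    }
}
```

With a real scheduler the two arguments would come from getTotalRequestsCount(task) and getLeftRequestsCount(task).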
-
-