Package us.codecraft.webmagic.scheduler
Class FileCacheQueueScheduler
java.lang.Object
us.codecraft.webmagic.scheduler.DuplicateRemovedScheduler
us.codecraft.webmagic.scheduler.FileCacheQueueScheduler
- All Implemented Interfaces:
Closeable, AutoCloseable, us.codecraft.webmagic.scheduler.MonitorableScheduler, us.codecraft.webmagic.scheduler.Scheduler
public class FileCacheQueueScheduler
extends us.codecraft.webmagic.scheduler.DuplicateRemovedScheduler
implements us.codecraft.webmagic.scheduler.MonitorableScheduler, Closeable
Stores URLs and a cursor in files so that a Spider can resume from its previous state after a shutdown.
- Since:
- 0.2.0
- Author:
- code4crafter@gmail.com
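The resume behavior described above (URLs and a cursor kept in files) can be sketched roughly as follows. This is a stand-alone illustration using only the JDK, not WebMagic's actual implementation. The class name `FileBackedQueueSketch`, the file names, and the `push`/`poll`/`leftCount`/`totalCount` methods are all assumptions made for the example.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayDeque;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

/** Hypothetical sketch of a file-backed, resumable URL queue (not WebMagic code). */
class FileBackedQueueSketch {
    private final Path urlFile;     // append-only log, one URL per line
    private final Path cursorFile;  // number of URLs already handed out
    private final Set<String> seen = new HashSet<>();
    private final Queue<String> pending = new ArrayDeque<>();
    private int cursor;

    FileBackedQueueSketch(Path dir, String taskId) throws IOException {
        Files.createDirectories(dir);
        urlFile = dir.resolve(taskId + ".urls.txt");       // assumed file name
        cursorFile = dir.resolve(taskId + ".cursor.txt");  // assumed file name
        if (Files.exists(cursorFile)) {
            String text = new String(Files.readAllBytes(cursorFile), StandardCharsets.UTF_8);
            cursor = Integer.parseInt(text.trim());
        }
        if (Files.exists(urlFile)) {
            // Replay the log: every URL counts as seen, but only the ones
            // at or past the cursor are still waiting to be polled.
            List<String> lines = Files.readAllLines(urlFile, StandardCharsets.UTF_8);
            for (int i = 0; i < lines.size(); i++) {
                seen.add(lines.get(i));
                if (i >= cursor) {
                    pending.add(lines.get(i));
                }
            }
        }
    }

    /** Appends the URL to the log and the queue unless it was seen before. */
    synchronized void push(String url) throws IOException {
        if (!seen.add(url)) {
            return; // duplicate, nothing to store
        }
        Files.write(urlFile, Collections.singletonList(url), StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        pending.add(url);
    }

    /** Returns the next URL, or null if none is pending, and persists the cursor. */
    synchronized String poll() throws IOException {
        String url = pending.poll();
        if (url == null) {
            return null;
        }
        cursor++;
        Files.write(cursorFile, String.valueOf(cursor).getBytes(StandardCharsets.UTF_8));
        return url;
    }

    synchronized int leftCount() { return pending.size(); }

    synchronized int totalCount() { return seen.size(); }
}
```

Constructing a second instance over the same directory and task id picks up where the first one stopped: already-polled URLs are skipped, and previously seen URLs are still rejected as duplicates. The sketch rewrites the cursor file on every poll for simplicity; a real scheduler would likely batch or flush periodically.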
-
Field Summary
Fields inherited from class us.codecraft.webmagic.scheduler.DuplicateRemovedScheduler
logger
-
Constructor Summary
Constructors -
Method Summary
Modifier and Type                        Method
void                                     close()
protected us.codecraft.webmagic.Request  deserializeRequest(String line)
int                                      getLeftRequestsCount(us.codecraft.webmagic.Task task)
int                                      getTotalRequestsCount(us.codecraft.webmagic.Task task)
us.codecraft.webmagic.Request            poll(us.codecraft.webmagic.Task task)
protected void                           pushWhenNoDuplicate(us.codecraft.webmagic.Request request, us.codecraft.webmagic.Task task)
protected String                         serializeRequest(us.codecraft.webmagic.Request request)
Methods inherited from class us.codecraft.webmagic.scheduler.DuplicateRemovedScheduler
getDuplicateRemover, noNeedToRemoveDuplicate, push, setDuplicateRemover, shouldReserved
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface us.codecraft.webmagic.scheduler.Scheduler
push
-
Constructor Details
-
FileCacheQueueScheduler
-
-
Method Details
-
close
- Specified by:
close in interface AutoCloseable
- Specified by:
close in interface Closeable
- Throws:
IOException
-
pushWhenNoDuplicate
protected void pushWhenNoDuplicate(us.codecraft.webmagic.Request request, us.codecraft.webmagic.Task task)
- Overrides:
pushWhenNoDuplicate in class us.codecraft.webmagic.scheduler.DuplicateRemovedScheduler
-
poll
public us.codecraft.webmagic.Request poll(us.codecraft.webmagic.Task task)
- Specified by:
poll in interface us.codecraft.webmagic.scheduler.Scheduler
-
getLeftRequestsCount
public int getLeftRequestsCount(us.codecraft.webmagic.Task task)
- Specified by:
getLeftRequestsCount in interface us.codecraft.webmagic.scheduler.MonitorableScheduler
-
getTotalRequestsCount
public int getTotalRequestsCount(us.codecraft.webmagic.Task task)
- Specified by:
getTotalRequestsCount in interface us.codecraft.webmagic.scheduler.MonitorableScheduler
-
serializeRequest
-
deserializeRequest
-
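The signatures in the method summary (`serializeRequest` takes a Request and returns a String; `deserializeRequest` takes a String `line` and returns a Request) suggest that each request is stored as one line of text. The following is a minimal sketch of such a line-based round trip, written against hypothetical stand-in types rather than WebMagic's own classes. `SketchRequest`, `RequestLineCodec`, and the tab-separated layout are assumptions for illustration only and say nothing about the format the library actually writes.

```java
/** Hypothetical stand-in for a crawl request (not us.codecraft.webmagic.Request). */
final class SketchRequest {
    final String url;
    final long priority;

    SketchRequest(String url, long priority) {
        this.url = url;
        this.priority = priority;
    }
}

/** Hypothetical one-line codec: "<url>\t<priority>". */
final class RequestLineCodec {
    private RequestLineCodec() { }

    /** Encodes a request as a single line; rejects URLs that would break the format. */
    static String serialize(SketchRequest request) {
        if (request.url.indexOf('\t') >= 0 || request.url.indexOf('\n') >= 0) {
            throw new IllegalArgumentException("URL must not contain tabs or newlines");
        }
        return request.url + '\t' + request.priority;
    }

    /** Decodes a line; a line without a tab is treated as a bare URL with priority 0. */
    static SketchRequest deserialize(String line) {
        int tab = line.lastIndexOf('\t');
        if (tab < 0) {
            return new SketchRequest(line, 0L);
        }
        String url = line.substring(0, tab);
        long priority = Long.parseLong(line.substring(tab + 1));
        return new SketchRequest(url, priority);
    }
}
```

Accepting a bare URL on the read path keeps the sketch compatible with files that store only URLs, which is one plausible reason for making both hooks overridable in a subclass.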