Package us.codecraft.webmagic.scheduler
Class FileCacheQueueScheduler
java.lang.Object
  us.codecraft.webmagic.scheduler.DuplicateRemovedScheduler
    us.codecraft.webmagic.scheduler.FileCacheQueueScheduler
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable, MonitorableScheduler, Scheduler
public class FileCacheQueueScheduler extends DuplicateRemovedScheduler implements MonitorableScheduler, java.io.Closeable
Stores URLs and a cursor in files so that a Spider can resume from its previous state after a shutdown.
- Since: 0.2.0
- Author: code4crafter@gmail.com
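The class description above (URLs and a cursor kept in files so a crawl can resume) can be illustrated with a small, self-contained sketch. This is not the WebMagic implementation: the class name FileBackedUrlQueue, the ".urls.txt" and ".cursor.txt" file names, and the choice to store plain URL strings rather than Request objects are all assumptions made for illustration.

```java
import java.io.BufferedWriter;
import java.io.Closeable;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.ArrayDeque;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

/** Hypothetical stand-in that mimics the idea only; not WebMagic code. */
class FileBackedUrlQueue implements Closeable {
    private final Path cursorFile;
    private final BufferedWriter urlWriter;
    private final Queue<String> pending = new ArrayDeque<>();
    private final Set<String> seen = new HashSet<>();
    private int cursor; // number of URLs already handed out by poll()

    FileBackedUrlQueue(Path dir, String name) throws IOException {
        Files.createDirectories(dir);
        Path urlFile = dir.resolve(name + ".urls.txt");       // assumed file name
        cursorFile = dir.resolve(name + ".cursor.txt");       // assumed file name

        // Restore the cursor written by a previous run, if any.
        if (Files.exists(cursorFile)) {
            String saved = new String(Files.readAllBytes(cursorFile), StandardCharsets.UTF_8).trim();
            cursor = saved.isEmpty() ? 0 : Integer.parseInt(saved);
        }
        // Reload every stored URL: all of them count as seen, but only
        // those at or beyond the cursor still need to be crawled.
        if (Files.exists(urlFile)) {
            List<String> lines = Files.readAllLines(urlFile, StandardCharsets.UTF_8);
            for (int i = 0; i < lines.size(); i++) {
                seen.add(lines.get(i));
                if (i >= cursor) {
                    pending.add(lines.get(i));
                }
            }
        }
        urlWriter = Files.newBufferedWriter(urlFile, StandardCharsets.UTF_8,
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    /** Appends a URL to the file and the in-memory queue unless it was seen before. */
    synchronized void push(String url) throws IOException {
        if (!seen.add(url)) {
            return; // duplicate, ignore
        }
        urlWriter.write(url);
        urlWriter.newLine();
        urlWriter.flush();
        pending.add(url);
    }

    /** Returns the next URL, or null when the queue is empty, and persists the cursor. */
    synchronized String poll() throws IOException {
        String url = pending.poll();
        if (url != null) {
            cursor++;
            Files.write(cursorFile, Integer.toString(cursor).getBytes(StandardCharsets.UTF_8));
        }
        return url;
    }

    synchronized int left() { return pending.size(); }

    synchronized int total() { return seen.size(); }

    @Override
    public synchronized void close() throws IOException {
        urlWriter.close();
    }
}
```

In this sketch, a second instance opened on the same directory and name skips the URLs the first instance already handed out, which is the resume behavior the description refers to. The left() and total() helpers loosely correspond to what getLeftRequestsCount and getTotalRequestsCount report, though the real class may count differently.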
Field Summary
-
Fields inherited from class us.codecraft.webmagic.scheduler.DuplicateRemovedScheduler
logger
-
-
Constructor Summary
Constructors
Constructor                                          Description
FileCacheQueueScheduler(java.lang.String filePath)
-
Method Summary
All Methods  Instance Methods  Concrete Methods
Modifier and Type            Method                                            Description
void                         close()
protected Request            deserializeRequest(java.lang.String line)
int                          getLeftRequestsCount(Task task)
int                          getTotalRequestsCount(Task task)
Request                      poll(Task task)                                   Get a URL to crawl.
protected void               pushWhenNoDuplicate(Request request, Task task)
protected java.lang.String   serializeRequest(Request request)
-
Methods inherited from class us.codecraft.webmagic.scheduler.DuplicateRemovedScheduler
getDuplicateRemover, noNeedToRemoveDuplicate, push, setDuplicateRemover, shouldReserved
Method Detail
-
close
public void close() throws java.io.IOException
- Specified by: close in interface java.lang.AutoCloseable
- Specified by: close in interface java.io.Closeable
- Throws: java.io.IOException
-
pushWhenNoDuplicate
protected void pushWhenNoDuplicate(Request request, Task task)
- Overrides: pushWhenNoDuplicate in class DuplicateRemovedScheduler
-
poll
public Request poll(Task task)
Description copied from interface: Scheduler
Get a URL to crawl.
-
getLeftRequestsCount
public int getLeftRequestsCount(Task task)
- Specified by: getLeftRequestsCount in interface MonitorableScheduler
-
getTotalRequestsCount
public int getTotalRequestsCount(Task task)
- Specified by: getTotalRequestsCount in interface MonitorableScheduler
-
serializeRequest
protected java.lang.String serializeRequest(Request request)
-
deserializeRequest
protected Request deserializeRequest(java.lang.String line)