Package | Description |
---|---|
us.codecraft.webmagic | Main class "Spider" and models. |
us.codecraft.webmagic.handler | |
us.codecraft.webmagic.pipeline | Pipeline is the persistence and offline-processing part of the crawler. |
us.codecraft.webmagic.samples.pipeline | |

Modifier and Type | Method and Description |
---|---|
ResultItems | Page.getResultItems() |
<T> ResultItems | ResultItems.put(String key, T value) |
ResultItems | ResultItems.setRequest(Request request) |
ResultItems | ResultItems.setSkip(boolean skip) Set whether to skip the result. A skipped result will not be processed by the Pipeline. |
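
The fluent `put` and `setSkip` methods listed above both return `ResultItems`, so calls can be chained, and a skipped result is never handed to a Pipeline. The sketch below illustrates that behaviour with hypothetical stand-in types (`SkipSketch.Items`, `dispatch`); it does not use the WebMagic classes themselves, and the `isSkip()` and `getAll()` accessors as well as the dispatch rule are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Consumer;

// Hypothetical stand-ins, NOT the WebMagic classes: they only mirror the
// fluent put/setSkip shape listed in the table above.
final class SkipSketch {

    static final class Items {
        private final Map<String, Object> fields = new LinkedHashMap<>();
        private boolean skip;

        // Returning "this" is what makes put(...).put(...).setSkip(...) possible.
        <T> Items put(String key, T value) {
            fields.put(key, value);
            return this;
        }

        Items setSkip(boolean skip) {
            this.skip = skip;
            return this;
        }

        // Assumed accessors, added so the sketch can inspect its own state.
        boolean isSkip() {
            return skip;
        }

        Map<String, Object> getAll() {
            return fields;
        }
    }

    // Assumed dispatch rule: pass the items to the pipeline unless they are skipped.
    // Returns true when the pipeline was invoked.
    static boolean dispatch(Items items, Consumer<Items> pipeline) {
        if (items.isSkip()) {
            return false;
        }
        pipeline.accept(items);
        return true;
    }

    public static void main(String[] args) {
        List<Object> seen = new ArrayList<>();
        Consumer<Items> pipeline = items -> seen.add(items.getAll().get("title"));

        Items kept = new Items().put("title", "Hello").put("length", 5);
        Items skipped = new Items().put("title", "Ignored").setSkip(true);

        dispatch(kept, pipeline);
        dispatch(skipped, pipeline);

        // Only the non-skipped result reached the pipeline.
        System.out.println(seen);
    }
}
```

Marking a result as skipped is a convenient way for a page processor to say "this page produced nothing worth persisting" without every Pipeline having to check for empty data.
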

Modifier and Type | Method and Description |
---|---|
void | CompositePipeline.process(ResultItems resultItems, Task task) |
RequestMatcher.MatchOther | SubPipeline.processResult(ResultItems resultItems, Task task) Process the page, extract URLs to fetch, extract the data and store it. |

Modifier and Type | Method and Description |
---|---|
List<ResultItems> | ResultItemsCollectorPipeline.getCollected() |

Modifier and Type | Method and Description |
---|---|
void | MultiPagePipeline.process(ResultItems resultItems, Task task) |
void | JsonFilePipeline.process(ResultItems resultItems, Task task) |
void | ResultItemsCollectorPipeline.process(ResultItems resultItems, Task task) |
void | Pipeline.process(ResultItems resultItems, Task task) Process extracted results. |
void | FilePipeline.process(ResultItems resultItems, Task task) |
void | ConsolePipeline.process(ResultItems resultItems, Task task) |
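
Every implementation above shares the same `process(ResultItems resultItems, Task task)` entry point, and the collector variant additionally exposes `getCollected()`. As a rough sketch of that contract, the example below defines a hypothetical `Pipe` interface and a `CollectingPipe` that keeps results in memory. The types are stand-ins rather than the WebMagic classes: a `Map` plays the role of `ResultItems`, a plain `String` task id plays the role of `Task`, and the synchronized list is an assumption made for this sketch, not a statement about how the library stores results.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-ins, NOT the WebMagic classes.
final class PipelineSketch {

    // Stand-in for the Pipeline contract: one callback per extracted result.
    interface Pipe {
        void process(Map<String, Object> resultItems, String taskId);
    }

    // Keeps every result in memory, in the spirit of a collector pipeline.
    static final class CollectingPipe implements Pipe {
        private final List<Map<String, Object>> collected =
                Collections.synchronizedList(new ArrayList<>());

        @Override
        public void process(Map<String, Object> resultItems, String taskId) {
            collected.add(resultItems);
        }

        // Returns a snapshot so callers cannot mutate the internal list.
        List<Map<String, Object>> getCollected() {
            synchronized (collected) {
                return new ArrayList<>(collected);
            }
        }
    }

    // Fans one result out to several pipelines, in order.
    static void runAll(List<Pipe> pipes, Map<String, Object> resultItems, String taskId) {
        for (Pipe pipe : pipes) {
            pipe.process(resultItems, taskId);
        }
    }

    public static void main(String[] args) {
        CollectingPipe collector = new CollectingPipe();
        // A console-style pipeline written as a lambda.
        Pipe console = (items, taskId) -> System.out.println(taskId + " -> " + items);

        Map<String, Object> result = new LinkedHashMap<>();
        result.put("title", "Hello");

        runAll(Arrays.asList(console, collector), result, "demo-task");
        System.out.println("collected: " + collector.getCollected().size());
    }
}
```

Keeping the contract to a single method means a new output target (file, console, database) only needs one small class, and several of them can be run side by side on the same result.
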

Modifier and Type | Method and Description |
---|---|
void | OneFilePipeline.process(ResultItems resultItems, Task task) |

Copyright © 2017. All rights reserved.