| Package | Description |
|---|---|
| us.codecraft.webmagic | Main class "Spider" and models. |
| us.codecraft.webmagic.handler | |
| us.codecraft.webmagic.pipeline | Pipelines are the persistence and offline-processing part of the crawler. |
| us.codecraft.webmagic.samples.pipeline | |

| Modifier and Type | Field and Description |
|---|---|
| protected List<Pipeline> | Spider.pipelines |

| Modifier and Type | Method and Description |
|---|---|
| Spider | Spider.addPipeline(Pipeline pipeline)<br>Adds a pipeline to the Spider. |
| Spider | Spider.pipeline(Pipeline pipeline)<br>Deprecated. |

| Modifier and Type | Method and Description |
|---|---|
| Spider | Spider.setPipelines(List<Pipeline> pipelines)<br>Sets the pipelines for the Spider. |

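
The registration methods listed above both return `Spider`, which suggests they are meant to be chained. The following is a minimal, self-contained sketch of that pattern. `SketchPipeline` and `SketchSpider` are hypothetical, simplified stand-ins written for this example, not the WebMagic types, and the `emit` method and the in-order dispatch are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a pipeline: receives the fields extracted from one page.
interface SketchPipeline {
    void process(Map<String, Object> resultItems);
}

// Hypothetical stand-in for Spider, showing only pipeline registration.
class SketchSpider {
    protected List<SketchPipeline> pipelines = new ArrayList<>();

    // Returns this so calls can be chained, mirroring the Spider return type in the table.
    SketchSpider addPipeline(SketchPipeline pipeline) {
        pipelines.add(pipeline);
        return this;
    }

    // Replaces the whole pipeline list in one call.
    SketchSpider setPipelines(List<SketchPipeline> pipelines) {
        this.pipelines = new ArrayList<>(pipelines);
        return this;
    }

    // Assumption: results are handed to every pipeline in registration order.
    void emit(Map<String, Object> resultItems) {
        for (SketchPipeline p : pipelines) {
            p.process(resultItems);
        }
    }
}

class PipelineChainDemo {
    static List<String> run() {
        List<String> log = new ArrayList<>();
        Map<String, Object> items = new LinkedHashMap<>();
        items.put("title", "Hello");
        new SketchSpider()
            .addPipeline(r -> log.add("console:" + r.get("title")))
            .addPipeline(r -> log.add("file:" + r.get("title")))
            .emit(items);
        return log;
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

In this sketch, `addPipeline` appends to the existing list while `setPipelines` discards whatever was registered before, which is the usual distinction between an "add" and a "set" method.
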
| Modifier and Type | Class and Description |
|---|---|
| class | CompositePipeline |

| Modifier and Type | Interface and Description |
|---|---|
| interface | CollectorPipeline<T><br>Pipeline that can collect and store results. |

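
To illustrate what "collect and store results" can look like, here is a small sketch of a pipeline that keeps every result in memory and exposes them through a getter. The names `ItemPipeline`, `CollectingPipeline`, and `InMemoryCollector` are hypothetical stand-ins invented for this example. They only approximate the idea behind `CollectorPipeline<T>` and should not be read as its actual definition.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a basic pipeline.
interface ItemPipeline {
    void process(Map<String, Object> resultItems);
}

// Hypothetical stand-in for a collecting pipeline, generic over the collected type.
interface CollectingPipeline<T> extends ItemPipeline {
    List<T> getCollected();
}

// Keeps every processed result in memory so the caller can read them after the crawl.
class InMemoryCollector implements CollectingPipeline<Map<String, Object>> {
    private final List<Map<String, Object>> collected = new ArrayList<>();

    // Synchronized because a crawler may call pipelines from several worker threads.
    @Override
    public synchronized void process(Map<String, Object> resultItems) {
        collected.add(resultItems);
    }

    // Returns a copy so callers cannot mutate the internal list.
    @Override
    public synchronized List<Map<String, Object>> getCollected() {
        return new ArrayList<>(collected);
    }
}

class CollectorDemo {
    // Builds a one-field result map for the demo.
    static Map<String, Object> item(String title) {
        Map<String, Object> m = new HashMap<>();
        m.put("title", title);
        return m;
    }

    static List<Map<String, Object>> run() {
        InMemoryCollector collector = new InMemoryCollector();
        collector.process(item("first"));
        collector.process(item("second"));
        return collector.getCollected();
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

A collector like this is convenient when results are needed in the calling code rather than on disk, for example in tests.
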
| Modifier and Type | Class and Description |
|---|---|
| class | ConsolePipeline<br>Writes results to the console. Typically used for testing. |
| class | FilePipeline<br>Stores results in files. |
| class | JsonFilePipeline<br>Stores results in files in JSON format. |
| class | MultiPagePipeline<br>A pipeline that combines results from multiple pages into one. Used for news stories and articles that span more than one web page. |
| class | ResultItemsCollectorPipeline |

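
The MultiPagePipeline entry above describes merging results that are spread over several pages. One way to picture this is a buffer that holds page fragments until all of them have arrived. The sketch below is purely illustrative: `PageMerger`, its `offer` method, and the idea of passing an article key, page number, and total page count are all assumptions made for this example, not the behavior or API of `MultiPagePipeline`.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper: buffers the parts of a multi-page article until all have arrived.
class PageMerger {
    // articleKey -> (pageNumber -> text); the TreeMap keeps pages sorted by number.
    private final Map<String, TreeMap<Integer, String>> parts = new HashMap<>();

    // Returns the merged text once totalPages parts are present, otherwise null.
    synchronized String offer(String articleKey, int pageNumber, int totalPages, String text) {
        TreeMap<Integer, String> pages =
            parts.computeIfAbsent(articleKey, k -> new TreeMap<>());
        pages.put(pageNumber, text);
        if (pages.size() < totalPages) {
            return null; // still waiting for the remaining pages
        }
        parts.remove(articleKey); // article complete, free the buffer
        return String.join("", pages.values());
    }
}

class PageMergerDemo {
    static String run() {
        PageMerger merger = new PageMerger();
        // Pages may be crawled out of order, so the second page arrives first here.
        merger.offer("news-1", 2, 2, "world");
        return merger.offer("news-1", 1, 2, "hello ");
    }

    public static void main(String[] args) {
        System.out.println(run());
    }
}
```

Sorting by page number before joining matters because a multi-threaded crawler gives no guarantee about the order in which pages finish downloading.
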
| Modifier and Type | Class and Description |
|---|---|
| class | OneFilePipeline |

Copyright © 2017. All rights reserved.