| Package | Description |
|---|---|
| us.codecraft.webmagic.model | Page model and annotations used to customize a crawler. |
| us.codecraft.webmagic.pipeline | Pipeline is the persistence and offline processing part of the crawler. |

| Modifier and Type | Class and Description |
|---|---|
| class | ConsolePageModelPipeline: Prints the page model to the console. Usually used in tests. |

| Modifier and Type | Method and Description |
|---|---|
| OOSpider | OOSpider.addPageModel(PageModelPipeline pageModelPipeline, Class... pageModels) |
| static OOSpider | OOSpider.create(Site site, PageModelPipeline pageModelPipeline, Class... pageModels) |

| Constructor and Description |
|---|
| OOSpider(Site site, PageModelPipeline pageModelPipeline, Class... pageModels): Creates a spider. |

| Modifier and Type | Class and Description |
|---|---|
| class | CollectorPageModelPipeline<T> |
| class | FilePageModelPipeline: Stores result objects (page models) to files in plain format. Uses model.getKey() as the file name if the model implements HasKey; otherwise uses a SHA1 hash as the file name. |
| class | JsonFilePageModelPipeline: Stores result objects (page models) to files in JSON format. Uses model.getKey() as the file name if the model implements HasKey; otherwise uses a SHA1 hash as the file name. |
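
The file-naming rule described for FilePageModelPipeline and JsonFilePageModelPipeline can be illustrated with a small standalone sketch. It does not use the library: the nested `HasKey` interface is a hypothetical stand-in that mirrors the `getKey()` wording above, the `Repo` model and the `FileNameSketch` class are invented for illustration, and hashing the model's `toString()` output is an assumption, since this page does not say which bytes are hashed.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

public class FileNameSketch {

    /** Hypothetical stand-in for the library's HasKey interface. */
    public interface HasKey {
        String getKey();
    }

    /** Invented example model that supplies its own key. */
    public static class Repo implements HasKey {
        private final String name;

        public Repo(String name) {
            this.name = name;
        }

        @Override
        public String getKey() {
            return name;
        }
    }

    /** Returns the lowercase hex SHA-1 digest of the UTF-8 bytes of the text. */
    public static String sha1Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-1");
            byte[] hash = digest.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder(hash.length * 2);
            for (byte b : hash) {
                // Mask to an unsigned value so each byte yields exactly two hex digits.
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            // SHA-1 is a required algorithm on every Java platform.
            throw new IllegalStateException("SHA-1 not available", e);
        }
    }

    /** Chooses a file name: the model's key if it has one, otherwise a SHA-1 hash. */
    public static String fileNameFor(Object model) {
        if (model instanceof HasKey) {
            return ((HasKey) model).getKey();
        }
        // Assumption: the hash input is the model's string form.
        return sha1Hex(String.valueOf(model));
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor(new Repo("webmagic")));
        System.out.println(fileNameFor("a model without a key"));
    }
}
```

A model that implements the key interface gets a stable, readable file name, while any other model falls back to a 40-character hex digest. Under the assumption made here, two key-less models with identical string forms would map to the same file.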
Copyright © 2017. All rights reserved.