Uses of Package us.codecraft.webmagic (webmagic-parent 0.7.3 API)

Packages that use us.codecraft.webmagic
Package	Description
us.codecraft.webmagic	Main class "Spider" and models.
us.codecraft.webmagic.configurable
us.codecraft.webmagic.downloader	Downloader is the part that downloads web pages and stores them in a Page object.
us.codecraft.webmagic.downloader.selenium
us.codecraft.webmagic.example
us.codecraft.webmagic.handler
us.codecraft.webmagic.model	Page model and annotations used to customize a crawler.
us.codecraft.webmagic.model.samples
us.codecraft.webmagic.monitor
us.codecraft.webmagic.pipeline	Pipeline is the persistence and offline processing part of a crawler.
us.codecraft.webmagic.processor	PageProcessor is the customizable part of a crawler for a specific site.
us.codecraft.webmagic.processor.example
us.codecraft.webmagic.proxy
us.codecraft.webmagic.samples
us.codecraft.webmagic.samples.pipeline
us.codecraft.webmagic.samples.scheduler
us.codecraft.webmagic.scheduler	Scheduler is the part that manages URLs.
us.codecraft.webmagic.scheduler.component	Component of scheduler.
us.codecraft.webmagic.scripts
us.codecraft.webmagic.utils	Static utilities of WebMagic.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic
Class and Description
MultiPageModel Extracts an object that spans more than one page, such as a news story or article.
Page Object storing extracted results and URLs to fetch. Not thread-safe. Main methods: `Page.getUrl()` gets the URL of the current page; `Page.getHtml()` gets the content of the current page; `Page.putField(String, Object)` saves an extracted result; `Page.getResultItems()` gets the extracted results for use in a `Pipeline`; `Page.addTargetRequests(java.util.List)` and `Page.addTargetRequest(String)` add URLs to fetch.
Request Object containing the URL to crawl, along with some additional information.
ResultItems Object containing the extracted results. It is held by the Page and processed in the pipeline.
Site Object containing the settings for a crawler.
Spider Entry point of a crawler. A spider contains four modules: Downloader, Scheduler, PageProcessor and Pipeline. Every module is a field of Spider.
Spider.Status
SpiderListener Listener of Spider on page processing.
Task Interface for identifying different tasks.
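
The relationship between Page, ResultItems and Request described in the table above can be illustrated with a small, self-contained sketch. The types below (`SketchPage`, `SketchResultItems`, `SketchRequest`) are hypothetical stand-ins written for this example. They are not the WebMagic classes and only mirror the method names listed above, so the real signatures and behavior may differ.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in types that mimic the roles described in the table.
// They are NOT the WebMagic classes.
class PageFlowSketch {

    /** Stand-in for Request: a URL to crawl. */
    static final class SketchRequest {
        final String url;
        SketchRequest(String url) { this.url = url; }
    }

    /** Stand-in for ResultItems: extracted fields, kept in insertion order. */
    static final class SketchResultItems {
        private final Map<String, Object> fields = new LinkedHashMap<>();
        void put(String key, Object value) { fields.put(key, value); }
        Object get(String key) { return fields.get(key); }
        Map<String, Object> getAll() { return fields; }
    }

    /** Stand-in for Page: holds results plus follow-up URLs. Not thread-safe. */
    static final class SketchPage {
        private final String url;
        private final SketchResultItems resultItems = new SketchResultItems();
        private final List<SketchRequest> targetRequests = new ArrayList<>();

        SketchPage(String url) { this.url = url; }
        String getUrl() { return url; }
        void putField(String key, Object value) { resultItems.put(key, value); }
        SketchResultItems getResultItems() { return resultItems; }
        void addTargetRequest(String target) { targetRequests.add(new SketchRequest(target)); }
        void addTargetRequests(List<String> targets) {
            for (String target : targets) {
                addTargetRequest(target);
            }
        }
        List<SketchRequest> getTargetRequests() { return targetRequests; }
    }

    /** Processor step: record one result and queue one follow-up URL. */
    static void process(SketchPage page) {
        page.putField("url", page.getUrl());
        page.addTargetRequest(page.getUrl() + "/next");
    }

    public static void main(String[] args) {
        SketchPage page = new SketchPage("https://example.com/articles");
        process(page);
        // Pipeline step: consume the extracted results held by the page.
        page.getResultItems().getAll().forEach((k, v) -> System.out.println(k + " = " + v));
        // Scheduler step: these URLs would be queued for a later download.
        for (SketchRequest request : page.getTargetRequests()) {
            System.out.println("queued: " + request.url);
        }
    }
}
```

In this sketch the processor only writes to the page, while the pipeline and scheduler steps only read from it, which is the division of labor the class descriptions suggest.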

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.configurable
Class and Description
Page Object storing extracted results and URLs to fetch. Not thread-safe. Main methods: `Page.getUrl()` gets the URL of the current page; `Page.getHtml()` gets the content of the current page; `Page.putField(String, Object)` saves an extracted result; `Page.getResultItems()` gets the extracted results for use in a `Pipeline`; `Page.addTargetRequests(java.util.List)` and `Page.addTargetRequest(String)` add URLs to fetch.
Site Object containing the settings for a crawler.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.downloader
Class and Description
Page Object storing extracted results and URLs to fetch. Not thread-safe. Main methods: `Page.getUrl()` gets the URL of the current page; `Page.getHtml()` gets the content of the current page; `Page.putField(String, Object)` saves an extracted result; `Page.getResultItems()` gets the extracted results for use in a `Pipeline`; `Page.addTargetRequests(java.util.List)` and `Page.addTargetRequest(String)` add URLs to fetch.
Request Object containing the URL to crawl, along with some additional information.
Site Object containing the settings for a crawler.
Task Interface for identifying different tasks.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.downloader.selenium
Class and Description
Page Object storing extracted results and URLs to fetch. Not thread-safe. Main methods: `Page.getUrl()` gets the URL of the current page; `Page.getHtml()` gets the content of the current page; `Page.putField(String, Object)` saves an extracted result; `Page.getResultItems()` gets the extracted results for use in a `Pipeline`; `Page.addTargetRequests(java.util.List)` and `Page.addTargetRequest(String)` add URLs to fetch.
Request Object containing the URL to crawl, along with some additional information.
Task Interface for identifying different tasks.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.example
Class and Description
Page Object storing extracted results and URLs to fetch. Not thread-safe. Main methods: `Page.getUrl()` gets the URL of the current page; `Page.getHtml()` gets the content of the current page; `Page.putField(String, Object)` saves an extracted result; `Page.getResultItems()` gets the extracted results for use in a `Pipeline`; `Page.addTargetRequests(java.util.List)` and `Page.addTargetRequest(String)` add URLs to fetch.
Site Object containing the settings for a crawler.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.handler
Class and Description
Page Object storing extracted results and URLs to fetch. Not thread-safe. Main methods: `Page.getUrl()` gets the URL of the current page; `Page.getHtml()` gets the content of the current page; `Page.putField(String, Object)` saves an extracted result; `Page.getResultItems()` gets the extracted results for use in a `Pipeline`; `Page.addTargetRequests(java.util.List)` and `Page.addTargetRequest(String)` add URLs to fetch.
Request Object containing the URL to crawl, along with some additional information.
ResultItems Object containing the extracted results. It is held by the Page and processed in the pipeline.
Site Object containing the settings for a crawler.
Task Interface for identifying different tasks.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.model
Class and Description
Page Object storing extracted results and URLs to fetch. Not thread-safe. Main methods: `Page.getUrl()` gets the URL of the current page; `Page.getHtml()` gets the content of the current page; `Page.putField(String, Object)` saves an extracted result; `Page.getResultItems()` gets the extracted results for use in a `Pipeline`; `Page.addTargetRequests(java.util.List)` and `Page.addTargetRequest(String)` add URLs to fetch.
Site Object containing the settings for a crawler.
Spider Entry point of a crawler. A spider contains four modules: Downloader, Scheduler, PageProcessor and Pipeline. Every module is a field of Spider.
Task Interface for identifying different tasks.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.model.samples
Class and Description
MultiPageModel Extracts an object that spans more than one page, such as a news story or article.
Page Object storing extracted results and URLs to fetch. Not thread-safe. Main methods: `Page.getUrl()` gets the URL of the current page; `Page.getHtml()` gets the content of the current page; `Page.putField(String, Object)` saves an extracted result; `Page.getResultItems()` gets the extracted results for use in a `Pipeline`; `Page.addTargetRequests(java.util.List)` and `Page.addTargetRequest(String)` add URLs to fetch.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.monitor
Class and Description
Request Object containing the URL to crawl, along with some additional information.
Spider Entry point of a crawler. A spider contains four modules: Downloader, Scheduler, PageProcessor and Pipeline. Every module is a field of Spider.
SpiderListener Listener of Spider on page processing.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.pipeline
Class and Description
ResultItems Object containing the extracted results. It is held by the Page and processed in the pipeline.
Task Interface for identifying different tasks.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.processor
Class and Description
Page Object storing extracted results and URLs to fetch. Not thread-safe. Main methods: `Page.getUrl()` gets the URL of the current page; `Page.getHtml()` gets the content of the current page; `Page.putField(String, Object)` saves an extracted result; `Page.getResultItems()` gets the extracted results for use in a `Pipeline`; `Page.addTargetRequests(java.util.List)` and `Page.addTargetRequest(String)` add URLs to fetch.
Site Object containing the settings for a crawler.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.processor.example
Class and Description
Page Object storing extracted results and URLs to fetch. Not thread-safe. Main methods: `Page.getUrl()` gets the URL of the current page; `Page.getHtml()` gets the content of the current page; `Page.putField(String, Object)` saves an extracted result; `Page.getResultItems()` gets the extracted results for use in a `Pipeline`; `Page.addTargetRequests(java.util.List)` and `Page.addTargetRequest(String)` add URLs to fetch.
Site Object containing the settings for a crawler.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.proxy
Class and Description
Page Object storing extracted results and URLs to fetch. Not thread-safe. Main methods: `Page.getUrl()` gets the URL of the current page; `Page.getHtml()` gets the content of the current page; `Page.putField(String, Object)` saves an extracted result; `Page.getResultItems()` gets the extracted results for use in a `Pipeline`; `Page.addTargetRequests(java.util.List)` and `Page.addTargetRequest(String)` add URLs to fetch.
Task Interface for identifying different tasks.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.samples
Class and Description
Page Object storing extracted results and URLs to fetch. Not thread-safe. Main methods: `Page.getUrl()` gets the URL of the current page; `Page.getHtml()` gets the content of the current page; `Page.putField(String, Object)` saves an extracted result; `Page.getResultItems()` gets the extracted results for use in a `Pipeline`; `Page.addTargetRequests(java.util.List)` and `Page.addTargetRequest(String)` add URLs to fetch.
Site Object containing the settings for a crawler.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.samples.pipeline
Class and Description
ResultItems Object containing the extracted results. It is held by the Page and processed in the pipeline.
Task Interface for identifying different tasks.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.samples.scheduler
Class and Description
Page Object storing extracted results and URLs to fetch. Not thread-safe. Main methods: `Page.getUrl()` gets the URL of the current page; `Page.getHtml()` gets the content of the current page; `Page.putField(String, Object)` saves an extracted result; `Page.getResultItems()` gets the extracted results for use in a `Pipeline`; `Page.addTargetRequests(java.util.List)` and `Page.addTargetRequest(String)` add URLs to fetch.
Request Object containing the URL to crawl, along with some additional information.
Site Object containing the settings for a crawler.
Task Interface for identifying different tasks.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.scheduler
Class and Description
Request Object containing the URL to crawl, along with some additional information.
Task Interface for identifying different tasks.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.scheduler.component
Class and Description
Request Object containing the URL to crawl, along with some additional information.
Task Interface for identifying different tasks.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.scripts
Class and Description
Page Object storing extracted results and URLs to fetch. Not thread-safe. Main methods: `Page.getUrl()` gets the URL of the current page; `Page.getHtml()` gets the content of the current page; `Page.putField(String, Object)` saves an extracted result; `Page.getResultItems()` gets the extracted results for use in a `Pipeline`; `Page.addTargetRequests(java.util.List)` and `Page.addTargetRequest(String)` add URLs to fetch.
Site Object containing the settings for a crawler.

Classes in us.codecraft.webmagic used by us.codecraft.webmagic.utils
Class and Description
Request Object containing the URL to crawl, along with some additional information.
