Package | Description |
---|---|
us.codecraft.webmagic | Main class "Spider" and models. |
us.codecraft.webmagic.configurable | |
us.codecraft.webmagic.downloader | Downloader is the part that downloads web pages and stores them in a Page object. |
us.codecraft.webmagic.selector | Selectors for page extraction. |
us.codecraft.webmagic.utils | Static utilities of webmagic. |
Class and Description |
---|
Html: Selectable HTML. |
Json: Parses JSON. |
Selectable: Selectable text. |
Class and Description |
---|
Selector: Selector (extractor) for text. |
Class and Description |
---|
Html: Selectable HTML. |
Class and Description |
---|
AbstractSelectable |
AndSelector: Arranges all selectors as a pipeline. |
BaseElementSelector |
CssSelector: CSS selector. |
ElementSelector: Selector (extractor) for HTML elements. |
Html: Selectable HTML. |
HtmlNode |
Json: Parses JSON. |
OrSelector: Each extractor runs separately, and the results of all extractors are combined into the final result. |
PlainText: Selectable plain text. Cannot be selected by XPath or CSS selector. |
RegexSelector: Regex-based selector. |
Selectable: Selectable text. |
Selector: Selector (extractor) for text. |
SmartContentSelector: Borrowed from https://code.google.com/p/cx-extractor/ |
XpathSelector: XPath selector based on Xsoup. |
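
The difference between the pipeline behaviour described for AndSelector and the combine-results behaviour described for OrSelector can be sketched in plain Java. This is an illustration under stated assumptions, not the WebMagic implementation: `TextSelector`, `regex`, `and`, and `or` are hypothetical names invented for this sketch, and it uses only `java.util.regex` rather than the library's own classes.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class SelectorCompositionSketch {

    /** Hypothetical stand-in for a text selector: maps input text to all matches. */
    interface TextSelector {
        List<String> selectList(String text);
    }

    /** Regex-based selector that returns capture group 1 of every match. */
    static TextSelector regex(String pattern) {
        Pattern compiled = Pattern.compile(pattern);
        return text -> {
            List<String> results = new ArrayList<>();
            Matcher matcher = compiled.matcher(text);
            while (matcher.find()) {
                results.add(matcher.group(1));
            }
            return results;
        };
    }

    /** Pipeline ("and") composition: each selector runs on the previous selector's output. */
    static TextSelector and(TextSelector... selectors) {
        return text -> {
            List<String> current = new ArrayList<>();
            current.add(text);
            for (TextSelector selector : selectors) {
                List<String> next = new ArrayList<>();
                for (String item : current) {
                    next.addAll(selector.selectList(item));
                }
                current = next;
            }
            return current;
        };
    }

    /** Combining ("or") composition: every selector runs on the original text. */
    static TextSelector or(TextSelector... selectors) {
        return text -> {
            List<String> results = new ArrayList<>();
            for (TextSelector selector : selectors) {
                results.addAll(selector.selectList(text));
            }
            return results;
        };
    }

    public static void main(String[] args) {
        String html = "<a href=\"/docs/intro\">Intro</a> <a href=\"/blog/news\">News</a>";
        TextSelector hrefs = regex("href=\"([^\"]+)\"");
        TextSelector lastSegment = regex("/([^/]+)$");
        TextSelector linkText = regex(">([^<]+)</a>");

        // Pipeline: extract each href, then the last path segment of each href.
        System.out.println(and(hrefs, lastSegment).selectList(html)); // [intro, news]
        // Combination: hrefs and link texts are extracted independently, then concatenated.
        System.out.println(or(hrefs, linkText).selectList(html)); // [/docs/intro, /blog/news, Intro, News]
    }
}
```

In the pipeline case the second selector never sees the original markup, only the hrefs produced by the first; in the combining case both selectors see the same input and their result lists are appended in order.
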
Class and Description |
---|
Selector: Selector (extractor) for text. |
Copyright © 2017. All rights reserved.