Extracts an object that spans more than one page, such as a news story or an article.
Listener for page-processing events of a Spider.
Interface for identifying different tasks.
Object storing the extracted results and the URLs to fetch.
Not thread safe.
Object containing the URL to crawl.
It also carries some additional information about the request.
Object containing the extracted results.
It is held by Page and is processed in the pipeline.
Object containing the settings for a crawler.
Entry point of a crawler.
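The relationship between these objects can be sketched roughly as follows. This is a minimal illustration, not the library's actual classes: the names SketchRequest, SketchResultItems and SketchPage, and all of their fields and methods, are hypothetical and chosen only to mirror the descriptions above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a request: a URL plus free-form extra information.
class SketchRequest {
    final String url;
    final Map<String, Object> extras = new LinkedHashMap<>();

    SketchRequest(String url) {
        this.url = url;
    }
}

// Hypothetical stand-in for the extracted results, kept in insertion order.
class SketchResultItems {
    private final Map<String, Object> fields = new LinkedHashMap<>();

    void put(String key, Object value) {
        fields.put(key, value);
    }

    Object get(String key) {
        return fields.get(key);
    }
}

// Hypothetical stand-in for a page: it holds its results and the URLs to fetch next.
// Like the object described above, it makes no attempt to be thread safe.
class SketchPage {
    final SketchRequest request;
    final SketchResultItems resultItems = new SketchResultItems();
    final List<SketchRequest> targetRequests = new ArrayList<>();

    SketchPage(SketchRequest request) {
        this.request = request;
    }

    // Store one extracted value in the results held by this page.
    void putField(String key, Object value) {
        resultItems.put(key, value);
    }

    // Remember a URL discovered on this page for later fetching.
    void addTargetRequest(String url) {
        targetRequests.add(new SketchRequest(url));
    }
}
```

In this sketch a page processor would call putField and addTargetRequest on the page, and a pipeline would later read the values back from resultItems.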
A spider contains four modules: Downloader, Scheduler, PageProcessor and Pipeline.
Every module is a field of Spider.
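The four-module layout described above might be sketched as follows. This is a simplified, single-threaded illustration under stated assumptions: the interfaces MiniDownloader, MiniScheduler, MiniPageProcessor and MiniPipeline, the MiniQueueScheduler class, and the MiniSpider class with its run method are all hypothetical names, and their signatures are not taken from the library.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.Set;

// Hypothetical module contracts, one per module named in the text.
interface MiniDownloader {
    String download(String url);
}

interface MiniScheduler {
    void push(String url);

    String poll(); // returns null when no URL is left
}

interface MiniPageProcessor {
    // Fills 'results' with extracted values and returns the URLs to fetch next.
    List<String> process(String url, String body, Map<String, Object> results);
}

interface MiniPipeline {
    void process(Map<String, Object> results);
}

// FIFO scheduler that ignores URLs it has already seen.
class MiniQueueScheduler implements MiniScheduler {
    private final Queue<String> queue = new ArrayDeque<>();
    private final Set<String> seen = new HashSet<>();

    public void push(String url) {
        if (seen.add(url)) {
            queue.add(url);
        }
    }

    public String poll() {
        return queue.poll();
    }
}

// Each module is a field of the spider, as the description states.
class MiniSpider {
    private final MiniDownloader downloader;
    private final MiniScheduler scheduler;
    private final MiniPageProcessor pageProcessor;
    private final MiniPipeline pipeline;

    MiniSpider(MiniDownloader downloader, MiniScheduler scheduler,
               MiniPageProcessor pageProcessor, MiniPipeline pipeline) {
        this.downloader = downloader;
        this.scheduler = scheduler;
        this.pageProcessor = pageProcessor;
        this.pipeline = pipeline;
    }

    // Crawl loop: schedule, download, process, then hand the results to the pipeline.
    void run(String startUrl) {
        scheduler.push(startUrl);
        String url;
        while ((url = scheduler.poll()) != null) {
            String body = downloader.download(url);
            Map<String, Object> results = new LinkedHashMap<>();
            for (String next : pageProcessor.process(url, body, results)) {
                scheduler.push(next);
            }
            pipeline.process(results);
        }
    }
}
```

Keeping each module behind its own interface means any one of them can be swapped, for example an in-memory downloader in tests, without touching the crawl loop.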
Copyright © 2017. All rights reserved.