@democracy-deutschland/scapacra
Scapacra (Scraper Parser Crawler)
Last updated 7 months ago by dornhoeschen.
License: Apache-2.0
$ cnpm install @democracy-deutschland/scapacra 

scapacra

Introduction

Scapacra (scraper, parser and crawler) is a framework for extracting data from different data sources. The idea behind Scapacra is based on the ETL (extract, transform, load) process, and it defines a modular design pattern that provides a basic ETL workflow.

The framework is structured into three basic modules.

  1. Parser: The parser extracts the data from a defined document.
  2. Browser: The browser navigates through a structure and retrieves the desired fragments for the parser.
  3. Scraper: The scraper executes the browsers and parsers and provides their results through a centralized interface.
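
The relationship between the three modules can be sketched roughly as follows. This is a minimal illustration only: every name here (`Fragment`, `Parser`, `Browser`, `Scraper.scrape`, `InMemoryBrowser`, `WordCountParser`) is a hypothetical stand-in and is not taken from the package's actual API, which may differ (for example, it may be asynchronous).

```typescript
// Hypothetical sketch of the three-module design; NOT the package's real API.

/** A piece of a data source that a browser hands to a parser. */
interface Fragment {
  id: string;
  content: string;
}

/** Parser: extracts data from one defined document (fragment). */
interface Parser<T> {
  parse(fragment: Fragment): T;
}

/** Browser: navigates a structure and yields the fragments to parse. */
interface Browser {
  fragments(): Iterable<Fragment>;
}

/** Scraper: runs a browser with a parser and collects the results in one place. */
class Scraper {
  static scrape<T>(browser: Browser, parser: Parser<T>): T[] {
    const results: T[] = [];
    for (const fragment of browser.fragments()) {
      results.push(parser.parse(fragment));
    }
    return results;
  }
}

// Example browser (assumption): serves fragments from a fixed in-memory list.
class InMemoryBrowser implements Browser {
  private readonly docs: Fragment[];

  constructor(docs: Fragment[]) {
    this.docs = docs;
  }

  fragments(): Iterable<Fragment> {
    return this.docs;
  }
}

type WordCount = { id: string; words: number };

// Example parser (assumption): counts whitespace-separated words per fragment.
class WordCountParser implements Parser<WordCount> {
  parse(fragment: Fragment): WordCount {
    const words = fragment.content
      .split(/\s+/)
      .filter((w) => w.length > 0).length;
    return { id: fragment.id, words };
  }
}

const browser = new InMemoryBrowser([
  { id: "a", content: "extract transform load" },
  { id: "b", content: "scraper  parser" },
]);

const results: WordCount[] = Scraper.scrape(browser, new WordCountParser());
console.log(results);
```

Keeping the browser and the parser behind separate interfaces means a new data source only needs a new browser, while a new document format only needs a new parser; the scraper stays unchanged in either case.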

Parser


Browser


Scraper


Current Tags

  • 1.0.6 (latest, 7 months ago)

5 Versions

  • 1.0.6 (7 months ago)
  • 1.0.5 (7 months ago)
  • 1.0.4 (8 months ago)
  • 1.0.3 (a year ago)
  • 1.0.2 (a year ago)
