fix(plugins): fix incorrect plugin import paths
KonghaYao committed Jul 20, 2021
1 parent d1731c7 commit 05ba4a7
Showing 12 changed files with 70 additions and 83 deletions.
28 changes: 0 additions & 28 deletions DEMO.js

This file was deleted.

30 changes: 10 additions & 20 deletions README.md
@@ -1,32 +1,22 @@
# JSpider 3 BETA
# JSpider 3.2 BETA

[![](https://data.jsdelivr.com/v1/package/npm/js-spider/badge)](https://www.jsdelivr.com/package/npm/js-spider) ![npm](https://img.shields.io/npm/v/js-spider?style=flat-square) ![NPM](https://img.shields.io/npm/l/js-spider?style=flat-square) ![GitHub top language](https://img.shields.io/github/languages/top/konghayao/jspider) ![GitHub code size in bytes](https://img.shields.io/github/languages/code-size/konghayao/jspider) ![Website](https://img.shields.io/website?style=flat-square&up_color=green&up_message=online&url=http%3A%2F%2Fdongzhongzhidong.gitee.io%2Fjspider%2F) [![](https://gitee.com/dongzhongzhidong/jspider/badge/star.svg?theme=white)](https://gitee.com/dongzhongzhidong/jspider/)
[![](https://data.jsdelivr.com/v1/package/npm/js-spider/badge)](https://www.jsdelivr.com/package/npm/js-spider) ![npm](https://img.shields.io/npm/v/js-spider?style=flat-square) ![NPM](https://img.shields.io/npm/l/js-spider?style=flat-square) ![GitHub top language](https://img.shields.io/github/languages/top/konghayao/jspider) ![GitHub code size in bytes](https://img.shields.io/github/languages/code-size/konghayao/jspider) [![](https://gitee.com/dongzhongzhidong/jspider/badge/star.svg?theme=white)](https://gitee.com/dongzhongzhidong/jspider/)

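> JSpider 3 is a crawler framework that runs inside Chrome Devtools and includes complete crawler support. If you have front-end basics, you can get started in three minutes!
[Official tutorial link](http://dongzhongzhidong.gitee.io/jspider/)

## Quick Start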
> JSpider 3 is a Chrome DevTools crawler framework that includes full crawler support. If you have a front-end foundation, you can get up and running in three minutes!
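### Rapid crawling
- **Efficient tooling: JSpider has built-in concurrency control and provides a variety of convenient data-processing plugins.**
- **Highly reusable crawlers: JSpider code can be reused, and new tasks can be added at any time.**

Just a few simple lines, suited to quick operations; this downloads the content at these URLs directly to your machine.

> Right-click -> Inspect to open the browser Devtools; this code can be used directly in the Console!
[Official tutorial link](http://dongzhongzhidong.gitee.io/jspider/)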

```js
import('https://cdn.jsdelivr.net/npm/js-spider/dist/JSpider.esm.min.js').then(({JSpider})=>{
window.JSpider = JSpider;
});// Import the code from jsDelivr
// Put in your URLs
JSpider.simpleCrawl(["fake/excel","fake/excel"]);
// Wait for the files to finish downloading!
```
## Quick Start

### More advanced custom crawling
### Custom crawling

```js
import('https://cdn.jsdelivr.net/npm/js-spider/dist/JSpider.esm.min.js').then(({JSpider})=>{
await import('https://cdn.jsdelivr.net/npm/js-spider/dist/JSpider.esm.min.js').then(({JSpider})=>{
window.JSpider = JSpider;
});

@@ -36,7 +26,7 @@ const {
Download, // download library
} = JSpider.plugins;

let urls = ['']// your array of URLs to crawl
let urls = ['https://.....']// your array of URLs to crawl

const spider = new JSpider(
Request(),
3 changes: 1 addition & 2 deletions VERSION.md
@@ -18,12 +18,11 @@

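- Refactor all core code

- [ ] Control Panel pause feature
- [x] Control Panel pause feature

- [ ] Control Panel retry feature (not the automatic in-flow retry)

- Refactor plugins
- [ ] Import plugins as separate packages
- [ ] Request multi-platform support

> This file was only introduced in version 3.2, so the records for earlier versions are lost.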
37 changes: 25 additions & 12 deletions dist/JSpider.esm.min.js
@@ -3718,26 +3718,41 @@ var pick = flatRest(function(object, paths) {
* SPDX-License-Identifier: Apache-2.0
*/

/**
* EventHub is an event-handling hub that receives and dispatches events
*/

class EventHub {
all = new Map();
constructor(eventMap = {}, bindThis = null) {
this.bindThis = bindThis || globalThis;
this.on(eventMap);

// Create an rxjs stream source
// createSource$ creates the source of an rxjs stream that listens for the corresponding event
this.createSource$ = memoize((eventName) => {
return fromEventPattern(
(handle) => this.on(eventName, handle),
(handle) => this.off(eventName, handle),
);
});
}

/**
* #on binds a single event; each type maps one-to-one to a handle function
*/
#on(type, handler) {
const handlers = this.all.get(type);
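// ! Note the stack structure: unshift inserts the handler at the head, so the first-declared function runs last when the event fires and acts as the default function
// The stack structure guarantees that on a destroy event the first-defined destroy runs last, so destroy handlers bound later fire first, and the defining destroy function can finally call off('*') to unbind all events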
handlers ? handlers.unshift(handler) : this.all.set(type, [handler]);
}

/**
* on is overloaded; the first argument can be an event-binding object:
* on({eventName: callback })
* on({eventName: [callback] })
* on(type,handle)
*/
on(type, handler) {
// function overload
if (typeof type === 'string') {
@@ -3747,12 +3762,16 @@ class EventHub {
Object.entries(type).forEach(([key, value]) => {
if (value instanceof Array) {
value.forEach((item) => this.#on(key, item));
} else {
} else if (value instanceof Function) {
this.#on(key, value);
}
});
}
}

/**
* off: when type is '*', all handlers are removed
*/
off(type, handler) {
if (type === '*') {
return this.all.clear();
@@ -3773,11 +3792,6 @@ class EventHub {
})
: [];
}

operators = {
// TODO rxjs stream support in EventHub
EmitWhen(config) {},
};
}
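
The handler ordering and the `on` overloads described in the comments above can be illustrated with a small, dependency-free sketch. `MiniEventHub` is a hypothetical stand-in rather than the bundled class: the rxjs `createSource$` part is omitted, and `emit` plus the non-`'*'` branch of `off` are assumptions, because this diff does not show them.

```javascript
// Simplified sketch of the EventHub in this diff (names and emit/off details are assumed).
class MiniEventHub {
    all = new Map();

    constructor(eventMap = {}) {
        this.on(eventMap);
    }

    // Bind one handler; unshift places later handlers in front of earlier ones.
    #on(type, handler) {
        const handlers = this.all.get(type);
        handlers ? handlers.unshift(handler) : this.all.set(type, [handler]);
    }

    // Overloads: on(type, handler), on({ type: handler }), on({ type: [handlers] })
    on(type, handler) {
        if (typeof type === 'string') {
            this.#on(type, handler);
        } else if (type instanceof Object) {
            Object.entries(type).forEach(([key, value]) => {
                if (value instanceof Array) {
                    value.forEach((item) => this.#on(key, item));
                } else if (value instanceof Function) {
                    this.#on(key, value);
                }
                // Any other value is skipped instead of being stored as a handler.
            });
        }
    }

    // off('*') clears everything, as in the diff; the per-handler branch is assumed.
    off(type, handler) {
        if (type === '*') {
            return this.all.clear();
        }
        const handlers = this.all.get(type) || [];
        this.all.set(type, handlers.filter((item) => item !== handler));
    }

    // Assumed behavior: call handlers from the head of the array to the tail.
    emit(type, ...args) {
        // Iterate over a copy so a handler may call off('*') safely.
        const handlers = [...(this.all.get(type) || [])];
        handlers.forEach((item) => item(...args));
    }
}

const calls = [];
const hub = new MiniEventHub({
    // Declared first, so it runs last and can unbind everything.
    destroy: () => {
        calls.push('default');
        hub.off('*');
    },
    ignored: 'not a function',
});
const hasIgnored = hub.all.has('ignored'); // false: the string was never registered
hub.on('destroy', () => calls.push('late'));
hub.emit('destroy');
console.log(calls); // [ 'late', 'default' ]
```

Under these assumptions, the `instanceof Function` guard added in this commit is what keeps a stray non-function value in the event map from being stored and later called as a handler.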

/**
@@ -4067,14 +4081,13 @@ function createUUID(string) {
* Copyright 2021 KonghaYao 江夏尧 <dongzhongzhidong@qq.com>
* SPDX-License-Identifier: Apache-2.0
*/

/**
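* Description of the function's purpose
* This runs a queue of async functions one after another; once enQueue is called they keep executing in sequence until stopped
* This runs a queue of async functions one after another; once enQueue is called they keep executing in sequence until finished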
*/

class functionQueue {
QueuePromise = Promise.resolve();
constructor() {}
enQueue(...args) {
this.QueuePromise = args.reduce((promise, current) => {
return promise.then(current);
@@ -4479,7 +4492,7 @@ class ControlPanel$1 {
startFlow() {
this.$EventHub.emit('Flow:start');
}
// TODO test the pause feature

stopFlow() {
this.$EventHub.emit('Flow:stop');
}
@@ -6256,7 +6269,7 @@ var index = Object.assign(Spider, tools, {
Task,
TaskGroup,
version: "3.1.8",
buildDate: new Date(1626743574108),
buildDate: new Date(1626782541521),
});

export default index;
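
The `functionQueue` class touched in this file chains async functions onto a single Promise so that they run strictly one after another. A minimal sketch of that idea follows. `FunctionQueueSketch` and the `wait` helper are hypothetical names, and seeding `reduce` with the current `QueuePromise` is an assumption, since the hunk is cut off before that argument.

```javascript
// Sketch of a sequential async queue (assumed completion of the truncated hunk).
class FunctionQueueSketch {
    QueuePromise = Promise.resolve();

    // Append each function to the tail of the promise chain, in argument order.
    enQueue(...fns) {
        this.QueuePromise = fns.reduce((promise, current) => {
            return promise.then(current);
        }, this.QueuePromise); // assumed initial value
        return this.QueuePromise;
    }
}

// Hypothetical helper: returns an async task that records its label after `ms`.
const order = [];
const wait = (ms, label) => () =>
    new Promise((resolve) =>
        setTimeout(() => {
            order.push(label);
            resolve(label);
        }, ms),
    );

const queue = new FunctionQueueSketch();
queue.enQueue(wait(30, 'slow'), wait(0, 'fast'));
const done = queue.enQueue(wait(10, 'last')).then(() => order);
done.then((result) => console.log(result)); // [ 'slow', 'fast', 'last' ]
```

Even though `fast` has the shortest delay, it only starts after `slow` resolves, which is the property a serialized browser download queue would rely on.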
37 changes: 25 additions & 12 deletions dist/JSpider.min.js
@@ -3721,26 +3721,41 @@ var JSpider = (function () {
* SPDX-License-Identifier: Apache-2.0
*/

/**
* EventHub is an event-handling hub that receives and dispatches events
*/

class EventHub {
all = new Map();
constructor(eventMap = {}, bindThis = null) {
this.bindThis = bindThis || globalThis;
this.on(eventMap);

// Create an rxjs stream source
// createSource$ creates the source of an rxjs stream that listens for the corresponding event
this.createSource$ = memoize((eventName) => {
return fromEventPattern(
(handle) => this.on(eventName, handle),
(handle) => this.off(eventName, handle),
);
});
}

/**
* #on binds a single event; each type maps one-to-one to a handle function
*/
#on(type, handler) {
const handlers = this.all.get(type);
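// ! Note the stack structure: unshift inserts the handler at the head, so the first-declared function runs last when the event fires and acts as the default function
// The stack structure guarantees that on a destroy event the first-defined destroy runs last, so destroy handlers bound later fire first, and the defining destroy function can finally call off('*') to unbind all events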
handlers ? handlers.unshift(handler) : this.all.set(type, [handler]);
}

/**
* on is overloaded; the first argument can be an event-binding object:
* on({eventName: callback })
* on({eventName: [callback] })
* on(type,handle)
*/
on(type, handler) {
// function overload
if (typeof type === 'string') {
@@ -3750,12 +3765,16 @@ var JSpider = (function () {
Object.entries(type).forEach(([key, value]) => {
if (value instanceof Array) {
value.forEach((item) => this.#on(key, item));
} else {
} else if (value instanceof Function) {
this.#on(key, value);
}
});
}
}

/**
* off: when type is '*', all handlers are removed
*/
off(type, handler) {
if (type === '*') {
return this.all.clear();
@@ -3776,11 +3795,6 @@ var JSpider = (function () {
})
: [];
}

operators = {
// TODO rxjs stream support in EventHub
EmitWhen(config) {},
};
}

/**
@@ -4070,14 +4084,13 @@ var JSpider = (function () {
* Copyright 2021 KonghaYao 江夏尧 <dongzhongzhidong@qq.com>
* SPDX-License-Identifier: Apache-2.0
*/

/**
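* Description of the function's purpose
* This runs a queue of async functions one after another; once enQueue is called they keep executing in sequence until stopped
* This runs a queue of async functions one after another; once enQueue is called they keep executing in sequence until finished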
*/

class functionQueue {
QueuePromise = Promise.resolve();
constructor() {}
enQueue(...args) {
this.QueuePromise = args.reduce((promise, current) => {
return promise.then(current);
@@ -4482,7 +4495,7 @@ var JSpider = (function () {
startFlow() {
this.$EventHub.emit('Flow:start');
}
// TODO test the pause feature

stopFlow() {
this.$EventHub.emit('Flow:stop');
}
@@ -6259,7 +6272,7 @@ var JSpider = (function () {
Task,
TaskGroup,
version: "3.1.8",
buildDate: new Date(1626743574108),
buildDate: new Date(1626782541521),
});

return index;
2 changes: 1 addition & 1 deletion plugins/Download.js
@@ -5,7 +5,7 @@
*/
import { toFile } from './utils/toFile.js';

import { Plugin } from '../Pipeline/PluginSystem.js';
import { Plugin } from '../src/Pipeline/PluginSystem.js';
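// Downloads in the browser cannot run at the same time; if an earlier download has not finished when a later one is submitted,
// all of the later ones fail, so a Promise-based download queue is set up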
const DownloadQueue = {
2 changes: 1 addition & 1 deletion plugins/ExcelHelper.js
@@ -4,7 +4,7 @@
* SPDX-License-Identifier: Apache-2.0
*/
import { createExcelFile } from './ExcelHelper/createExcelFile.js';
import { Plugin } from '../Pipeline/PluginSystem.js';
import { Plugin } from '../src/Pipeline/PluginSystem.js';
import { init } from './ExcelHelper/xlsx.js';
// Unfinished: converting the XLSX import Promise into a stream

2 changes: 1 addition & 1 deletion plugins/ExcelHelper/xlsx.js
@@ -3,7 +3,7 @@
* Copyright 2021 KonghaYao 江夏尧 <dongzhongzhidong@qq.com>
* SPDX-License-Identifier: Apache-2.0
*/
import { $load } from '../../tools/loader/loader.js';
import { $load } from '../../src/tools/loader/loader.js';

let XLSX;
function init() {
2 changes: 1 addition & 1 deletion plugins/JSzip/JSzip.js
@@ -3,7 +3,7 @@
* Copyright 2021 KonghaYao 江夏尧 <dongzhongzhidong@qq.com>
* SPDX-License-Identifier: Apache-2.0
*/
import { $load } from '../../tools/loader/loader.js';
import { $load } from '../../src/tools/loader/loader.js';

let JSZip;
function init() {
4 changes: 2 additions & 2 deletions plugins/Request.js
@@ -5,8 +5,8 @@
*/
/* eslint-disable no-invalid-this */

import { Plugin } from '../Pipeline/PluginSystem.js';
import { concurrent } from '../utils/concurrent.js';
import { Plugin } from '../src/Pipeline/PluginSystem.js';
import { concurrent } from '../src/utils/concurrent.js';

// ! This Request file is the standard example of advanced Plugin registration

4 changes: 2 additions & 2 deletions plugins/zipFile.js
@@ -5,13 +5,13 @@
*/
import { bufferTime, concatMap, filter } from 'rxjs/operators';

import { Plugin } from '../Pipeline/PluginSystem.js';
import { Plugin } from '../src/Pipeline/PluginSystem.js';

import { init } from './JSzip/JSzip.js';
import { zipper } from './JSzip/zipper.js';

import { toFile } from './utils/toFile.js';
import { TaskGroup } from '../TaskSystem/TaskGroup.js';
import { TaskGroup } from '../src/TaskSystem/TaskGroup.js';

export const ZipFile = function (options = {}) {
if (!options.zipFileName) options.zipFileName = new Date().getTime();
2 changes: 1 addition & 1 deletion test/Request.js
@@ -33,5 +33,5 @@ export async function main() {
// Download(),
);
spider.crawl(urls);
window.spider = spider;
spider.start();
}
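
The import changes in the `plugins/` files of this commit all follow one rule: a relative specifier is resolved against the file that contains it. The sketch below reproduces that resolution with the standard `URL` constructor. `resolveFrom` and the `/repo/` root are hypothetical, and the assumption that the shared modules live under `src/` next to `plugins/` is inferred only from the new paths in the diff.

```javascript
// Resolve a relative import specifier against the importing file,
// the way ES module resolution treats './' and '../' specifiers.
function resolveFrom(importerPath, specifier) {
    const importer = new URL(importerPath, 'file:///repo/'); // hypothetical repo root
    return new URL(specifier, importer).pathname;
}

// Old specifier: points outside src/, where (by assumption) no Pipeline folder exists.
const before = resolveFrom('plugins/Download.js', '../Pipeline/PluginSystem.js');
// New specifier: one level up from plugins/, then into src/.
const after = resolveFrom('plugins/Download.js', '../src/Pipeline/PluginSystem.js');
// Nested plugin files need two levels up before entering src/.
const nested = resolveFrom('plugins/JSzip/JSzip.js', '../../src/tools/loader/loader.js');

console.log(before); // /repo/Pipeline/PluginSystem.js
console.log(after);  // /repo/src/Pipeline/PluginSystem.js
console.log(nested); // /repo/src/tools/loader/loader.js
```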
