Updated documentations
DavidBelicza committed Dec 29, 2023
1 parent b8a630c commit d63788f
Showing 17 changed files with 53 additions and 48 deletions.
47 changes: 26 additions & 21 deletions README.md
@@ -18,39 +18,42 @@ TextRank
</p>

<p align="center">
This source code is an implementation of the TextRank algorithm (Automatic summarization) on PHP7 strict mode. It can summarize a text, article for example to a short paragraph. Before it would start the summarizing it removes the junk words what are defined in the Stopwords namespace. It is possible to extend it with another languages.
<br />
This source code is an implementation of TextRank algorithm in PHP programming language, under MIT licence.<br />
<br />
</p>

## TextRank or Automatic summarization
# TextRank vs. ChatGPT
GPTs like ChatGPT are supervised language models that understand the context and generate new content from the given
input using vast resources while TextRank is a cost-efficient/low-cost text extraction algorithm. TextRank algorithm
also can be used as a pre-processor to a GPT model to reduce the text size to save on resource consumption.

# TextRank or Automatic summarization
> Automatic summarization is the process of reducing a text document with a computer program in order to create a summary that retains the most important points of the original document. Technologies that can make a coherent summary take into account variables such as length, writing style and syntax. Automatic data summarization is part of machine learning and data mining. The main idea of summarization is to find a representative subset of the data, which contains the information of the entire set. Summarization technologies are used in a large number of sectors in industry today. - Wikipedia
The algorithm of this implementation is:
* Find sentences,
* Remove stopwords,
* Create integer values by find and count the matching words,
* Change the integer values by the related words' integer values,
* Normalize values to create scores,
* Order by scores

## Install
* Extracts sentences,
* Removes stopwords,
* Adds integer values to words by finding and counting the matching words,
* Weights the values of the words,
* Normalizes values to get the scores,
* Sorts by scores
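
For illustration only (not part of this commit, and not the library's actual code), the steps listed above can be sketched in a few lines of PHP. The function names `sketchSentences`, `sketchWords` and `sketchRank` are hypothetical, the stopword list is passed in by the caller, and the weighting step is simplified to plain word-frequency sums rather than whatever weighting the library really applies.

```php
<?php
declare(strict_types=1);

/** Split text into sentences at '.', '!' or '?' followed by whitespace. */
function sketchSentences(string $text): array
{
    $parts = preg_split('/(?<=[.!?])\s+/', trim($text), -1, PREG_SPLIT_NO_EMPTY);
    return $parts === false ? [] : $parts;
}

/** Lowercase a sentence, keep ASCII words only, and drop the stopwords. */
function sketchWords(string $sentence, array $stopWords): array
{
    preg_match_all('/[a-z]+/', strtolower($sentence), $matches);
    return array_values(array_diff($matches[0], $stopWords));
}

/**
 * Score every sentence and return [sentenceIndex => score], best first.
 * Assumption: a sentence's value is the sum of the text-wide counts of its words.
 */
function sketchRank(string $text, array $stopWords): array
{
    $wordsPerSentence = [];
    foreach (sketchSentences($text) as $sentence) {
        $wordsPerSentence[] = sketchWords($sentence, $stopWords);
    }

    // Integer value per word: how often it occurs in the whole text.
    $counts = array_count_values(array_merge([], ...$wordsPerSentence));

    // Sentence value: sum of the integer values of its words.
    $scores = [];
    foreach ($wordsPerSentence as $index => $words) {
        $scores[$index] = 0;
        foreach ($words as $word) {
            $scores[$index] += $counts[$word];
        }
    }

    // Normalize to the 0..1 range, then sort by score, highest first.
    $max = $scores === [] ? 0 : max($scores);
    foreach ($scores as $index => $value) {
        $scores[$index] = $max > 0 ? (float) ($value / $max) : 0.0;
    }
    arsort($scores);

    return $scores;
}
```

Calling `sketchRank($text, ['the', 'a', 'and'])` returns the sentence indexes ordered by score, so the first keys point at the sentences a summary would keep. The real package exposes this through `TextRankFacade` and its `StopWords` classes instead.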

# Install to use it in your project
```
cd your-project-folder
composer require php-science/textrank
```

## Test
# Install for contributing
```
cd project-folder
cd git-project-folder
docker-compose build
docker-compose up -d
composer install
composer test
```
or
```
cd project-folder
phpunit --colors='always' $(pwd)/tests
```

## Examples
# Examples
```php

use PhpScience\TextRank\Tool\StopWords\English;
@@ -73,10 +76,10 @@ $result = $api->getHighlights($text);
$result = $api->summarizeTextBasic($text);
```
More examples:
* [tests/TextRankFacadeTest.php](https://github.com/DoveID/PHP-Science-TextRank/blob/master/tests/TextRankFacadeTest.php)
* [tests/TextRankFacadeTest.php](https://github.com/DavidBelicza/PHP-Science-TextRank/blob/master/tests/TextRankFacadeTest.php)
* https://php.science

## Authors, Contributors
# Authors, Contributors

Name | GitHub user
--- | ---
@@ -89,3 +92,5 @@ Andrey Astashov | @mvcaaa
Leo Toneff | @bragle
Willy Arisky | @willyarisky
Robert-Jan Keizer | @KeizerDev
Morty | @evil1morty
Sezer Fidancı | @SezerFidanci
4 changes: 2 additions & 2 deletions phpunit.xml
@@ -3,9 +3,9 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author David Belicza <87.bdavid@gmail.com>
* @author David Belicza <david@belicza.com>
*/
-->
<phpunit xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
4 changes: 2 additions & 2 deletions src/TextRankFacade.php
@@ -2,9 +2,9 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author David Belicza <87.bdavid@gmail.com>
* @author David Belicza <david@belicza.com>
*/

declare(strict_types=1);
4 changes: 2 additions & 2 deletions src/Tool/Graph.php
@@ -2,9 +2,9 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author David Belicza <87.bdavid@gmail.com>
* @author David Belicza <david@belicza.com>
*/

declare(strict_types=1);
4 changes: 2 additions & 2 deletions src/Tool/Parser.php
@@ -2,9 +2,9 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author David Belicza <87.bdavid@gmail.com>
* @author David Belicza <david@belicza.com>
*/

declare(strict_types=1);
4 changes: 2 additions & 2 deletions src/Tool/Score.php
@@ -2,9 +2,9 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author David Belicza <87.bdavid@gmail.com>
* @author David Belicza <david@belicza.com>
*/

declare(strict_types=1);
4 changes: 2 additions & 2 deletions src/Tool/StopWords/English.php
@@ -2,9 +2,9 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author David Belicza <87.bdavid@gmail.com>
* @author David Belicza <david@belicza.com>
*/

declare(strict_types=1);
2 changes: 1 addition & 1 deletion src/Tool/StopWords/French.php
@@ -2,7 +2,7 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author Syndesi <github.com/Syndesi>
*/
2 changes: 1 addition & 1 deletion src/Tool/StopWords/German.php
@@ -2,7 +2,7 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author Syndesi <github.com/Syndesi>
*/
2 changes: 1 addition & 1 deletion src/Tool/StopWords/Norwegian.php
@@ -2,7 +2,7 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author Syndesi <github.com/Syndesi>
*/
4 changes: 2 additions & 2 deletions src/Tool/StopWords/Russian.php
@@ -2,9 +2,9 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author David Belicza <87.bdavid@gmail.com>
* @author David Belicza <david@belicza.com>
* @author Andrey Astashov <mvc.aaa@gmail.com> (Russian StopWords)
*/

2 changes: 1 addition & 1 deletion src/Tool/StopWords/Spanish.php
@@ -2,7 +2,7 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author Syndesi <github.com/Syndesi>
*/
4 changes: 2 additions & 2 deletions src/Tool/StopWords/StopWordsAbstract.php
@@ -2,9 +2,9 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author David Belicza <87.bdavid@gmail.com>
* @author David Belicza <david@belicza.com>
*/

declare(strict_types=1);
2 changes: 1 addition & 1 deletion src/Tool/StopWords/Turkish.php
@@ -2,7 +2,7 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author Sezer Fidancı <github.com/SezerFidanci>
*/
4 changes: 2 additions & 2 deletions src/Tool/Summarize.php
@@ -2,9 +2,9 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author David Belicza <87.bdavid@gmail.com>
* @author David Belicza <david@belicza.com>
*/

declare(strict_types=1);
4 changes: 2 additions & 2 deletions src/Tool/Text.php
@@ -2,9 +2,9 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author David Belicza <87.bdavid@gmail.com>
* @author David Belicza <david@belicza.com>
*/

declare(strict_types=1);
4 changes: 2 additions & 2 deletions tests/TextRankFacadeTest.php
@@ -2,9 +2,9 @@
/**
* PHP Science TextRank (http://php.science/)
*
* @see https://github.com/doveid/php-science-textrank
* @see https://github.com/DavidBelicza/PHP-Science-TextRank
* @license https://opensource.org/licenses/MIT the MIT License
* @author David Belicza <87.bdavid@gmail.com>
* @author David Belicza <david@belicza.com>
*/

declare(strict_types=1);