Commit
1 parent c80ceed
commit 1394e8f
Showing 14 changed files with 598 additions and 1 deletion.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,2 +1,93 @@ | ||
# bibextract | ||
Extract bibliography entries from LaTeX sources | ||
Extract bibliography entries from a reference BibTeX file using citations from | ||
LaTeX sources, to create project-specific BibTeX files. | ||
|
||
> For now, in order to work properly, both `bibx` and `ext.sh` **NEED** to be | ||
> in the same directory. This is expected to change in a future release. | ||
## Dependencies | ||
* Python >= 3.8 (`bibx` uses the `:=` operator and argparse's `extend` action) | ||
* [`python-bibtexparser`](https://github.com/sciunto-org/python-bibtexparser) | ||
* [Install](https://bibtexparser.readthedocs.io/en/master/install.html#how-to-install) | ||
|
||
## Installation | ||
Once you have installed the dependencies, clone this repository | ||
```sh | ||
git clone https://github.com/CodePurble/bibextract.git | ||
``` | ||
|
||
Run `install.sh`. The files will be installed to `/usr/local/bin`. This can be | ||
changed by modifying the `PREFIX` variable in `install.sh`. | ||
```sh | ||
./install.sh | ||
``` | ||
|
||
> **NOTE**: To uninstall, use the provided `uninstall.sh` script. If you changed | ||
> `PREFIX` when installing, set the same value in `uninstall.sh` before running | ||
> it. | ||
### Updating an existing install | ||
First, navigate to where you initially cloned this repo and pull down the latest changes: | ||
```sh | ||
cd path/to/repo | ||
git pull origin master | ||
``` | ||
|
||
Then, rerun `install.sh`: | ||
```sh | ||
./install.sh | ||
``` | ||
|
||
## Usage | ||
Check the help-text: | ||
```sh | ||
bibx -h | ||
``` | ||
|
||
Also read the manpage: | ||
```sh | ||
man bibx | ||
``` | ||
|
||
### Example | ||
Assume that there are two LaTeX files: `a.tex` and `b.tex`, the reference | ||
BibTeX file is `g.bib` and we want the output file to be `refs.bib`. The | ||
command to achieve this would be: | ||
```sh | ||
bibx -b g.bib -o refs.bib a.tex b.tex | ||
|
||
# A variation | ||
bibx -b g.bib -o refs.bib *.tex | ||
``` | ||
|
||
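Conceptually, the command above keeps only those entries of `g.bib` whose labels are cited in the LaTeX files, and reports the labels it could not find. The following is a minimal sketch of that selection step, not the actual `bibx` code: it uses plain dictionaries instead of `bibtexparser` objects, and the function name `select_entries` and the sample data are hypothetical.

```python
def select_entries(reference, cited):
    """Split cited labels into entries found in `reference` and missing labels.

    `reference` maps BibTeX labels to entry records (plain dicts here);
    `cited` is any iterable of labels collected from the LaTeX sources.
    """
    found, missing = [], []
    # Sorting is only there to make the result deterministic.
    for label in sorted(set(cited)):
        if label in reference:
            found.append(reference[label])
        else:
            missing.append(label)
    return found, missing


# Hypothetical data standing in for g.bib and the labels cited in a.tex and b.tex.
reference = {
    "knuth1984": {"ID": "knuth1984", "ENTRYTYPE": "book", "title": "The TeXbook"},
    "lamport1994": {"ID": "lamport1994", "ENTRYTYPE": "book", "title": "LaTeX"},
}
found, missing = select_entries(reference, ["knuth1984", "nosuchkey"])
print(f"Found {len(found)}/{len(found) + len(missing)} entries")
```

Labels that are cited but absent from the reference file are skipped rather than treated as errors, which mirrors the "Found"/"Not found" messages that `bibx` prints.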
## Bugs | ||
If you find a bug or are facing some problem, please [open an | ||
issue](https://github.com/CodePurble/bibextract/issues/new/choose). | ||
|
||
## Contributing | ||
Contributions are welcome: features, documentation, whatever! Fork this repo | ||
and create a new branch for your changes. Please do not commit directly to the | ||
master branch. Then create a pull request. | ||
|
||
Take a look at the [TODO](./TODO.md) or | ||
[issues](https://github.com/CodePurble/bibextract/issues) for some inspiration! | ||
|
||
If you are updating the manpage, you will need | ||
[`pandoc`](https://pandoc.org/index.html) to generate it from | ||
[`bibx.1.md`](./bibx.1.md). Use the following command to generate the manpage: | ||
```sh | ||
pandoc --standalone --to man bibx.1.md -o bibx.1 | ||
``` | ||
|
||
### Tests | ||
If contributing to the code, please run the following test after your edits: | ||
```sh | ||
./bibx -b test/global.bib -o test/ext.bib test/*.tex | ||
diff test/ext.bib test/ext-golden.bib | ||
``` | ||
The `diff` command must print **NOTHING**. This means that the generated output | ||
matches the expected output for the given inputs. | ||
|
||
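The golden-file check described under Tests can also be scripted. The sketch below is one possible way to do it with the standard library; the helper name `matches_golden` is hypothetical, and temporary files stand in for `test/ext.bib` and `test/ext-golden.bib` so that the example runs on its own.

```python
import filecmp
import tempfile
from pathlib import Path


def matches_golden(generated, golden):
    """Return True when both files have identical contents."""
    # shallow=False forces a content comparison instead of comparing
    # only the os.stat() signatures of the two files.
    return filecmp.cmp(generated, golden, shallow=False)


# Stand-in files; in the repository these would be test/ext.bib and
# test/ext-golden.bib after running bibx.
with tempfile.TemporaryDirectory() as tmp:
    golden = Path(tmp) / "ext-golden.bib"
    generated = Path(tmp) / "ext.bib"
    golden.write_text("@book{a,\n    title = {A},\n}\n")
    generated.write_text(golden.read_text())
    print(matches_golden(generated, golden))
```

A result of `True` corresponds to `diff` printing nothing.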
## License | ||
Free use of this software is granted under the terms of the [GPLv3 | ||
License](https://github.com/CodePurble/bibextract/blob/master/LICENSE). |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
- [ ] Remove dependency on `ext.sh` | ||
- [ ] Support multiple reference BibTeX files | ||
- [ ] Support a directory of reference BibTeX files |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,103 @@ | ||
#!/usr/bin/env python3 | ||
|
||
import argparse | ||
import bibtexparser | ||
import subprocess | ||
import shlex | ||
from bibtexparser.bparser import BibTexParser, BibDatabase | ||
from bibtexparser.bwriter import BibTexWriter | ||
|
||
def main(): | ||
parser = argparse.ArgumentParser( | ||
description="""Extract bibliography entries from LaTeX sources | ||
using a reference BibTeX file""", | ||
epilog="""For reporting bugs, giving suggestions and contributing | ||
to this project, visit https://github.com/CodePurble/bibextract""" | ||
) | ||
parser.add_argument("-b", | ||
"--bib", | ||
required=True, | ||
help="BibTeX file to look for entries in" | ||
) | ||
parser.add_argument("-o", | ||
"--output", | ||
default="ext.bib", | ||
help="""File to which the extracted entries are output | ||
to. Default is './ext.bib'""", | ||
) | ||
parser.add_argument("-e", | ||
"--entries", | ||
help="""Text file containing entries to extract from | ||
the BibTeX file, each line in the file must have ONE | ||
BibTeX label""" | ||
) | ||
parser.add_argument("-q", | ||
"--quiet", | ||
default=False, | ||
help="Suppress output to stdout", | ||
action="store_true" | ||
) | ||
parser.add_argument("files", | ||
help="""LaTeX files to scan for citations whose BibTeX | ||
entries need to be extracted""", | ||
action="extend", | ||
nargs="+" | ||
) | ||
parser.add_argument("-i", | ||
"--indent", | ||
help="""Indentation to be used in the generated BibTeX | ||
file. Default is four spaces""", | ||
default="    ", | ||
) | ||
|
||
args = parser.parse_args() | ||
|
||
# Scan LaTeX files for citations and generate set containing BibTeX entry | ||
# labels | ||
e_set = set() | ||
for file in args.files: | ||
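        # Pass the path as-is: no shell is involved, so shlex.quote() would | ||
        # add literal quote characters to the file name | ||
        output = subprocess.run(['./ext.sh', file], | ||
                                capture_output=True | ||
                                ) | ||
        if output.returncode == 0: | ||
            # Skip blank lines so that empty output does not add '' as a label | ||
            e_set.update(label.strip() for label in output.stdout.decode().splitlines() if label.strip()) | ||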
|
||
# Merge custom BibTeX entry labels into main set if any | ||
    if args.entries is not None: | ||
        with open(args.entries, 'r') as en: | ||
            for line in en: | ||
                label = line.strip() | ||
                # set.union(str) would add single characters, so add the whole label | ||
                if label: | ||
                    e_set.add(label) | ||
|
||
parser = BibTexParser() | ||
parser.ignore_nonstandard_types = False | ||
with open(args.bib) as bibtex_file: | ||
bib_database = bibtexparser.load(bibtex_file, parser) | ||
|
||
e_dict = bib_database.entries_dict | ||
out_db = BibDatabase() | ||
|
||
entrycount = 0 | ||
found = 0 | ||
for entry in e_set: | ||
entrycount += 1 | ||
if entry in e_dict.keys(): | ||
if not args.quiet: | ||
print(f"Found: {entry}") | ||
found += 1 | ||
out_db.entries.append(e_dict[entry]) | ||
else: | ||
if not args.quiet: | ||
print(f"Not found: {entry}") | ||
|
||
|
||
writer = BibTexWriter() | ||
writer.indent = args.indent | ||
writer.add_trailing_comma = True | ||
with open(args.output, 'w') as outfile: | ||
bibtexparser.dump(out_db, outfile, writer) | ||
|
||
if not args.quiet: | ||
print(f"\nFound {found}/{entrycount} entries") | ||
|
||
if __name__ == "__main__": | ||
main() |
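The script above delegates citation scanning to `ext.sh`, which is not shown in this diff. As a rough illustration of what such a scan involves, here is a minimal pure-Python sketch. It is an assumption about the general approach, not a port of `ext.sh`: the regular expression, the name `cite_keys`, and the sample text are all hypothetical, and commented-out citations are not handled.

```python
import re

# Assumption: citations look like \cite{key1,key2}, possibly with a prefix or
# suffix in the command name (\citep, \citet*, \textcite) and optional [...]
# arguments before the braces.
CITE_RE = re.compile(r"\\[a-zA-Z]*cite[a-zA-Z]*\*?\s*(?:\[[^\]]*\]\s*)*\{([^}]*)\}")


def cite_keys(latex_source):
    """Return the set of BibTeX labels cited in a LaTeX source string."""
    keys = set()
    for match in CITE_RE.finditer(latex_source):
        for key in match.group(1).split(","):
            key = key.strip()
            # "\nocite{*}" means "everything", which is not a real label.
            if key and key != "*":
                keys.add(key)
    return keys


sample = r"See \cite{knuth1984, lamport1994} and \citep[p.~3]{mittelbach2004}."
print(sorted(cite_keys(sample)))
```

Returning a set matches how the script collects labels in `e_set`, so repeated citations of the same label are counted once.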
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,128 @@ | ||
.\" Automatically generated by Pandoc 2.19.2 | ||
.\" | ||
.\" Define V font for inline verbatim, using C font in formats | ||
.\" that render this, and otherwise B font. | ||
.ie "\f[CB]x\f[]"x" \{\ | ||
. ftr V B | ||
. ftr VI BI | ||
. ftr VB B | ||
. ftr VBI BI | ||
.\} | ||
.el \{\ | ||
. ftr V CR | ||
. ftr VI CI | ||
. ftr VB CB | ||
. ftr VBI CBI | ||
.\} | ||
.TH "BIBX" "1" "" "Version 1.0" "bibx Command Documentation" | ||
.hy | ||
.SH NAME | ||
.PP | ||
bibx \[em] Extract bibliography entries from a reference BibTeX file | ||
using citations from LaTeX sources, to create project-specific BibTeX | ||
files. | ||
.SH SYNOPSIS | ||
.PP | ||
bibx [-h] -b BIB [-o OUTPUT] [-e ENTRIES] [-q] [-i INDENT] files [files | ||
\&...] | ||
.SH DESCRIPTION | ||
.PP | ||
This program extracts citations made in LaTeX source files and generates | ||
a BibTeX file with entries that are found. | ||
In order to generate the BibTeX file, a single \[lq]global\[rq] BibTeX | ||
file is required; it acts as the database that is searched for matching entries. | ||
Output is written to `ext.bib' by default. | ||
Requires at least one LaTeX source. | ||
.PP | ||
To add entries other than those cited in the LaTeX sources to the output | ||
file, pass a file of labels with the -e option. | ||
This file must contain a single BibTeX label on each line. | ||
.PP | ||
For example: | ||
.IP | ||
.nf | ||
\f[C] | ||
label1 | ||
label2 | ||
label3 | ||
\f[R] | ||
.fi | ||
.SH OPTIONS | ||
.PP | ||
-h, --help | ||
.IP | ||
.nf | ||
\f[C] | ||
Show help and exit | ||
\f[R] | ||
.fi | ||
.PP | ||
-b BIB, --bib BIB | ||
.IP | ||
.nf | ||
\f[C] | ||
Specify global BibTeX file to look for entries in | ||
\f[R] | ||
.fi | ||
.PP | ||
-o OUTPUT, --output OUTPUT | ||
.IP | ||
.nf | ||
\f[C] | ||
Specify file to write the BibTeX output to. Defaults to \[aq]./ext.bib\[aq]. | ||
\f[R] | ||
.fi | ||
.PP | ||
-e ENTRIES, --entries ENTRIES | ||
.IP | ||
.nf | ||
\f[C] | ||
Specify file containing extra entry names to be searched for and included in | ||
the final output. | ||
\f[R] | ||
.fi | ||
.PP | ||
-i INDENT, --indent INDENT | ||
.IP | ||
.nf | ||
\f[C] | ||
Specify the character(s) to be used to indent the BibTeX entries in the | ||
final output. Defaults to four spaces. Example: -i \[dq]  \[dq] (use two spaces). | ||
\f[R] | ||
.fi | ||
.PP | ||
-q, --quiet | ||
.IP | ||
.nf | ||
\f[C] | ||
Suppress output to stdout | ||
\f[R] | ||
.fi | ||
.SH EXAMPLE | ||
.PP | ||
Assume that there are two LaTeX files: \f[V]a.tex\f[R] and | ||
\f[V]b.tex\f[R], the reference BibTeX file is \f[V]g.bib\f[R] and we | ||
want the output file to be \f[V]refs.bib\f[R]. | ||
.PP | ||
The command to achieve this would be: | ||
.IP | ||
.nf | ||
\f[C] | ||
bibx -b g.bib -o refs.bib a.tex b.tex | ||
\f[R] | ||
.fi | ||
.SH BUGS | ||
.PP | ||
See GitHub issues: https://github.com/CodePurble/bibextract/issues | ||
.SH AUTHOR | ||
.PP | ||
Ramprakash C (https://github.com/CodePurble) | ||
.SH COPYRIGHT | ||
.PP | ||
Free use of this software is granted under the terms of the GPLv3 | ||
License. | ||
.PP | ||
See: https://github.com/CodePurble/bibextract/blob/master/LICENSE | ||
.SH SEE ALSO | ||
.PP | ||
bibtex(1), latex(1) |
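The label file accepted by the -e option, as described under DESCRIPTION, holds one BibTeX label per line. A minimal sketch of reading such a file follows; the helper name `read_labels` is hypothetical, and an in-memory `io.StringIO` stands in for a real file so that the example is self-contained.

```python
import io


def read_labels(lines):
    """Collect one BibTeX label per line, ignoring blank lines."""
    labels = set()
    for line in lines:
        label = line.strip()  # drop the trailing newline and stray spaces
        if label:
            labels.add(label)
    return labels


# Stand-in for an entries file such as the one shown under DESCRIPTION.
entries_file = io.StringIO("label1\nlabel2\n\nlabel3\n")
print(sorted(read_labels(entries_file)))
```

Each line is added as a whole label, so a label is never split into individual characters.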