Initial release (v0.1)
CodePurble committed Oct 8, 2022
1 parent c80ceed commit 1394e8f
Showing 14 changed files with 598 additions and 1 deletion.
93 changes: 92 additions & 1 deletion README.md
@@ -1,2 +1,93 @@
# bibextract
Extract bibliography entries from LaTeX sources
Extract bibliography entries from a reference BibTeX file using citations from
LaTeX sources, to create project-specific BibTeX files.

> For now, in order to work properly, both `bibx` and `ext.sh` **NEED** to be
> in the same directory. This is expected to change in a future release.
## Dependencies
* Python >= 3.8
* [`python-bibtexparser`](https://github.com/sciunto-org/python-bibtexparser)
    * [Install](https://bibtexparser.readthedocs.io/en/master/install.html#how-to-install)

## Installation
Once you have installed the dependencies, clone this repository:
```sh
git clone https://github.com/CodePurble/bibextract.git
```

Run `install.sh`. The files will be installed to `/usr/local/bin`. This can be
changed by modifying the `PREFIX` variable in `install.sh`.
```sh
./install.sh
```

> **NOTE**: To uninstall, use the provided `uninstall.sh` script. If you edited
> `PREFIX` while installing, set the same value in `uninstall.sh` before
> running it.
### Updating an existing install
First, navigate to where you initially cloned this repo and pull down the latest changes:
```sh
cd path/to/repo
git pull origin master
```

Then, rerun `install.sh`:
```sh
./install.sh
```

## Usage
Check the help text:
```sh
bibx -h
```

Also read the manpage:
```sh
man bibx
```

### Example
Assume that there are two LaTeX files, `a.tex` and `b.tex`, that the reference
BibTeX file is `g.bib`, and that the output should be written to `refs.bib`.
The command to achieve this would be:
```sh
bibx -b g.bib -o refs.bib a.tex b.tex

# A variation: let the shell expand every .tex file in the current directory
bibx -b g.bib -o refs.bib *.tex
```
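
For intuition, the citation-scanning step can be approximated in a few lines of Python. This is only a sketch under stated assumptions: the real scanning is done by `ext.sh`, which is not shown here, and the `CITE_RE` pattern and the `find_cite_keys` name are hypothetical. The pattern covers common `\cite`-style commands with optional arguments. It does not skip commented-out lines, and it treats `\nocite{*}` as a label named `*`.

```python
import re

# Hypothetical pattern (an assumption, not taken from ext.sh): matches \cite,
# \citep, \citet*, \parencite, etc., with up to two optional [...] arguments.
CITE_RE = re.compile(
    r"\\[A-Za-z]*cite[A-Za-z]*\*?(?:\[[^\]]*\]){0,2}\{([^}]*)\}"
)


def find_cite_keys(tex):
    """Return the set of BibTeX labels cited in a LaTeX source string."""
    keys = set()
    for match in CITE_RE.finditer(tex):
        # A single \cite{...} may hold several comma-separated labels.
        for key in match.group(1).split(","):
            key = key.strip()
            if key:
                keys.add(key)
    return keys


if __name__ == "__main__":
    sample = r"See \cite{knuth84, lamport94} and \citep[p.~3]{turing36}."
    print(sorted(find_cite_keys(sample)))
    # prints ['knuth84', 'lamport94', 'turing36']
```

The resulting set plays the same role as the labels that `bibx` looks up in the reference BibTeX file.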

## Bugs
If you find a bug or are facing some problem, please [open an
issue](https://github.com/CodePurble/bibextract/issues/new/choose).

## Contributing
Contributions are welcome: features, documentation, whatever! Fork this repo
and create a new branch for your changes; please do not commit directly to the
master branch. Then create a pull request.

Take a look at the [TODO](./TOOD.md) or
[issues](https://github.com/CodePurble/bibextract/issues) for some inspiration!

If you are updating the manpage, you will need
[`pandoc`](https://pandoc.org/index.html) to generate it from
[`bibx.1.md`](./bibx.1.md). Use the following command to generate the manpage:
```sh
pandoc --standalone --to man bibx.1.md -o bibx.1
```

### Tests
If contributing to the code, please run the following test after your edits:
```sh
./bibx -b test/global.bib -o test/ext.bib test/*.tex
diff test/ext.bib test/ext-golden.bib
```
The `diff` command must print **nothing**. This means that the output is as
expected for the given inputs.
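
The same check can be scripted. What follows is a minimal sketch using only the Python standard library; the `golden_diff` helper is hypothetical and not part of this repository. It returns the unified diff between two files as a list of lines, so an empty list corresponds to `diff` printing nothing.

```python
import difflib
from pathlib import Path


def golden_diff(actual_path, golden_path):
    """Return unified-diff lines between two text files ([] if identical)."""
    actual = Path(actual_path).read_text().splitlines(keepends=True)
    golden = Path(golden_path).read_text().splitlines(keepends=True)
    return list(
        difflib.unified_diff(
            golden,
            actual,
            fromfile=str(golden_path),
            tofile=str(actual_path),
        )
    )
```

After running the `bibx` command above, `golden_diff("test/ext.bib", "test/ext-golden.bib")` should return an empty list.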

## License
Free use of this software is granted under the terms of the [GPLv3
License](https://github.com/CodePurble/bibextract/blob/master/LICENSE).
3 changes: 3 additions & 0 deletions TOOD.md
@@ -0,0 +1,3 @@
- [ ] Remove dependency on `ext.sh`
- [ ] Support multiple reference BibTeX files
- [ ] Support a directory of reference BibTeX files
103 changes: 103 additions & 0 deletions bibx
@@ -0,0 +1,103 @@
#!/usr/bin/env python3

import argparse
import subprocess

import bibtexparser
from bibtexparser.bibdatabase import BibDatabase
from bibtexparser.bparser import BibTexParser
from bibtexparser.bwriter import BibTexWriter


def main():
    parser = argparse.ArgumentParser(
        description="""Extract bibliography entries from LaTeX sources
        using a reference BibTeX file""",
        epilog="""For reporting bugs, giving suggestions and contributing
        to this project, visit https://github.com/CodePurble/bibextract"""
    )
    parser.add_argument("-b",
                        "--bib",
                        required=True,
                        help="BibTeX file to look for entries in"
                        )
    parser.add_argument("-o",
                        "--output",
                        default="ext.bib",
                        help="""File to which the extracted entries are
                        written. Default is './ext.bib'""",
                        )
    parser.add_argument("-e",
                        "--entries",
                        help="""Text file containing entries to extract from
                        the BibTeX file; each line in the file must have ONE
                        BibTeX label"""
                        )
    parser.add_argument("-q",
                        "--quiet",
                        default=False,
                        help="Suppress output to stdout",
                        action="store_true"
                        )
    parser.add_argument("files",
                        help="""LaTeX files to scan for citations whose BibTeX
                        entries need to be extracted""",
                        action="extend",
                        nargs="+"
                        )
    parser.add_argument("-i",
                        "--indent",
                        help="""Indentation to be used in the generated BibTeX
                        file. Default is four spaces""",
                        default="    ",
                        )

    args = parser.parse_args()

    # Scan LaTeX files for citations and generate set containing BibTeX entry
    # labels
    e_set = set()
    for file in args.files:
        # No shell is involved when an argument list is passed, so the file
        # name must not be shell-quoted
        output = subprocess.run(['./ext.sh', file], capture_output=True)
        if output.returncode == 0:
            # Skip empty lines so that '' never ends up as a label
            e_set.update(
                label for label in output.stdout.decode().splitlines() if label
            )

    # Merge custom BibTeX entry labels into main set if any
    if args.entries is not None:
        with open(args.entries, 'r') as en:
            for line in en:
                # Add the whole label; set.union(str) would add its
                # individual characters instead
                label = line.strip()
                if label:
                    e_set.add(label)

    # Named bib_parser so that it does not shadow the argparse parser above
    bib_parser = BibTexParser()
    bib_parser.ignore_nonstandard_types = False
    with open(args.bib) as bibtex_file:
        bib_database = bibtexparser.load(bibtex_file, bib_parser)

    e_dict = bib_database.entries_dict
    out_db = BibDatabase()

    entrycount = 0
    found = 0
    for entry in e_set:
        entrycount += 1
        if entry in e_dict:
            if not args.quiet:
                print(f"Found: {entry}")
            found += 1
            out_db.entries.append(e_dict[entry])
        else:
            if not args.quiet:
                print(f"Not found: {entry}")

    writer = BibTexWriter()
    writer.indent = args.indent
    writer.add_trailing_comma = True
    with open(args.output, 'w') as outfile:
        bibtexparser.dump(out_db, outfile, writer)

    if not args.quiet:
        print(f"\nFound {found}/{entrycount} entries")


if __name__ == "__main__":
    main()
128 changes: 128 additions & 0 deletions bibx.1
@@ -0,0 +1,128 @@
.\" Automatically generated by Pandoc 2.19.2
.\"
.\" Define V font for inline verbatim, using C font in formats
.\" that render this, and otherwise B font.
.ie "\f[CB]x\f[]"x" \{\
. ftr V B
. ftr VI BI
. ftr VB B
. ftr VBI BI
.\}
.el \{\
. ftr V CR
. ftr VI CI
. ftr VB CB
. ftr VBI CBI
.\}
.TH "BIBX" "1" "" "Version 1.0" "bibx Command Documentation"
.hy
.SH NAME
.PP
bibx \[em] Extract bibliography entries from a reference BibTeX file
using citations from LaTeX sources, to create project-specific BibTeX
files.
.SH SYNOPSIS
.PP
bibx [-h] -b BIB [-o OUTPUT] [-e ENTRIES] [-q] [-i INDENT] files [files
\&...]
.SH DESCRIPTION
.PP
This program extracts citations made in LaTeX source files and generates
a BibTeX file with entries that are found.
In order to generate the BibTeX file, a single \[lq]global\[rq] BibTeX
file is required; it acts as the database that is searched for matching
entries.
Output is written to `ext.bib' by default.
Requires at least one LaTeX source.
.PP
To add entries other than those cited in the LaTeX sources to the output
file, pass a file of labels with the -e option.
The file must contain a single BibTeX label on each line.
.PP
For example:
.IP
.nf
\f[C]
label1
label2
label3
\f[R]
.fi
.SH OPTIONS
.PP
-h, --help
.IP
.nf
\f[C]
Show help and exit
\f[R]
.fi
.PP
-b BIB, --bib BIB
.IP
.nf
\f[C]
Specify global BibTeX file to look for entries in
\f[R]
.fi
.PP
-o OUTPUT, --output OUTPUT
.IP
.nf
\f[C]
Specify file to write the BibTeX output to. Defaults to \[aq]./ext.bib\[aq].
\f[R]
.fi
.PP
-e ENTRIES, --entries ENTRIES
.IP
.nf
\f[C]
Specify file containing extra entry names to be searched for and included in
the final output.
\f[R]
.fi
.PP
-i INDENT, --indent INDENT
.IP
.nf
\f[C]
Specify the character(s) to be used to indent the BibTeX entries in the
final output. Defaults to four spaces. Example: -i \[dq]  \[dq] (use two spaces).
\f[R]
.fi
.PP
-q, --quiet
.IP
.nf
\f[C]
Suppress output to stdout
\f[R]
.fi
.SH EXAMPLE
.PP
Assume that there are two LaTeX files: \f[V]a.tex\f[R] and
\f[V]b.tex\f[R], the reference BibTeX file is \f[V]g.bib\f[R] and we
want the output file to be \f[V]refs.bib\f[R].
.PP
The command to achieve this would be:
.IP
.nf
\f[C]
bibx -b g.bib -o refs.bib a.tex b.tex
\f[R]
.fi
.SH BUGS
.PP
See GitHub issues: https://github.com/CodePurble/bibextract/issues
.SH AUTHOR
.PP
Ramprakash C (https://github.com/CodePurble)
.SH COPYRIGHT
.PP
Free use of this software is granted under the terms of the GPLv3
License.
.PP
See: https://github.com/CodePurble/bibextract/blob/master/LICENSE
.SH SEE ALSO
.PP
bibtex(1), latex(1)