From 41a97c8fef5a3ead4f4fae7273d7391d7e58bc2d Mon Sep 17 00:00:00 2001 From: Titusz Pan Date: Sat, 19 Oct 2024 18:26:22 +0200 Subject: [PATCH] Add coding conventions --- CONVENTIONS.md | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) create mode 100644 CONVENTIONS.md diff --git a/CONVENTIONS.md b/CONVENTIONS.md new file mode 100644 index 0000000..a464145 --- /dev/null +++ b/CONVENTIONS.md @@ -0,0 +1,25 @@ +# Coding Convetions + +- Write pragmatic, easily testable, and performant code! +- Prefer short and pure functions where possible! +- Keep the number of function arguments as low as possible! +- DonĀ“t use nested functions! +- Write concise and to-the-point docstrings for all functions! +- Use type comments style (PEP 484) instead of function annotations! +- Always add a correct PEP 484 style type comment as the first line after the function definition! +- Use built-in collection types as generic types for annotations (PEP 585)! +- Use the | (pipe) operator for writing union types (PEP 604)! + +Example function with type annotations and docstring: + +```python +def tokenize_chunks(chunks, max_len=None): + # type: (list[str], int|None) -> dict + """ + Tokenize text chunks into model-compatible formats. + + :param chunks: Text chunks to tokenize. + :param max_len: Truncates chunks above max_len characters + :return: Dictionary of tokenized data including input IDs, attention masks, and type IDs. + """ +```