Some functions don't have code, instead have "// method implementation..." #53

Closed
dangmanhtruong1995 opened this issue Dec 5, 2024 · 5 comments · Fixed by #55

Comments

@dangmanhtruong1995

Hi, thanks a lot for this repo! However, I have noticed that when I use it on TypeScript files, the code in some functions is replaced with "// method implementation...".
Do you know how to fix it? Thank you.

@fynnfluegge
Owner

Hi @dangmanhtruong1995, thanks for reporting! I think the LLM sometimes returns only the doc comment without the function body to save tokens. A quick fix would be to explicitly tell the LLM to also return the method body. A better fix would be to insert only the doc comment and tell the LLM NOT to respond with the method body.
I don't actively develop this anymore, but I'm still open to reviewing and merging PRs if they look good.

@dangmanhtruong1995
Author

Hi @fynnfluegge, I would love to open a PR for this. However, I'm not sure about the following: in languages like TypeScript, the doc comment usually sits above the function, separate from the code, so it can be handled cleanly. In languages like Python, the docstring sits right below the function definition and above the body, which might require special processing. Or can the Tree-sitter library handle this for us (I have never used it)? I hope you can give me some advice on this.
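
To make the Python case concrete, here is a minimal sketch of inserting only a docstring below a function definition while leaving the body untouched. It uses the standard-library `ast` module as a stand-in for Tree-sitter, so it only covers Python sources, and `insert_docstring` is a hypothetical helper name, not something from the repo. It assumes the function body starts on its own line and that the docstring text contains no triple quotes.

```python
import ast


def insert_docstring(source: str, func_name: str, docstring: str) -> str:
    """Insert `docstring` as the first statement of the function `func_name`.

    Hypothetical helper for illustration; returns `source` unchanged when the
    function is missing, already documented, or written on a single line.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        is_function = isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef))
        if not is_function or node.name != func_name:
            continue
        if ast.get_docstring(node) is not None:
            return source  # already documented, nothing to insert
        first_stmt = node.body[0]
        if first_stmt.lineno == node.lineno:
            return source  # one-line function such as "def f(): return 1"
        lines = source.split("\n")
        body_line = lines[first_stmt.lineno - 1]
        # Reuse the indentation of the first body statement for the docstring.
        indent = body_line[: len(body_line) - len(body_line.lstrip())]
        doc_lines = [indent + '"""']
        doc_lines += [indent + line if line else line for line in docstring.split("\n")]
        doc_lines += [indent + '"""']
        # Insert just above the first body statement, i.e. right below the signature.
        lines[first_stmt.lineno - 1 : first_stmt.lineno - 1] = doc_lines
        return "\n".join(lines)
    return source
```

The same idea should carry over to a Tree-sitter based approach: locate the first statement of the function body and insert before it, rather than replacing the whole method.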

@fynnfluegge
Owner

fynnfluegge commented Dec 9, 2024

Hi @dangmanhtruong1995, thanks for tackling it! True, it requires different handling then. At the moment, the whole method is replaced together with the documentation.

This is done in https://github.com/fynnfluegge/doc-comments-ai/blob/main/doc_comments_ai/utils.py:

def write_code_snippet_to_file(
    file_path: str, original_code: str, modified_code: str
) -> None:
    """
    Writes a modified version of a code snippet to a file.

    Args:
        file_path (str): The path of the file to write to.
        original_code (str): The original code snippet to be replaced.
        modified_code (str): The modified code snippet.
    """
    with open(file_path, "r", encoding="utf-8") as file:
        file_content = file.read()
        start_pos = file_content.find(original_code)
        if start_pos != -1:
            end_pos = start_pos + len(original_code)
            indentation = file_content[:start_pos].split("\n")[-1]
            modeified_lines = modified_code.split("\n")
            indented_modified_lines = [indentation + line for line in modeified_lines]
            indented_modified_code = "\n".join(indented_modified_lines)
            modified_content = (
                file_content[:start_pos]
                + indented_modified_code
                + file_content[end_pos:]
            )
            with open(file_path, "w", encoding="utf-8") as file:
                file.write(modified_content)

So either this logic can be changed to insert only the documentation, taking the different languages into account, or, as a quick fix, the prompt in https://github.com/fynnfluegge/doc-comments-ai/blob/main/doc_comments_ai/llm.py can be modified.
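
As a rough illustration of the first option for languages where the doc comment sits above the function (TypeScript, for example), the sketch below inserts only the comment and leaves the original code untouched. `insert_doc_comment` is a hypothetical name, and the sketch works on a string rather than a file so it is easy to try out. It is not the repo's implementation.

```python
def insert_doc_comment(file_content: str, original_code: str, doc_comment: str) -> str:
    """Return `file_content` with `doc_comment` placed directly above `original_code`.

    Hypothetical helper for illustration: the original snippet is never
    rewritten, so a truncated method body from the LLM cannot reach the file.
    """
    start_pos = file_content.find(original_code)
    if start_pos == -1:
        return file_content  # snippet not found, leave the content unchanged
    # Start of the line on which the snippet begins.
    line_start = file_content.rfind("\n", 0, start_pos) + 1
    prefix = file_content[line_start:start_pos]
    # Leading whitespace of that line becomes the indentation of the comment.
    indentation = prefix[: len(prefix) - len(prefix.lstrip())]
    comment = "\n".join(
        indentation + line if line else line for line in doc_comment.split("\n")
    )
    return file_content[:line_start] + comment + "\n" + file_content[line_start:]
```

With this approach the prompt could ask the LLM for the doc comment only, which also saves output tokens.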

@anshulthakur
Contributor

Hi, I think my PR #55 should fix this issue. Sometimes the LLM returns more than one markdown code block: the first is the one @dangmanhtruong1995 mentioned, and the subsequent one is the actual code block. The regex does not capture the second code block because of its lookahead and non-greedy match. Constraining the LLM to return a single code block avoids the problem.

I've tested this on Python code, where I was seeing the same symptoms. Since the nature of the problem is the same, it may work for TypeScript too.
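
For anyone who wants to reproduce the failure mode, here is a small sketch. The pattern below is a simplified stand-in, not the regex used in the repo, and `first_block` / `last_block` are hypothetical names. It shows how a single non-greedy search stops at the first fenced block, and how collecting all blocks and taking the last one would be a defensive alternative to the prompt change in the PR.

```python
import re
from typing import Optional

FENCE = "`" * 3  # three backticks, built this way to keep the sketch easy to embed

# Simplified stand-in pattern: opening fence with optional language tag,
# then a non-greedy capture up to the next closing fence.
BLOCK_RE = re.compile(FENCE + r"[^\n]*\n(.*?)" + FENCE, re.DOTALL)


def first_block(response: str) -> Optional[str]:
    """Return the first fenced block, which is all a single search ever sees."""
    match = BLOCK_RE.search(response)
    return match.group(1) if match else None


def last_block(response: str) -> Optional[str]:
    """Return the last fenced block (heuristic: the full code tends to come last)."""
    blocks = BLOCK_RE.findall(response)
    return blocks[-1] if blocks else None
```

Picking the last block is only a heuristic based on the behavior described above. Asking the LLM for exactly one code block, as the PR does, removes the ambiguity at the source.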

@fynnfluegge
Owner

@anshulthakur thank you!
