[TRANSLATION] Add 7_inference and 8_agents EN-JA translation #194

Open
wants to merge 7 commits into
base: main

eltociear
Contributor

@eltociear eltociear commented Jan 22, 2025

December 2024 Student Submission

Module Completed

  • Module 1: Instruction Tuning
  • Module 2: Preference Alignment
  • Module 3: Parameter-efficient Fine-tuning
  • Module 4: Evaluation
  • Module 5: Vision-language Models
  • Module 6: Synthetic Datasets
  • Module 7: Inference
  • Module 8: Deployment

Notebooks Added/Modified

List any notebooks you've added or modified:

  • Added new example in module_name/student_examples/my_example.ipynb
  • Modified existing notebook with additional examples
  • Added documentation or comments

Checklist

  • I have read the module materials
  • My code runs without errors
  • I have pushed models and datasets to the huggingface hub
  • My PR is based on the december-2024 branch

Questions or Discussion Points

├── ja                     # Added in this PR
│   ├── 7_inference         
│   ├── 8_agents
│   └── ...
├── 7_inference   # Original directory       
└── 8_agents

Additional Notes

Any other information that might be helpful for reviewers:
The rest of the Japanese documentation has been added!

@burtenshaw
Collaborator

burtenshaw commented Jan 24, 2025

Nice work on getting the whole course translated! Let me know when it's ready for review and I'll find a Japanese speaking reviewer.

@eltociear
Contributor Author

@burtenshaw
Thank you! This content is ready for review!

@burtenshaw burtenshaw requested a review from hysts January 27, 2025 08:05
Collaborator

@hysts hysts left a comment

@eltociear
Here are a few things I noticed during a quick review.

  1. Some parts of the text are garbled. For example, https://github.com/huggingface/smol-course/pull/194/files#diff-d5b5978c093eca5d6a60365e44513fc144c11d5986666949091302ddfe455dcbR57
    This notebook also has garbled text:
    https://github.com/huggingface/smol-course/pull/194/files#diff-bb7f6ee82642beac48a2a4a84458982ba2c05be28a0f9b0a26aaea446c2069a0
    The garbled text is also seen in your other PRs. For example,

    | DPOトレーニング | 直接選好最適化を使用してモデルを��レーニングする方法を学ぶ | 🐢 AnthropicのHH-RLHFデータセットを使用してモデルをトレーニングする<br>🐕 独自の選好データセットを使用する<br>🦁 さまざまな選好データセットとモデルサイズで実験する | [ノートブック](./notebooks/dpo_finetuning_example.ipynb) | <a target="_blank" href="https://colab.research.google.com/github/huggingface/smol-course/blob/main/2_preference_alignment/notebooks/dpo_finetuning_example.ipynb"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a> |

  2. These lines from the original text are missing in the translation:

    ## Building Blocks of a Code Agent
    Code agents are built on specialized language models fine-tuned for code understanding. These models are augmented with development tools like linters, formatters, and compilers to interact with real-world environments. Through retrieval techniques, agents maintain contextual awareness by accessing documentation and code histories to align with organizational patterns and standards. Action-oriented functions enable agents to perform concrete tasks such as committing changes or initiating merge requests.

  3. This one concerns another PR of yours, "[TRANSLATION] Add 1_instruction_tuning and 2_preference_alignment EN-JA translation" (#176). The content of the two notebooks below is identical, except for garbled text in one of them:

    The former should be a translation of this notebook:
    https://github.com/huggingface/smol-course/blob/a5cc73e2e0a9df77d2c34369314431c94674a5dd/1_instruction_tuning/notebooks/chat_templates_example.ipynb

  4. It seems that you've also translated the input text for the models into Japanese in code blocks. But many of the models used in this course are not good at handling Japanese, which can result in nonsensical output. So, translating the input text itself might not be appropriate. Would it be possible to include the translation as a comment instead?
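
For point 1, a minimal sketch of how the garbled spots could be located before the next review round. It assumes the garbling shows up as the Unicode replacement character U+FFFD (as in the table row quoted above) or as its escaped form `\ufffd` inside notebook JSON. The function names and the file suffixes are illustrative and not part of the repository's tooling.

```python
from pathlib import Path

# U+FFFD is what a decoder emits for bytes it could not decode.
REPLACEMENT_CHAR = "\ufffd"
# Notebook JSON may store the same character as an escaped literal.
ESCAPED_FORM = "\\ufffd"


def find_garbled_lines(text: str) -> list[tuple[int, str]]:
    """Return (line_number, line) pairs for lines that contain U+FFFD."""
    return [
        (number, line)
        for number, line in enumerate(text.splitlines(), start=1)
        if REPLACEMENT_CHAR in line or ESCAPED_FORM in line.lower()
    ]


def scan_directory(
    root: str, suffixes: tuple[str, ...] = (".md", ".ipynb")
) -> dict[str, list[tuple[int, str]]]:
    """Map each affected file under root to its garbled lines."""
    results: dict[str, list[tuple[int, str]]] = {}
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            # errors="replace" turns undecodable bytes into U+FFFD,
            # so files that are not valid UTF-8 are reported as well.
            text = path.read_text(encoding="utf-8", errors="replace")
            hits = find_garbled_lines(text)
            if hits:
                results[str(path)] = hits
    return results
```

Calling `scan_directory("ja")` from the repository root would list every translated file that still contains a replacement character, together with the line numbers to fix. The same check could be run on the other translation PRs mentioned in point 1.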

@burtenshaw
Collaborator

Thanks for the review @hysts!

@eltociear Could you please respond to the comments across all units in this PR?

@eltociear
Contributor Author

@hysts @burtenshaw
Thank you!
Please give me a moment to check your comments and respond.

Member

@whitphx whitphx left a comment

Hi, thank you for the contribution!
In addition to @hysts's review above, let me point out the translations of proper nouns, to avoid confusion.

ja/7_inference/inference_pipeline.md (outdated, resolved)
ja/7_inference/text_generation_inference.md (outdated, resolved)

### 1️⃣ [取得エージェント](./retrieval_agents.md)

取得エージェントは、モデルと知識ベースを組み合わせます。これらのエージェントは、ベクトルストアを使用して効率的に取得し、RAG(取得強化生成)パターンを実装することで、複数のソースから情報を検索および合成できます。これらは、メモリシステムを通じて会話のコンテキストを維持しながら、Web検索とカスタム知識ベースを組み合わせるのに優れています。このモジュールでは、堅牢な情報取得のためのフォールバックメカニズムを含む実装戦略を説明します。
Member

@whitphx whitphx Jan 28, 2025

The Japanese translation of RAG is "検索拡張生成", not "取得強化生成".
Ref: https://aws.amazon.com/jp/what-is/retrieval-augmented-generation/, https://www.nri.com/jp/knowledge/glossary/rag.html
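
A terminology check like this could be automated. The sketch below assumes a small hand-maintained glossary that maps a discouraged rendering to the preferred one. The glossary holds only the RAG pair from this comment, and the names `PREFERRED_TERMS` and `find_discouraged_terms` are hypothetical.

```python
# Hypothetical glossary: discouraged rendering -> preferred rendering.
PREFERRED_TERMS = {
    "取得強化生成": "検索拡張生成",  # RAG (retrieval-augmented generation)
}


def find_discouraged_terms(
    text: str, glossary: dict[str, str] = PREFERRED_TERMS
) -> list[tuple[int, str, str]]:
    """Return (line_number, discouraged, preferred) for each glossary hit."""
    findings = []
    for number, line in enumerate(text.splitlines(), start=1):
        for discouraged, preferred in glossary.items():
            if discouraged in line:
                findings.append((number, discouraged, preferred))
    return findings
```

Running `find_discouraged_terms` over each translated Markdown file would flag the paragraph quoted above. Further terms could be added to the glossary as reviewers agree on them.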


4 participants