Enable Agama auto install on Bare Metal server #20777

Merged: 1 commit merged into os-autoinst:master from the ipxe_sle16 branch on Dec 18, 2024

Contributor

@rfan1 rfan1 commented Dec 11, 2024

https://progress.opensuse.org/issues/173758

  • Verification run:

aarch64: http://10.200.129.6/tests/81288
x86_64:
root disk: sda
root disk: nvme0n1

VNC console access during the installation currently has some problems, see https://progress.opensuse.org/issues/170434

This PR therefore adds some workarounds to configure the serial console and root SSH login before the host reboots from the hard disk after installation.

Agama is still in an early phase, so further changes are expected in the next few weeks.

@rfan1 rfan1 force-pushed the ipxe_sle16 branch 2 times, most recently from 0f6e486 to ed54fae on December 11, 2024 04:00
@lemon-suse
Contributor

In general LGTM. We just need to decide on a solution for matching the installation-finished page, as in the s390x and PowerVM cases.

@alice-suse
Contributor

alice-suse commented Dec 11, 2024

Thanks for providing the PR.

aarch64: http://openqa.qa2.suse.asia/tests/80593
x86_64: https://openqa.suse.de/tests/16155480

Both verification jobs are red. Is this expected?

Besides, it would be good to add regression tests for the other installation methods supported by ipxe_install.pm, e.g. SLM 6.1 PXE boot and SLES PXE boot, and also aarch64 and SLM 6.0 USB boot once test machines are available again (the NUE2 arm machines and the x86 machines with USB are not usable after the CC isolation).

@alice-suse alice-suse closed this Dec 11, 2024
@alice-suse alice-suse reopened this Dec 11, 2024
@alice-suse
Contributor

Sorry for accidentally closing the PR. I meant to close a comment, but it closed the PR instead. Reopened.

@alice-suse
Contributor

> In general LGTM. We just need to decide on a solution for matching the installation-finished page, as in the s390x and PowerVM cases.

@lemon-suse @jknphy What's the best solution for now and in the long term?

@lemon-suse
Contributor

> > In general LGTM. We just need to decide on a solution for matching the installation-finished page, as in the s390x and PowerVM cases.
>
> @lemon-suse @jknphy What's the best solution for now and in the long term?

For now we are working on a puppeteer-based approach to match the auto-installation finished page; the PR is under review. I'm not sure it is the best solution, but it is a good one for us. We also considered polling the Agama log in a loop to detect when the installation has finished, but that turned out to be unstable and complicated, so we dropped it. Joaquin will share his idea for the long-term solution. :)

@alice-suse
Contributor

> > > In general LGTM. We just need to decide on a solution for matching the installation-finished page, as in the s390x and PowerVM cases.
> >
> > @lemon-suse @jknphy What's the best solution for now and in the long term?
>
> For now we are working on a puppeteer-based approach to match the auto-installation finished page; the PR is under review. I'm not sure it is the best solution, but it is a good one for us. We also considered polling the Agama log in a loop to detect when the installation has finished, but that turned out to be unstable and complicated, so we dropped it. Joaquin will share his idea for the long-term solution. :)

Thank you @lemon-suse! A workable, stable solution is acceptable as a temporary solution, IMHO.

BTW, just to confirm my understanding: something like https://openqa.suse.de/tests/16082855#step/agama_auto/26 is not available for IPMI machines, and what we are discussing is a way to work around that, right? Could you please share the puppeteer PR? I would like to have a look :)

@lemon-suse
Contributor

> > > > In general LGTM. We just need to decide on a solution for matching the installation-finished page, as in the s390x and PowerVM cases.
> > >
> > > @lemon-suse @jknphy What's the best solution for now and in the long term?
> >
> > For now we are working on a puppeteer-based approach to match the auto-installation finished page; the PR is under review. I'm not sure it is the best solution, but it is a good one for us. We also considered polling the Agama log in a loop to detect when the installation has finished, but that turned out to be unstable and complicated, so we dropped it. Joaquin will share his idea for the long-term solution. :)
>
> Thank you @lemon-suse! A workable, stable solution is acceptable as a temporary solution, IMHO.
>
> BTW, just to confirm my understanding: something like https://openqa.suse.de/tests/16082855#step/agama_auto/26 is not available for IPMI machines, and what we are discussing is a way to work around that, right? Could you please share the puppeteer PR? I would like to have a look :)

For IPMI Agama tests, VNC does not work either, the same as on s390x and PowerVM, so we are working on a solution for it. The puppeteer PR is jknphy/agama-integration-test-webpack#46.

@jknphy
Contributor

jknphy commented Dec 11, 2024

The best solution is to wait for a tool that the developers will provide to monitor the installation, but for now each squad can pick whatever fits it best (no need to block each other):
https://progress.opensuse.org/issues/173758#note-16

@rfan1 rfan1 force-pushed the ipxe_sle16 branch 7 times, most recently from a36014e to a224d88 on December 12, 2024 02:40
tests/installation/ipxe_install.pm (outdated, resolved)
tests/installation/ipxe_install.pm (outdated, resolved)
}
if (is_ipmi) {
    my $sol_console = is_aarch64 ? get_var('SERIALDEV', 'ttyAMA0') : get_var('SERIALDEV', 'ttyS1');
    $cmdline_extra .= "console=$sol_console,115200 linuxrc.log=/dev/$sol_console linuxrc.core=/dev/$sol_console linuxrc.debug=4,trace ";
Contributor

@czerw czerw Dec 12, 2024

Please use the IPXE_CONSOLE variable and don't hardcode the speed value.
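
A minimal sketch of what this suggestion could look like, assuming IPXE_CONSOLE carries both the device and the speed in a single value such as `ttyS1,115200`. The `%settings` hash, the `get_var` stub and the `serial_console_args` helper are hypothetical stand-ins so that the snippet runs on its own; this is not the code that was merged.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the job settings (assumption: device and speed in one value).
my %settings = (IPXE_CONSOLE => 'ttyS1,115200');

# Hypothetical stub mimicking get_var(NAME, DEFAULT) from the test API.
sub get_var {
    my ($name, $default) = @_;
    return $settings{$name} // $default;
}

# Hypothetical helper: build the serial console arguments from IPXE_CONSOLE
# instead of hardcoding the device name and the speed.
sub serial_console_args {
    my $console = get_var('IPXE_CONSOLE');
    return '' unless $console;
    # The device name is everything before the first comma.
    my ($dev) = split /,/, $console;
    return "console=$console linuxrc.log=/dev/$dev linuxrc.core=/dev/$dev linuxrc.debug=4,trace ";
}

print serial_console_args(), "\n";
```

With this shape, the speed is only ever set through the job settings, and a machine with a different serial device just overrides IPXE_CONSOLE.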

tests/installation/ipxe_install.pm (resolved)
tests/installation/ipxe_install.pm (outdated, resolved)
tests/installation/ipxe_install.pm (resolved)

Great PR! Please pay attention to the following items before merging:

Files matching lib/**.pm:

  • Consider adding or extending unit tests in t/

This is an automatically generated QA checklist based on modified files.

lib/version_utils.pm (outdated, resolved)
@rfan1 rfan1 force-pushed the ipxe_sle16 branch 2 times, most recently from 1bb3699 to 8e94d48 on December 12, 2024 12:03
@rfan1 rfan1 changed the title from "[wip]Agama BM" to "Enalbe Agama auto install on Bare Metal server" on Dec 17, 2024
@frankenmichl
Member

> Could you please try to do verification runs for WORKER_CLASS=64bit-i915_poo168097 and WORKER_CLASS=64bit-amd-ltp-tyrion_poo168097 on OSD?
>
> Could you please also add verification on WORKER_CLASS=64bit-ipmi-large-mem and 64bit-ipmi-uefi? Thanks!
>
> WORKER_CLASS=64bit-ipmi-large-mem -> passed; WORKER_CLASS=64bit-ipmi-uefi -> failed due to https://progress.opensuse.org/issues/174448
>
> WORKER_CLASS=64bit-i915_poo168097 and WORKER_CLASS=64bit-amd-ltp-tyrion_poo168097: failed, I would say due to the CC isolation: the host is in NUE, but the iPXE server is in PRG.

You specified the wrong server. The machines have the server in their definition on OSD, which is IPXE_HTTPSERVER=http://baremetal-support.qa.suse.de:8080 (that was the original one). For the machines located in NUE2, it is defined alongside the machine definition.
I think we can go ahead without these verification runs.

@rfan1 rfan1 force-pushed the ipxe_sle16 branch 2 times, most recently from d7e2caf to 54aae95 on December 17, 2024 08:54
@rfan1 rfan1 changed the title from "Enalbe Agama auto install on Bare Metal server" to "Enable Agama auto install on Bare Metal server" on Dec 17, 2024
variables.md (outdated, resolved)
variables.md (outdated, resolved)
@alice-suse
Contributor

> Could you please try to do verification runs for WORKER_CLASS=64bit-i915_poo168097 and WORKER_CLASS=64bit-amd-ltp-tyrion_poo168097 on OSD?
>
> Could you please also add verification on WORKER_CLASS=64bit-ipmi-large-mem and 64bit-ipmi-uefi? Thanks!
>
> WORKER_CLASS=64bit-ipmi-large-mem -> passed; WORKER_CLASS=64bit-ipmi-uefi -> failed due to https://progress.opensuse.org/issues/174448

Thanks! For 64bit-ipmi-uefi, you can specify worker_class='bare-metal1'; that one works now.

@rfan1
Contributor Author

rfan1 commented Dec 18, 2024

> Could you please try to do verification runs for WORKER_CLASS=64bit-i915_poo168097 and WORKER_CLASS=64bit-amd-ltp-tyrion_poo168097 on OSD?
>
> Could you please also add verification on WORKER_CLASS=64bit-ipmi-large-mem and 64bit-ipmi-uefi? Thanks!
>
> WORKER_CLASS=64bit-ipmi-large-mem -> passed; WORKER_CLASS=64bit-ipmi-uefi -> failed due to https://progress.opensuse.org/issues/174448
>
> Thanks! For 64bit-ipmi-uefi, you can specify worker_class='bare-metal1'; that one works now.

passed :) http://openqa.suse.de/tests/16233654#

@rfan1 rfan1 force-pushed the ipxe_sle16 branch 4 times, most recently from 224ddd3 to 51ba290 on December 18, 2024 04:59
Contributor

@alice-suse alice-suse left a comment

Thanks for the effort!

@rfan1
Contributor Author

rfan1 commented Dec 18, 2024

Thanks all for your comments and reviews! I plan to merge this PR now and treat it as a milestone, since I can't solve every problem in this single commit. Some new Agama features will come soon (SCC support, post scripts, etc.), and there are still some pending issues/bugs, such as:
https://bugzilla.suse.com/show_bug.cgi?id=1234678 and https://bugzilla.suse.com/show_bug.cgi?id=1234264.

@Dawei-Pang
Contributor

So far the Agama installer cannot reboot automatically even though agama.auto is added to the kernel boot command line. Could this code check and confirm whether the installation is complete? Thanks!

Screen Shot 2024-12-18 at 16 28 55

@rfan1
Contributor Author

rfan1 commented Dec 18, 2024

> So far the Agama installer cannot reboot automatically even though agama.auto is added to the kernel boot command line. Could this code check and confirm whether the installation is complete? Thanks!
>
> Screen Shot 2024-12-18 at 16 28 55

AGAMA_AUTO | string | | The auto-installation is started by passing agama.auto=<url> on the kernel's command line,
So IMO it doesn't control auto reboot :)
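
For illustration, a minimal sketch of how the AGAMA_AUTO setting could map onto the boot parameter. The `%settings` hash, the `get_var` stub, the `agama_auto_args` helper and the example.com URL are hypothetical placeholders, not the merged implementation.

```perl
use strict;
use warnings;

# Hypothetical job settings; the URL is a placeholder, not a real profile.
my %settings = (AGAMA_AUTO => 'http://example.com/profile.json');

# Hypothetical stub mimicking get_var(NAME, DEFAULT) from the test API.
sub get_var {
    my ($name, $default) = @_;
    return $settings{$name} // $default;
}

# Hypothetical helper: emit agama.auto=<url> only when AGAMA_AUTO is set.
# It only tells the installer where to fetch the profile from;
# nothing here requests a reboot after the installation.
sub agama_auto_args {
    my $url = get_var('AGAMA_AUTO');
    return $url ? "agama.auto=$url " : '';
}

print agama_auto_args(), "\n";
```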

@Dawei-Pang
Contributor

> So far the Agama installer cannot reboot automatically even though agama.auto is added to the kernel boot command line. Could this code check and confirm whether the installation is complete? Thanks!

I found that verify_agama_auto_install_done_cmdline() implements this check to confirm that the installation is complete.

@rfan1 rfan1 merged commit 6ed938f into os-autoinst:master Dec 18, 2024
10 checks passed