This repository has been archived by the owner on Jan 23, 2024. It is now read-only.

apm-agent-nodejs integration tests are failing because of slow npm install from github #1483

Closed
trentm opened this issue May 16, 2022 · 0 comments · Fixed by #1484

trentm commented May 16, 2022

"apm-agent-nodejs" builds of the "APM Integration Test MBP Selector" started failing recently (in the past few days).

For example https://apm-ci.elastic.co/job/apm-integration-tests-selector-mbp/job/main/3832/

[2022-05-16T16:33:58.723Z] Creating expressapp ... 
[2022-05-16T16:33:58.723Z] Creating localtesting_8.2.0_elasticsearch ... 
[2022-05-16T16:33:59.326Z] Creating expressapp                       ... done
[2022-05-16T16:34:00.331Z] Creating localtesting_8.2.0_elasticsearch ... done
[2022-05-16T16:34:41.007Z] Creating localtesting_8.2.0_kibana        ... 
[2022-05-16T16:34:41.810Z] Creating localtesting_8.2.0_kibana        ... done
[2022-05-16T16:35:25.791Z] Creating localtesting_8.2.0_apm-managed   ... 
[2022-05-16T16:35:26.596Z] Creating localtesting_8.2.0_apm-managed   ... done
[2022-05-16T16:36:01.028Z] 
[2022-05-16T16:36:01.028Z] ERROR: for wait-service  Container "2a1da3ae952a" is unhealthy.

repro and cause

The build includes docker inspect ... dumps for each of the containers in its artifacts. The failing container here is the "expressapp" container, failing with:

            "Health": {
                "Status": "unhealthy",
                "FailingStreak": 12,
                "Log": [
                    {
                        "Start": "2022-05-16T16:35:20.209453588Z",
                        "End": "2022-05-16T16:35:20.348608766Z",
                        "ExitCode": 7,
                        "Output": "'HTTP 000'"
                    },

That's an odd output. We were able to reproduce locally with these steps from the docs.txt artifact.

export ELASTIC_STACK_VERSION=8.2.0
export BUILD_OPTS="--nodejs-agent-package elastic/apm-agent-nodejs#b52b2d42a9e58cd15cabad4cc3ca74fcdd82bfaf --opbeans-node-agent-branch b52b2d42a9e58cd15cabad4cc3ca74fcdd82bfaf"
.ci/scripts/agent.sh nodejs nodejs-express

Inspecting the "expressapp" container while it is failing, ps -ef shows:

# ps -ef
UID        PID  PPID  C STIME TTY          TIME CMD
root         1     0  0 17:44 ?        00:00:00 bash -c npm install elastic/apm-agent-nodejs#a34d5892 && node app.js
root         8     1  1 17:44 ?        00:00:00 npm
root        19     8  0 17:44 ?        00:00:00 git ls-remote -h -t git://github.com/elastic/apm-agent-nodejs.git
root        50     0  3 17:45 pts/0    00:00:00 /bin/sh
root        59    50  0 17:45 pts/0    00:00:00 ps -ef

That npm install elastic/apm-agent-nodejs#a34d5892 is not completing before the container is deemed unhealthy and is destroyed. Hence the app is never running. (I guess that "HTTP 000" response is something being handled by some Docker proxy; not sure, but it doesn't matter.)
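On the "HTTP 000" output: curl documents exit code 7 as "failed to connect to host", and its %{http_code} write-out variable prints 000 when no HTTP response was received. So the health check output is consistent with nothing listening on the app port yet, rather than with a proxy answering. The health check command itself is not shown in this issue, so the sketch below assumes it is a curl call with a -w format along these lines. The function name `probe` and the port are made up for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical stand-in for the container health check (the real command is
# not shown in this issue). Prints the status in the same 'HTTP NNN' shape.
probe() {
  local url="$1"
  # -s: no progress output, -o /dev/null: discard the body,
  # --noproxy '*': ignore any proxy environment variables,
  # -w: print only the status code, wrapped in literal single quotes.
  curl -s --noproxy '*' -o /dev/null -w "'HTTP %{http_code}'" "$url"
}

# Assumption: nothing is listening on port 1 of the loopback interface,
# which mimics the app not having started yet.
probe "http://127.0.0.1:1/"
echo " (curl exit code: $?)"
```

If that matches the real check, the health check would keep failing in exactly this way until `node app.js` is actually running.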

Running just that command locally shows that npm install <from any git repo> using npm v6 (the npm in node v12 and v14) takes many minutes (almost 8 minutes here):

% time npm install elastic/apm-agent-nodejs#a34d5892
...
npm install elastic/apm-agent-nodejs#a34d5892  12.26s user 4.07s system 3% cpu 7:56.75 total

npm/cli#4896 suggests this is an issue that started on Friday the 13th.
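Since the slowdown was seen with npm v6 here, a container entrypoint could detect that up front. A minimal sketch follows, using the hypothetical helper names `npm_major` and `warn_if_npm6`. It only encodes what was observed in this issue (npm v6 is slow, npm v8 is believed to be fine) and makes no claim about other npm versions.

```shell
#!/usr/bin/env bash
# Hypothetical helpers, not part of the integration-testing repo.

npm_major() {
  # Strip everything from the first dot onward: "6.14.17" -> "6".
  printf '%s\n' "${1%%.*}"
}

warn_if_npm6() {
  local version="$1"
  if [ "$(npm_major "$version")" = "6" ]; then
    echo "npm $version: installs from git repos may be very slow (see npm/cli#4896)"
    return 1
  fi
  return 0
}

# Usage inside the container:
#   warn_if_npm6 "$(npm --version)" || echo "consider node v16 / npm v8"
```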

workarounds

Some possible workarounds:

  1. Switch to node v16, which gives us npm v8 which doesn't have this issue, AFAIK.
  2. Add this git config: git config url."ssh://git@".insteadOf git://. Then switch to this argument for the agent install: npm install git+ssh://git@github.com:elastic/apm-agent-nodejs.git#$COMMITSHA. (This is a workaround described by one user on the above npm issue.)
  3. Stop using npm install <something referring to a git repo> for installing a particular commit. Instead we could perhaps do a separate git clone ... && git checkout && npm install ../from/that/git/clone/dir.

Option 1 is by far the easiest right now, and updating the expressapp to use the current node v16 is definitely reasonable.
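For reference, option 3 could look roughly like the following. This is an untested sketch, not what was merged: the function name `install_agent_from_clone`, the default clone directory, and the `DRY_RUN` switch are all made up for illustration. It uses an https:// clone URL so that git:// is never involved.

```shell
#!/usr/bin/env bash
# Hypothetical helper: clone the agent, check out one commit, install from disk.
install_agent_from_clone() {
  local sha="$1"
  local dir="${2:-/tmp/apm-agent-nodejs}"
  local repo="https://github.com/elastic/apm-agent-nodejs.git"

  # With DRY_RUN=1 the commands are printed instead of executed.
  local run=""
  if [ "${DRY_RUN:-0}" = "1" ]; then run="echo"; fi

  $run git clone "$repo" "$dir" &&
    $run git -C "$dir" checkout "$sha" &&
    $run npm install "$dir"
}

# Usage inside the container, before starting the app:
#   install_agent_from_clone "$COMMITSHA" && node app.js
```

Running it first with DRY_RUN=1 prints the three commands, which makes it easy to confirm the commit and directory before touching the network.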

@trentm trentm self-assigned this May 16, 2022
@trentm trentm changed the title from "apm-agent-nodejs integration tests are failing" to "apm-agent-nodejs integration tests are failing because of slow npm install from github" May 16, 2022
trentm added a commit that referenced this issue May 16, 2022
…l from github

The switch to node v16 gets us npm v8, to work around an issue with
slow 'npm install <any github repo dependency>'. See:
    npm/cli#4896

In our case the github repo dependency was the command given to docker
run this container:
    bash -c "npm install elastic-apm-node#SOME-COMMIT-SHA && node app.js"

This also adds a package.json to more explicitly declare we are working
with a node project workspace. Also avoid generating a package-lock file
we won't use.

Fixes: #1483
trentm added a commit that referenced this issue May 16, 2022
…l from github (#1484)

The switch to node v16 gets use npm v8, to workaround an issue with
slow 'npm install <any github repo dependency>'. See:
    npm/cli#4896

In our case the github repo dependency was the command given to docker
run this container:
    bash -c "npm install elastic-apm-node#SOME-COMMIT-SHA && node app.js"

This also adds a package.json to more explicitly declare we are working
with a node project workspace. Also avoid generating a package-lock file
we won't use.

Fixes: #1483