[VAL-101] OS build fail on arm Created: 10/Mar/20  Updated: 13/Apr/20  Resolved: 13/Apr/20

Status: Done
Project: Validation
Component/s: None
Affects Version/s: None
Fix Version/s: None

Type: Bug Priority: Medium
Reporter: Daniel Stoica Assignee: Cristina Pauna
Resolution: Done Votes: 0
Labels: None
Remaining Estimate: Not Specified
Time Spent: Not Specified
Original Estimate: Not Specified


 Description   

The OS build fails on arm when trying to fetch the vuls database.

The build failed at the command:
for i in $(seq 2002 "$(date +"%Y")"); do go-cve-dictionary fetchnvd -http-proxy=${HTTP_PROXY} -dbpath /opt/akraino/validation/tests/os/vuls/cve.sqlite3 -years "$i"; done
With error:
make[1]: *** [.build] Error 137



 Comments   
Comment by Alexandru Avadanii [ 13/Apr/20 ]

The Docker build now seems to be stable on arm64 [1]. Two changes contributed: implementing the caching mechanism proposed in my earlier comment, and moving the Docker build jobs from the `aarch64_dev` to the `aarch64_build` Jenkins slaves. The move eliminates the overlap between the IEC fuel/compass deploy/testing jobs and the validation Docker build job, which used to cause issues because the deploy jobs restart the Docker service on the host.

[1] https://jenkins.akraino.org/view/validation/job/validation-docker-build-arm64-master/

Comment by Alexandru Avadanii [ 07/Apr/20 ]

The error in this ticket does not seem to be caused by an incomplete PATH, but rather by spurious out-of-memory failures during database fetch/update using `go-cve-dictionary fetchnvd`.
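
For context, the `Error 137` reported by make is consistent with this reading: shells report an exit status above 128 when a process is terminated by a signal, and 137 = 128 + 9, i.e. SIGKILL, which is what the kernel OOM killer sends. A minimal sketch for decoding such a status follows; the helper name `decode_status` is hypothetical and not part of the validation repo.

```shell
#!/bin/sh
# Hypothetical helper (not from the validation repo): decode an exit status
# above 128 into the number and name of the signal that ended the process.
decode_status() {
    status="$1"
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # `kill -l <number>` prints the name of the given signal number.
        echo "killed by signal $sig ($(kill -l "$sig"))"
    else
        echo "exited with status $status"
    fi
}

decode_status 137
```

A SIGKILL alone does not prove an OOM condition; checking the kernel log on the build host would be needed to confirm it.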

Debugging and root-causing this is non-trivial: the issue reproduced in roughly 50% of manual test runs, and each test iteration takes about 1h. To bypass the issue, we chose to leverage the fact that all affected builds are scheduled on static Jenkins slaves, so we can cache the databases locally on the build server between container builds. This reduces the fetch size considerably and thus avoids the OOM issues.
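
The caching idea can be sketched roughly as follows. This is an illustration only, not the contents of the proposed patch: the cache location `CACHE_DIR`, the `FETCH_CMD` override, and the function name `update_cached_db` are assumptions, while the `-dbpath` and `-years` flags are taken from the failing command in the description.

```shell
#!/bin/sh
# Sketch only: CACHE_DIR, FETCH_CMD and update_cached_db are hypothetical
# names chosen for illustration, not the ones used by the actual patch.
CACHE_DIR="${CACHE_DIR:-/var/cache/vuls-db}"   # assumed host-side cache location
DB_NAME="cve.sqlite3"
FETCH_CMD="${FETCH_CMD:-go-cve-dictionary fetchnvd}"

update_cached_db() {
    workdir="$1"
    mkdir -p "$CACHE_DIR" "$workdir"
    # Seed the working copy from the host cache, if one exists, so the
    # fetch starts from the previously downloaded database.
    if [ -f "$CACHE_DIR/$DB_NAME" ]; then
        cp "$CACHE_DIR/$DB_NAME" "$workdir/$DB_NAME"
    fi
    # Same per-year loop as in the description, run against the seeded DB.
    for year in $(seq 2002 "$(date +%Y)"); do
        $FETCH_CMD -dbpath "$workdir/$DB_NAME" -years "$year" || return 1
    done
    # Persist the refreshed database for the next container build.
    cp "$workdir/$DB_NAME" "$CACHE_DIR/$DB_NAME"
}

# Usage (hypothetical path): update_cached_db /tmp/vuls-build
```

The cache is only written back after every year has been fetched successfully, so a fetch that gets killed part-way leaves the previous cached copy untouched.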

I proposed a patch implementing this persistent cache and will manually initialize the DB cache on the `aarch64_dev` Jenkins slaves.

Comment by Cristina Pauna [ 03/Apr/20 ]

Hi Juha, you are correct, this should not be closed yet. The build passed on its own a couple of days after the Jira was opened, and I thought it was fixed. But looking at the jobs, this seems to be an intermittent error. It might have something to do with the servers we build the images on, not with the code itself. We'll continue to investigate what's going on.

Comment by Juha Kosonen [ 03/Apr/20 ]

cristinapauna The referenced patch has not been merged; has this been corrected somewhere else, or is no correction needed at all?

Comment by Daniel Stoica [ 10/Mar/20 ]

https://gerrit.akraino.org/r/c/validation/+/2294

Generated at Sat Feb 10 06:07:19 UTC 2024 using Jira 9.4.5#940005-sha1:e3094934eac4fd8653cf39da58f39364fb9cc7c1.