Best Practices for Site Structure and Backups
Backups can be time-consuming, so it helps to rank what gets backed up first. There are 3 types of files/dirs:
- Files with custom code: the software you have developed.
- Dirs with data. Some data is important because it was hard to collect; other data is replaceable because it was scraped or also exists elsewhere.
- Dirs with installed code. Example: a Python environment with pip-installed libs. Lowest priority because it is the easiest to redo. Still a pain to lose, because the setup took time and the current install works.
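One way to make the installed-libs category cheap to redo is to save a list of what is installed, so only that small text file needs backing up. A minimal sketch, assuming the standard-library importlib.metadata is enough for the environment in question; the function names and the requirements-backup.txt filename are hypothetical:

```python
from importlib.metadata import distributions
from pathlib import Path


def installed_packages():
    """Return sorted 'name==version' lines for the current Python environment."""
    lines = set()
    for dist in distributions():
        name = dist.metadata.get("Name")
        if name:  # skip entries whose metadata has no name
            lines.add(f"{name}=={dist.version}")
    return sorted(lines, key=str.lower)


def write_manifest(path="requirements-backup.txt"):
    """Write the package list to a text file (hypothetical filename)."""
    target = Path(path)
    target.write_text("\n".join(installed_packages()) + "\n")
    return target


if __name__ == "__main__":
    print(write_manifest())
```

The resulting file is in the same name==version form that pip accepts with pip install -r, so it can live next to the custom code and be backed up with it.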
To make a quick backup possible, keep 2 types of data dirs: easy-to-get data and tough-to-get data.
Create a dir structure so custom code and custom data can be backed up without necessarily backing up easier-to-get data (such as scraped data) and installed libs.
Current result: 4 dirs
- Custom code
- Custom data
- Easy-to-get code
- Easy-to-get data
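With the four dirs separated, a quick backup only has to touch the first two. A rough sketch using Python's tarfile module; the dir names (custom_code, custom_data, easy_code, easy_data) and the quick_backup function are assumptions for illustration, not names from the actual site:

```python
import tarfile
from pathlib import Path

# Hypothetical directory names mirroring the four categories above.
PRIORITY_DIRS = ("custom_code", "custom_data")
SKIPPABLE_DIRS = ("easy_code", "easy_data")


def quick_backup(site_root, archive_path, dirs=PRIORITY_DIRS):
    """Archive only the listed subdirectories of site_root into a .tar.gz file.

    Returns the names of the directories that were actually archived.
    """
    site_root = Path(site_root)
    archived = []
    with tarfile.open(archive_path, "w:gz") as tar:
        for name in dirs:
            source = site_root / name
            if source.is_dir():
                # arcname keeps paths inside the archive relative to site_root
                tar.add(source, arcname=name)
                archived.append(name)
    return archived
```

A full backup would pass dirs=PRIORITY_DIRS + SKIPPABLE_DIRS. The archive should be written outside site_root so it does not end up inside a later backup.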
All of the above is for the code and content related to the site.
Next, there are also servers and config files.
Config files are customized, so they rank ahead of the servers themselves.
Next result for backup priorities:
- Custom config files for servers
- Servers
Next steps: transition to providing services in containers on virtual instances, and to keeping databases on virtual instances.
Goal: Install Spark on an Ubuntu VM launched via Multipass
Error while installing Java:
/var/cache/apt/archives/openjdk-11-jdk-headless_11.0.24+8-1ubuntu3~24.04.1_arm64.deb
needrestart is being skipped since dpkg has failed
E: Sub-process /usr/bin/dpkg returned an error code (1)
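The excerpt above does not include the underlying dpkg error message, so the actual cause is unknown here. As a generic first attempt, the usual apt/dpkg recovery commands can be run inside the VM through multipass exec. A sketch under assumptions: the instance name spark-vm is hypothetical, and these steps may not fix this particular failure. It defaults to a dry run that only prints the commands:

```python
import shlex
import subprocess

VM_NAME = "spark-vm"  # hypothetical Multipass instance name

# Generic dpkg/apt recovery steps; not specific to the failure shown above.
RECOVERY_STEPS = [
    ["sudo", "dpkg", "--configure", "-a"],   # finish any half-configured packages
    ["sudo", "apt-get", "install", "-f"],    # try to fix broken dependencies
    ["sudo", "apt-get", "update"],           # refresh package lists
    ["sudo", "apt-get", "install", "-y", "openjdk-11-jdk-headless"],
]


def multipass_command(step, vm_name=VM_NAME):
    """Wrap a command so it runs inside the VM via 'multipass exec'."""
    return ["multipass", "exec", vm_name, "--"] + list(step)


def run_recovery(vm_name=VM_NAME, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    shown = []
    for step in RECOVERY_STEPS:
        cmd = multipass_command(step, vm_name)
        line = shlex.join(cmd)
        print(line)
        shown.append(line)
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failing step
    return shown


if __name__ == "__main__":
    run_recovery()
```

If the steps fail again, the lines printed just before the "E: Sub-process /usr/bin/dpkg returned an error code (1)" message are the place to look for the real cause.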