SUSE Cloud native fundamentals scholarship program

A course from Udacity with SUSE from their scholarship program. If you're interested in solutions to the exercises, see Solutions to SUSE Cloud native fundamentals scholarship exercises.

Some things to keep in mind:

Lessons 1 and 2

  • There is a dedicated organization for cloud-native technologies called the Cloud Native Computing Foundation (CNCF). Among their list of hosted projects, they have Kubernetes and containerd (a container runtime originally donated by Docker).
  • While cloud-native tools are widely adopted, you need to consider the stakeholders. From a business perspective, adopting them should translate to agility and growth; from a technical perspective, to automation and observability.
  • Application architecture:

    • One way to construct an application is with the monolith structure where all of the components are stored in a monorepo.
    • Another is through microservice architecture where each component can live in its own repository.
    | Monolith                          | Microservices                 |
    |-----------------------------------+-------------------------------|
    | Single repository                 | Multiple repositories         |
    | Sequential                        | Concurrent                    |
    | Has to be reproduced holistically | Only reproduces the component |
    | One delivery pipeline             | Multiple pipelines            |

Best practices for application deployment:

  • health checks provide an easy way of making sure each component is functional; usually implemented as an HTTP endpoint (e.g., /status/); a sketch follows this list
  • metrics track the individual counts for each component: the number of logins, how many times the user has reached the interface, etc.
  • logs are the most important component enabling easier debugging; logging could be passively printing to stdout/stderr or actively sending logs through a log handler; oftentimes, log levels are used to easily identify the type of information a developer may need — e.g., debug, info, warn, error, fatal; furthermore, it is often recommended to attach a timestamp to each log entry
  • tracing enables easier debugging by recording certain information to make steps reproducible; oftentimes, this involves recording the number of times a function is called or indicating whether it entered a certain state
  • recording resource consumption makes sure that the application can handle its workload while not going above certain limits; this often involves knowing the upper boundary of network bandwidth and CPU consumption, and creating benchmarks or stress tests for the application
  • to make these easier to memorize, know the word "HuMLTR" (the practices above: health checks, metrics, logs, tracing, resource consumption)
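
A minimal sketch of the first three practices, assuming a Flask application; the /status/ endpoint path comes from the note above, while the /metrics/ and /login/ routes, the counter, and the log format are illustrative:

#+begin_src python
# Hedged sketch: a health check, a toy metric, and timestamped logs.
import logging

from flask import Flask, jsonify

app = Flask(__name__)

# Attach a timestamp and a log level to every entry, as recommended above.
logging.basicConfig(
    level=logging.INFO,
    format="%(asctime)s %(levelname)s %(message)s",
)

login_count = 0  # toy metric: number of logins served


@app.route("/status/")
def status():
    # Health check: report that this component is functional.
    app.logger.info("health check requested")
    return jsonify(status="OK")


@app.route("/metrics/")
def metrics():
    # Metrics: expose per-component counters.
    return jsonify(logins=login_count)


@app.route("/login/")
def login():
    global login_count
    login_count += 1
    app.logger.info("user login #%d", login_count)
    return jsonify(result="logged in")


if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)
#+end_src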

Lesson 3

  • containers versus virtual machines

    • virtual machines are software that emulates the full stack of a computer including the operating system, applications, and hardware
    • containers are similar to virtual machines except they have fewer components — e.g., there is no need for hardware emulation, and the guest operating system is removed since containers share the host's kernel
    • by their nature, containers are lightweight compared to virtual machines
    • nowadays, containers make use of OS-level virtualization features — e.g., chroot, FreeBSD jails, Linux containers (LXC)
  • Docker is one of the more popular platforms for containers; as a platform, it is composed of a daemon that has to be running before you can work with containers, and it hosts a large library of images (Docker Hub) to use as a foundation
  • an application can be packaged with Docker images as a distribution method — e.g., ArchiveBox
  • container workflow: configuring what image to create, building an image from the configuration (i.e., a Dockerfile), running the image for testing purposes, tagging it, then pushing the image to a registry; a sketch follows this list
  • managing a handful of containers by hand is viable, but it does not scale well for the developer once there are many of them; for this, you can use a container orchestrator
  • among the list of orchestrators, we have Docker Swarm, Apache Mesos, and Kubernetes, among others; but for this topic, we'll go with Kubernetes as it is popular and also follows a building-blocks structure (a set of primitive blocks that can be composed to create more complex operations)
  • Kubernetes manages several resources (a manifest sketch follows this list):

    • it manages a cluster, the entire group of nodes, where each node is either a worker or a master node
    • a worker node runs a group of pods
    • pods are a group of containers
    • deployments declare the desired state of a set of pods
    • replica sets ensure a given number of identical pods is running at any time
    • services expose a group of pods under a stable network endpoint
    • ConfigMaps hold pod configurations
    • namespaces provide logical separation between different applications, that is, an application context
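
A sketch of that container workflow, assuming the Flask app above is saved as app.py; the image name and the myuser registry account are hypothetical:

#+begin_src dockerfile
# Dockerfile: configure what image to create.
FROM python:3.10-slim
WORKDIR /app
COPY . .
RUN pip install flask
EXPOSE 5000
CMD ["python", "app.py"]
#+end_src

#+begin_src shell
# Build an image from the Dockerfile, run it for testing,
# tag it, then push it to a registry.
docker build -t myapp .
docker run -d -p 5000:5000 myapp
docker tag myapp myuser/myapp:v1.0.0
docker push myuser/myapp:v1.0.0
#+end_src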
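
And a minimal manifest sketch tying several of those resources together: a Deployment whose replica set keeps two identical pods running, plus a Service exposing them; all names and the image are illustrative:

#+begin_src yaml
# deployment.yaml: declare the desired state for a set of pods.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp
  namespace: default
spec:
  replicas: 2                # the replica set keeps two identical pods running
  selector:
    matchLabels:
      app: myapp
  template:
    metadata:
      labels:
        app: myapp
    spec:
      containers:
        - name: myapp
          image: myuser/myapp:v1.0.0   # hypothetical image from the workflow above
          ports:
            - containerPort: 5000
---
# A Service gives the pods a stable network endpoint.
apiVersion: v1
kind: Service
metadata:
  name: myapp
spec:
  selector:
    app: myapp
  ports:
    - port: 80
      targetPort: 5000
#+end_src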

Lesson 4

  • small organizations often do not have the resources to manage their own clusters and thus use third-party providers; even when an organization does have the resources, the scope of managing clusters often overshadows the team managing them
  • in cases like those, it makes sense to use a managed offering such as Platform-as-a-Service (PaaS)
  • there are different components in a technology stack that an organization can manage: starting from servers, storage, virtualization, operating system, runtime, data, and application
  • several providers offer the above components, and the offerings usually fall under several umbrellas:

    • the organization can choose no provider and instead manage everything on their own hardware and infrastructure; having an on-premises instance gives the organization full control of the whole stack, but it can take a while to set up before the team can manage their deployments
    • some organizations will choose a provider that offers infrastructure-as-a-service (IaaS) — e.g., Google Cloud Platform, Amazon Web Services — where the provider manages the hardware (e.g., servers, storage) and the organization manages everything above it
    • others will choose platform-as-a-service (PaaS), reducing further what the organization manages: only the application and the data
    • some providers offer functions-as-a-service (FaaS) — e.g., AWS Lambda, Netlify Functions — where the application is only invoked on demand
  • the point is: the more of the stack the organization controls, the more complexity it has to worry about; while it has full control over the stack, it also has to cover the full expense of managing it; having more control also means less chance of vendor lock-in, and thus more flexibility, with more options for adding and removing features in the market
  • the fewer components the organization controls, the less complexity; with fewer things to worry about, it takes less time to create prototypes and the overall expenses are lower; but since the tasks are delegated to a third-party provider, the organization is vulnerable to vendor lock-in and can only move within the limitations of the platform

Lesson 5

  • most organizations want to polish the end product as much as possible before it reaches their customers; in software, most applications go through a variety of steps such as putting the code through a testing suite, building the image with reproducible results, packaging it so that everyone is able to run the app, and deploying it (an actual release)
  • to make application development faster, they usually divide it into several environments: one for development, where they can change things with minimal risk, and one for production, that is, the customer-facing interface
  • to make it easier, some of them delegate the usual tasks of testing, building, packaging, and deployment to the cloud
  • this has resulted in the usual practice of continuous integration (CI) and continuous delivery (CD)

    • CI is a tool for automating testing, building, and packaging the application
    • CD is a tool for automatic deployment of the application
  • there are many cloud services that offer CI/CD, like CircleCI, Jenkins, Travis CI, GitLab CI, and GitHub Actions

    • for this lesson, we'll go with GitHub Actions which is the integrated CI/CD workflow for GitHub repos
    • GitHub Actions enables event-listening jobs
    • each job is then divided into steps
    • you can configure GitHub Actions through files in .github/workflows inside the GitHub repo; a workflow sketch appears at the end of this lesson
  • in this chapter, they go with ArgoCD, which is an application deployment frontend for Kubernetes clusters
  • with many similar configurations, you may want to use templates; this is where configuration managers come in; this course chooses Helm as the tool of the trade as it is integrated into ArgoCD
  • Helm is a configuration manager

    • it mainly manages charts, which are basically packages in a package manager
    • a chart contains parameterized values, mainly done through Go templates (the values are supplied through a values.yaml file); a chart sketch appears at the end of this lesson
    • Helm recognizes a chart with the Chart.yaml file that contains metadata about the deployment
  • ArgoCD can accept multiple sources including Helm, making it easy to integrate Helm into it
  • CI/CD tools can be categorized based on the push/pull methodology

    • traditionally, CI/CD platforms are push-based, delivering updates once changes have been pushed to the repository; the changes are then pushed to the CD system, where new clusters are created; while this makes the flow of new versions easier, it requires more resources and more awareness, as there is a higher chance of disruption; on the other hand, this makes versioning easier, which in turn makes debugging and analysis easier
    • pull-based CI/CD platforms are mostly similar, except that an agent detects changes (e.g., in the repository) and applies them to the cluster; while this makes it easier to apply new changes, it also makes it harder to analyze for future improvements
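
A minimal sketch of such a workflow file, at a hypothetical .github/workflows/ci.yml; the job listens for push events and is divided into steps (action versions and commands are illustrative):

#+begin_src yaml
# .github/workflows/ci.yml (hypothetical path and contents)
name: CI
on:
  push:                      # event-listening: run on every push to main
    branches: [main]
jobs:
  test-and-build:
    runs-on: ubuntu-latest
    steps:                   # each job is divided into steps
      - uses: actions/checkout@v2
      - uses: actions/setup-python@v2
        with:
          python-version: "3.10"
      - run: pip install flask pytest
      - run: pytest          # assumes a test suite exists in the repo
      - run: docker build -t myapp .
#+end_src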
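
And a sketch of a minimal Helm chart, assuming a chart named myapp; the three files are shown in one block, with the Go-template substitution pulling its value from values.yaml:

#+begin_src yaml
# Chart.yaml: the metadata Helm uses to recognize a chart.
apiVersion: v2
name: myapp
version: 0.1.0
---
# values.yaml: default parameter values for the templates.
replicaCount: 2
---
# templates/deployment.yaml (excerpt): a Go-template parameter.
spec:
  replicas: {{ .Values.replicaCount }}
#+end_src

Installing it into a cluster would then be something like helm install myapp ./myapp (chart directory name assumed).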