K8s Jobs & CronJobs

Sandeep Baldawa
6 min read · May 24, 2021

In this blog post, let's try to understand what the K8s Job object is and why we need it.

Types of Jobs

1. Run to completion (Job)

Runs a job once, for a specific invocation, without a schedule.

2. Scheduled (CronJob)

Runs a job repeatedly at scheduled instances of time.

Why jobs? Why can't we just use a deployment to get our work done?

K8s Jobs are like a person on a mission: they do their work and return to base. They can do this either as a one-off or on a scheduled basis. More details here. Deployments, by contrast, try to keep their replicas alive and will bring up a new replica if one goes down. So the two sit at opposite ends of the spectrum.

How to know what APIs your K8s version supports?

This is a bit tangential, but it is important for understanding all the APIs that your K8s installation supports.

  1. Run the below on the command line:
kubectl proxy --port=8080
Starting to serve on 127.0.0.1:8080

2. Now, if we go to http://127.0.0.1:8080/, we can see all the APIs supported; specifically, if we are looking for the Job-related APIs, we can go to http://127.0.0.1:8080/apis/batch/v1

Alternatively, one could also try the below.

MyK8sInstance> kubectl api-resources --api-group=batch
NAME       SHORTNAMES   APIGROUP   NAMESPACED   KIND
cronjobs   cj           batch      true         CronJob
jobs                    batch      true         Job
MyK8sInstance> kubectl explain job | less             =====> Explains the API
MyK8sInstance> kubectl explain job --recursive | less ===> Lists all parameters

This can be used to find details about any supported APIs.

All right, all good. Can we talk business here? Just tell me how to create a job

Let’s try an example of creating a Job using the busybox image. What we are trying to do here is invoke a Job that runs a command in a busybox container. We will start with the dry-run option (on newer kubectl versions, this flag is spelled --dry-run=client). See the various fields below; is there something missing 🤔

MyK8sInstance> kubectl create job --image=busybox  myjob  -o yaml --dry-run
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: null
  name: myjob
spec:
  template:
    spec:
      containers:
      - image: busybox
        name: myjob

What command is the above Job running? It looks like we don't have one in there.

Let’s add the command piece in there. Also note that only a restartPolicy of Never or OnFailure is allowed (which makes sense given what Jobs are for, i.e., execute a task and move on, so there is no need for the restartPolicy to be Always). Check the docs here.

apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: null
  name: myjob
spec:
  template:
    spec:
      containers:
      - image: busybox
        name: myjob
        command: ["/bin/sh"]
        args: ["-c", "echo Hello World"]
      restartPolicy: OnFailure

Let’s now create the Job. Something interesting to note: after the Job runs, it does not delete its pod.

MyK8sInstance> kubectl apply -f job.yaml
job.batch/myjob configured
MyK8sInstance> kubectl get jobs
NAME    COMPLETIONS   DURATION   AGE
myjob   1/1           3s         4m32s

But how do we know it did its work (printed Hello World)?

MyK8sInstance> kubectl get po | grep myjob
myjob-rhfhm 0/1 Completed 0 6m38s
MyK8sInstance> kubectl logs myjob-rhfhm
Hello World

What happens if I delete the Job? Will the pods related to it remain?

No; the pods associated with a Job are also deleted once the Job is deleted.

MyK8sInstance> kubectl delete job myjob
job.batch "myjob" deleted
MyK8sInstance> kubectl get po | grep myjob

How do we find all pods related to a given job?

Source: here

MyK8sInstance> pods=$(kubectl get pods --selector=job-name=myjob --output=jsonpath='{.items[*].metadata.name}')
MyK8sInstance> echo $pods
myjob-d7n8b

What if a job keeps failing, can we limit how many times we retry on failure?

The backoffLimit field is useful here: it defines an upper limit on failure retries, after which the Job is marked as failed. Source: here
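A minimal sketch of what this might look like on a Job spec (the value 4 and the always-failing command are just example choices to exercise the limit):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: myjob
spec:
  backoffLimit: 4          # give up after 4 failed retries (example value)
  template:
    spec:
      containers:
      - image: busybox
        name: myjob
        command: ["/bin/sh"]
        args: ["-c", "exit 1"]   # always fails, so the limit kicks in
      restartPolicy: OnFailure
```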

Can we limit the Job to a specific time duration & fail it if it does not finish in time?

Once a Job reaches its activeDeadlineSeconds, all of its running Pods are terminated, and the Job status becomes type: Failed with reason: DeadlineExceeded. Source: here
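As a sketch on our myjob example (the 30-second deadline and the deliberately long sleep are example values, chosen so the deadline fires):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: myjob
spec:
  activeDeadlineSeconds: 30   # fail the Job if it runs longer than 30s
  template:
    spec:
      containers:
      - image: busybox
        name: myjob
        command: ["/bin/sh"]
        args: ["-c", "sleep 60"]   # sleeps past the deadline on purpose
      restartPolicy: OnFailure
```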

Can we automatically clean up Jobs? (It might be difficult to remember to clean them up.)

Finished Jobs (either Complete or Failed) can be cleaned up automatically using the TTL mechanism provided by the TTL controller for finished resources, by specifying the ttlSecondsAfterFinished field of the Job. Source: here
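A sketch of where the field goes (100 seconds is an arbitrary example value):

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: myjob
spec:
  ttlSecondsAfterFinished: 100   # delete the Job (and its pods) 100s after it finishes
  template:
    spec:
      containers:
      - image: busybox
        name: myjob
        command: ["/bin/sh"]
        args: ["-c", "echo Hello World"]
      restartPolicy: OnFailure
```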

There are a lot more use cases, like running Jobs in parallel, suspending Jobs, etc. One can read the documentation for more details.
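For instance, parallel Jobs are driven by the completions and parallelism fields; a sketch with example values:

```yaml
apiVersion: batch/v1
kind: Job
metadata:
  name: parallel-job
spec:
  completions: 6    # run 6 pods to successful completion in total
  parallelism: 2    # at most 2 pods running at any one time
  template:
    spec:
      containers:
      - image: busybox
        name: worker
        command: ["/bin/sh"]
        args: ["-c", "echo processing one work item"]
      restartPolicy: OnFailure
```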

Cool cool, but how do we run a job on a given schedule, whether regular or irregular?

Using CronJobs. Example: send an email every day at 5 am.
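The schedule uses the standard five cron fields (minute, hour, day-of-month, month, day-of-week); for the 5 am example the schedule line would look like:

```yaml
# Fields: minute hour day-of-month month day-of-week
schedule: "0 5 * * *"       # every day at 05:00
# schedule: "*/15 * * * *"  # every 15 minutes, as used below
```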

Let’s try creating a CronJob.

MyK8sInstance> kubectl create cronjob --image=busybox  myjob  -o yaml --dry-run --schedule="*/15 * * * *" > cronjob.yaml

Example of a cronjob YAML file

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: myjob
spec:
  jobTemplate:
    metadata:
      name: myjob
    spec:
      template:
        spec:
          containers:
          - image: busybox
            name: myjob
            command: ["/bin/sh"]
            args: ["-c", "echo Hello World from cronJob"]
          restartPolicy: OnFailure
  schedule: '*/1 * * * *'
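A few other CronJob spec fields are worth knowing about; a sketch with example values (not defaults):

```yaml
spec:
  schedule: '*/1 * * * *'
  concurrencyPolicy: Forbid        # don't start a new run while one is still active
  successfulJobsHistoryLimit: 3    # keep only the last 3 successful Jobs
  failedJobsHistoryLimit: 1        # keep only the last failed Job
  suspend: false                   # set to true to pause scheduling
```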

Let’s apply the above and validate.

MyK8sInstance> kubectl apply -f cronjob.yaml
cronjob.batch/myjob created
MyK8sInstance> kubectl get cronjob
NAME    SCHEDULE      SUSPEND   ACTIVE   LAST SCHEDULE   AGE
myjob   */1 * * * *   False     0        34s             99s

So what's happening here? Is the CronJob internally creating Jobs?

Absolutely. If we keep a watch on the Jobs, it becomes pretty clear.

MyK8sInstance> kubectl get job -w
NAME               COMPLETIONS   DURATION   AGE
myjob              1/1           4s         29m
myjob-1621876140   1/1           3s         2m15s
myjob-1621876200   1/1           3s         75s
myjob-1621876260   1/1           3s         15s
MyK8sInstance> kubectl get pods | grep myjob
myjob-1621876140-xzlks 0/1 Completed 0 3m10s
myjob-1621876200-ftzzn 0/1 Completed 0 2m10s
myjob-1621876260-c4n2p 0/1 Completed 0 70s
myjob-1621876320-q8bp4 0/1 Completed 0 10s
MyK8sInstance> kubectl logs myjob-1621876200-ftzzn
Hello World from cronJob

If the schedule syntax is confusing, you could read this. I usually like to use this for creating a cron schedule.

Let’s clean up

This cleans up all the associated Jobs and pods for the given CronJob.

MyK8sInstance> kubectl delete cronjob myjob
cronjob.batch "myjob" deleted
MyK8sInstance> kubectl get cronjob
No resources found in default namespace.
MyK8sInstance> kubectl get pods | grep myjob

Now that we have some theory under our belt, let’s try some more examples to deepen our understanding.

Create a CronJob named google-cron. It should run every minute and run the command curl google.com. Terminate it if it does not complete within 10 seconds.

Using K8s documentation here, we create the below.

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: google-cron
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: progrium/busybox
            command: ["bin/sh"]
            args: ["curl", "google.com"]
          restartPolicy: OnFailure

What are we missing? We do not have any logic to terminate the container if it takes more than 10 seconds. For this, we use activeDeadlineSeconds.

google-cron.yaml

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: google-cron
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          activeDeadlineSeconds: 10
          containers:
          - name: hello
            image: progrium/busybox
            command: ["bin/sh"]
            args: ["curl", "google.com"]
          restartPolicy: OnFailure

Let’s try to apply the above & validate.

MyK8sInstance> kubectl apply -f google-cron.yaml
cronjob.batch/google-cron unchanged
MyK8sInstance> kubectl get jobs
NAME                     COMPLETIONS   DURATION   AGE
google-cron-1621877280   0/1           14s        14s
myjob                    1/1           4s         46m
MyK8sInstance> kubectl get pods | grep google
google-cron-1621877280-d7w2c 0/1 CrashLoopBackOff 1 20s
google-cron-1621877280-dwvql 0/1 CrashLoopBackOff 1 30s
google-cron-1621877280-gh77f 0/1 CrashLoopBackOff 1 10s

But why are the pods in a CrashLoopBackOff?

MyK8sInstance> kubectl logs google-cron-1621877280-gh77f
bin/sh: can't open 'curl': No such file or directory
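Two things are worth unpacking here. With command ["bin/sh"] and args ["curl", "google.com"], the container effectively runs bin/sh curl google.com, so sh tries to open curl as a script file, hence the error. Also relevant to fixing it: with sh -c, only the first argument after -c is treated as the command string; separate array elements are not joined into one command. A quick local sketch (pure shell, no cluster needed):

```shell
# sh -c executes exactly one command string; chain multiple commands
# inside that single string with ';' or '&&'.
sh -c 'echo first; echo second'            # runs both echo commands

# Extra arguments after the command string become positional
# parameters ($0, $1, ...), NOT additional commands.
sh -c 'echo only this runs' 'ignored-arg'  # prints: only this runs
```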

The below fixes the issue (we install curl first, then run it).

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: google-cron
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          activeDeadlineSeconds: 10
          containers:
          - name: hello
            image: progrium/busybox
            command: ["/bin/sh"]
            args: ["-c", "opkg-install curl; curl google.com"]
          restartPolicy: OnFailure

If you would like to try more examples related to K8s jobs & cronjobs, try this.

Till next time ciao & stay safe 🙉


Sandeep Baldawa

whoami >> Slack, Prev — Springpath (Acquired by Cisco), VMware, Backend Engineer, Build & Release, Infra, Devops & Cybersecurity Enthusiast