K8s Jobs & CronJobs
In this blog post, let’s try to understand what the K8s Job object is and why we need it.
Types of Jobs
1. Run to completion (Job)
Runs once for a specific invocation, without a schedule.
2. Scheduled (CronJob)
Runs at scheduled instances of time.
Why jobs? Why can't we just use a deployment to get our work done?
K8s Jobs are like a person on a mission: they do their work and return to base. They can do this either as a one-off or on a schedule. More details here. Deployments, in contrast, try to keep their replicas alive and will bring up a new replica if an existing one goes down. So the two sit at opposite ends of the spectrum.
How to know which APIs your K8s version supports?
This is a bit tangential, but it is important to know how to discover all the APIs that your K8s installation supports.
1. Run the below on the command line:
kubectl proxy --port=8080
Starting to serve on 127.0.0.1:8080
2. Now, if we go to http://127.0.0.1:8080/, we can see all the supported APIs; specifically, if we are looking for the Job-related APIs, we can go to http://127.0.0.1:8080/apis/batch/v1
Alternatively, one could also try the below.
MyK8sInstance> kubectl api-resources --api-group=batch
NAME       SHORTNAMES   APIGROUP   NAMESPACED   KIND
cronjobs   cj           batch      true         CronJob
jobs                    batch      true         Job

MyK8sInstance> kubectl explain job | less              =====> Explains the API
MyK8sInstance> kubectl explain job --recursive | less  =====> Parameters
This can be used to find details about any supported APIs.
All right, all good. Can we talk business here? Just tell me how to create a job
Let’s try an example of creating a Job for the busybox image. What we are trying to do here is invoke a Job that will run a command in the busybox container. We will start with the dry-run option. Looking at the various fields below, is there something missing 🤔
MyK8sInstance> kubectl create job --image=busybox myjob -o yaml --dry-run
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: null
  name: myjob
spec:
  template:
    spec:
      containers:
      - image: busybox
        name: myjob
What command is the above Job running? It looks like we don't have one in there.
Let’s add the command in there; also note that only a restartPolicy equal to Never or OnFailure is allowed (which makes sense given what we need Jobs for, i.e., execute a task and move on, so there is no need for the restart policy to be Always). Check the docs here.
apiVersion: batch/v1
kind: Job
metadata:
  creationTimestamp: null
  name: myjob
spec:
  template:
    spec:
      containers:
      - image: busybox
        name: myjob
        command: ["/bin/sh"]
        args: ["-c", "echo Hello World"]
      restartPolicy: OnFailure
Let’s now create the Job; something interesting to note is that the Job does not delete its Pod after running.
MyK8sInstance> kubectl apply -f job.yaml
job.batch/myjob configured
MyK8sInstance> kubectl get jobs
NAME COMPLETIONS DURATION AGE
myjob 1/1 3s 4m32s
But how do we know it did its work (printed Hello World)?
MyK8sInstance> kubectl get po | grep myjob
myjob-rhfhm 0/1 Completed 0 6m38s
MyK8sInstance> kubectl logs myjob-rhfhm
Hello World
What happens if I delete the job, will the pods related to it remain?
Not really; the pods associated with a job are also deleted once a job is deleted.
MyK8sInstance> kubectl delete job myjob
job.batch "myjob" deletedMyK8sInstance> kubectl get po | grep myjob
How do we find all pods related to a given job?
Source:- here
MyK8sInstance> pods=$(kubectl get pods --selector=job-name=myjob --output=jsonpath='{.items[*].metadata.name}')
MyK8sInstance> echo $pods
myjob-d7n8b
What if a job keeps failing, can we limit how many times we retry on failure?
The backoffLimit field is useful here to define an upper limit after which retries on failure are not performed. Source:- here
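As a rough sketch (the Job name and the value 4 below are arbitrary examples, not from this post), backoffLimit sits directly under the Job's spec:

apiVersion: batch/v1
kind: Job
metadata:
  name: myjob-retry
spec:
  backoffLimit: 4        # stop retrying after 4 failed attempts (example value)
  template:
    spec:
      containers:
      - image: busybox
        name: myjob-retry
        command: ["/bin/sh"]
        args: ["-c", "exit 1"]   # always fails, so the backoff limit eventually kicks in
      restartPolicy: Never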
Can we limit the job to run in a specific time duration & fail it if it does not?
Yes, using activeDeadlineSeconds. Once a Job reaches its activeDeadlineSeconds, all of its running Pods are terminated, and the Job status becomes type: Failed with reason: DeadlineExceeded. Source:- here
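For illustration, a minimal sketch with the field at the Job level (the Job name and the 30-second value are made up for this example):

apiVersion: batch/v1
kind: Job
metadata:
  name: myjob-deadline
spec:
  activeDeadlineSeconds: 30   # fail the Job if it is still running after 30s (example value)
  template:
    spec:
      containers:
      - image: busybox
        name: myjob-deadline
        command: ["/bin/sh"]
        args: ["-c", "sleep 60"]   # deliberately sleeps past the deadline
      restartPolicy: Never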
Can we automatically cleanup Jobs? (might be difficult to remember cleaning them)
Finished Jobs (either Complete or Failed) can be cleaned up automatically using the TTL mechanism provided by the TTL controller for finished resources, by specifying the ttlSecondsAfterFinished field of the Job. Source:- here
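A minimal sketch, assuming your cluster supports the TTL-after-finished feature (the Job name and the 100-second value are just examples):

apiVersion: batch/v1
kind: Job
metadata:
  name: myjob-ttl
spec:
  ttlSecondsAfterFinished: 100   # delete the Job (and its Pods) 100s after it finishes (example value)
  template:
    spec:
      containers:
      - image: busybox
        name: myjob-ttl
        command: ["/bin/sh"]
        args: ["-c", "echo Hello World"]
      restartPolicy: Never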
There are a lot more use-cases like running jobs in parallel, suspending jobs, etc. One can read the documentation for more details.
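To give a flavour of the parallel case, here is a sketch using the completions, parallelism, and suspend fields of the Job spec (the name and values are made up; suspend also requires a K8s version that supports Job suspension):

apiVersion: batch/v1
kind: Job
metadata:
  name: myjob-parallel
spec:
  completions: 6    # the Job is complete once 6 Pods finish successfully
  parallelism: 2    # run at most 2 Pods at the same time
  suspend: false    # flip to true to pause the Job (newer K8s versions only)
  template:
    spec:
      containers:
      - image: busybox
        name: worker
        command: ["/bin/sh"]
        args: ["-c", "echo doing one slice of the work"]
      restartPolicy: Never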
Cool cool, but how do we run a job regularly or irregularly on a given schedule?
Using CronJobs. Example:- send an email every day at 5 am.
Let’s try creating a cronjob
MyK8sInstance> kubectl create cronjob --image=busybox myjob -o yaml --dry-run --schedule="*/15 * * * *" > cronjob.yaml
Example of a cronjob YAML file
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: myjob
spec:
  jobTemplate:
    metadata:
      name: myjob
    spec:
      template:
        spec:
          containers:
          - image: busybox
            name: myjob
            command: ["/bin/sh"]
            args: ["-c", "echo Hello World from crobJob"]
          restartPolicy: OnFailure
  schedule: '*/1 * * * *'
Let’s apply the above and validate.
MyK8sInstance> kubectl apply -f cronjob.yaml
cronjob.batch/myjob created
MyK8sInstance> kubectl get cronjob
NAME SCHEDULE SUSPEND ACTIVE LAST SCHEDULE AGE
myjob */1 * * * * False 0 34s 99s
So what's happening here, is the CronJob internally creating Jobs?
Absolutely; if we keep a watch on the Jobs, it becomes pretty clear.
MyK8sInstance> kubectl get job -w
NAME COMPLETIONS DURATION AGE
myjob 1/1 4s 29m
myjob-1621876140 1/1 3s 2m15s
myjob-1621876200 1/1 3s 75s
myjob-1621876260 1/1 3s 15s
MyK8sInstance> kubectl get pods | grep myjob
myjob-1621876140-xzlks 0/1 Completed 0 3m10s
myjob-1621876200-ftzzn 0/1 Completed 0 2m10s
myjob-1621876260-c4n2p 0/1 Completed 0 70s
myjob-1621876320-q8bp4 0/1 Completed 0 10s
MyK8sInstance> kubectl logs myjob-1621876200-ftzzn
Hello World from crobJob
If the schedule syntax is confusing, you could read this. I usually like to use this for creating a cron schedule.
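As a quick reference, this is how the five fields of a cron schedule line up (generic cron notation, not specific to this post):

# ┌───────────── minute (0 - 59)
# │ ┌───────────── hour (0 - 23)
# │ │ ┌───────────── day of the month (1 - 31)
# │ │ │ ┌───────────── month (1 - 12)
# │ │ │ │ ┌───────────── day of the week (0 - 6, Sunday = 0)
# │ │ │ │ │
# * * * * *
#
# "*/1 * * * *"   -> every minute (used above)
# "*/15 * * * *"  -> every 15 minutes
# "0 5 * * *"     -> every day at 5 am (the email example)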
Let’s cleanup
This cleans up all the associated Jobs and Pods for the given CronJob.
MyK8sInstance> kubectl delete cronjob myjob
cronjob.batch "myjob" deletedMyK8sInstance> kubectl get cronjob
No resources found in default namespace.MyK8sInstance> kubectl get pods | grep myjob
Now that we have some theory under our belt, let’s try some more examples to understand things better.
Create a Job named google-cron. The Job should run every minute and should run the command curl google.com. Terminate the container within 10 seconds if it does not finish.
Using K8s documentation here, we create the below.
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: google-cron
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          containers:
          - name: hello
            image: progrium/busybox
            command: ["bin/sh"]
            args: ["curl", "google.com"]
          restartPolicy: OnFailure
What are we missing? We don't have any logic to terminate the container if it takes more than 10 seconds. For this, we use activeDeadlineSeconds.
google-cron.yaml

apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: google-cron
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          activeDeadlineSeconds: 10
          containers:
          - name: hello
            image: progrium/busybox
            command: ["bin/sh"]
            args: ["curl", "google.com"]
          restartPolicy: OnFailure
Let’s try to apply the above & validate.
MyK8sInstance> kubectl apply -f google-cron.yaml
cronjob.batch/google-cron unchanged
MyK8sInstance> kubectl get jobs
NAME COMPLETIONS DURATION AGE
google-cron-1621877280 0/1 14s 14s
myjob 1/1 4s 46m
MyK8sInstance> kubectl get pods | grep google
google-cron-1621877280-d7w2c 0/1 CrashLoopBackOff 1 20s
google-cron-1621877280-dwvql 0/1 CrashLoopBackOff 1 30s
google-cron-1621877280-gh77f 0/1 CrashLoopBackOff 1 10s
But why are the pods in a CrashLoopBackOff?
MyK8sInstance> kubectl logs google-cron-1621877280-gh77f
bin/sh: can't open 'curl': No such file or directory
The error happens because sh is being asked to run curl as a script, and curl is not even installed in the busybox image. The below fixes the issue (we install curl first and run everything through sh -c).
apiVersion: batch/v1beta1
kind: CronJob
metadata:
  name: google-cron
spec:
  schedule: "*/1 * * * *"
  jobTemplate:
    spec:
      template:
        spec:
          activeDeadlineSeconds: 10
          containers:
          - name: hello
            image: progrium/busybox
            command: ["/bin/sh"]
            args: ["-c", "opkg-install curl; curl google.com"]   # install curl first, then call it
          restartPolicy: OnFailure
If you would like to try more examples related to K8s jobs & cronjobs, try this.
Till next time ciao & stay safe 🙉