Getting Started

Teraslice is a distributed data processing platform designed to run in Kubernetes. A native clustering mode is also available for development. You can interact with Teraslice using curl or the teraslice CLI client. Once it is running, see the Using Teraslice section for details on how to run your first job.

Setup Teraslice

Teraslice requires a connection to an Elasticsearch or OpenSearch cluster in order to run correctly. Below is a quick guide to launching a functional local Teraslice instance backed by OpenSearch 1 (deployed below as the opensearch1 release) using helmfile. See the helm examples directory or the e2e helm directory for more comprehensive helmfile examples.

Required dependencies

  • Docker - Application Containerization Platform
  • Helm - The package manager for Kubernetes
  • helmfile - Deploy Kubernetes Helm Charts
  • Kind - Kubernetes in Docker
  • curl - Command-line tool for making HTTP requests
  • teraslice-cli - A CLI tool for managing Teraslice

Create a new file called kindConfig.yaml, paste in the following configuration, and save it.

kind: Cluster
name: k8s-env
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
  extraPortMappings:
  - containerPort: 30678 # Map internal teraslice api service to host port
    hostPort: 5678
  - containerPort: 30921 # Map internal opensearch1 service to host port
    hostPort: 9200

Next, run the kind command below to launch a kind cluster.

kind create cluster --config kindConfig.yaml

Next, create a file called helmfile.yaml, paste in the configuration below, and save it.

repositories:
  - name: opensearch
    url: https://opensearch-project.github.io/helm-charts/
  - name: terascope
    url: https://terascope.github.io/helm-charts/

helmDefaults:
  wait: true

releases:
  - name: opensearch1
    namespace: ts-dev1
    version: 2.17.1
    chart: opensearch/opensearch
    values:
      - replicas: 1
        singleNode: true
        image:
          tag: 1.3.14
        service:
          type: NodePort
          nodePort: 30921
        config:
          opensearch.yml:
            plugins:
              security:
                disabled: true
        masterService: opensearch1

  - name: teraslice
    namespace: ts-dev1
    version: 2.3.0
    chart: terascope/teraslice-chart
    needs:
      - ts-dev1/opensearch1
    values:
      - terafoundation:
          connectors:
            elasticsearch-next:
              default:
                node:
                  - "http://opensearch1.ts-dev1:9200"
        service:
          nodePort: 30678
          type: NodePort
        master:
          teraslice:
            kubernetes_namespace: ts-dev1
            cluster_manager_type: kubernetesV2
            asset_storage_connection_type: elasticsearch-next
        worker:
          teraslice:
            kubernetes_namespace: ts-dev1
            cluster_manager_type: kubernetesV2
            asset_storage_connection_type: elasticsearch-next

Run the following command to deploy the releases defined in helmfile.yaml to the local kind cluster:

helmfile sync

Using Teraslice

See the overview and terminology pages for details on how Teraslice works. See the management APIs or the teraslice CLI client for useful commands.

To ensure Teraslice is running, make a curl request to the API:

curl localhost:5678
{
    "arch": "x64",
    "clustering_type": "kubernetesV2",
    "name": "teraslice",
    "node_version": "v22.14.0",
    "platform": "linux",
    "teraslice_version": "v2.14.1"
}

Deploy Needed Assets

An asset bundle is a collection of processors or files that can be loaded and used within a job.

Public asset bundles are available, including standard-assets and elasticsearch-assets. The example job below requires both to be available in the cluster for successful execution. Use the teraslice-cli tool to deploy these assets:

teraslice-cli assets deploy localhost terascope/standard-assets
teraslice-cli assets deploy localhost terascope/elasticsearch-assets
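
Before submitting a job, it can help to confirm that both bundles are visible to the cluster. The following is a minimal sketch, not an official procedure: it assumes the Teraslice API exposes a plain-text asset listing at /txt/assets (by analogy with the /txt/ex endpoint used later in this guide) and that the asset names appear in that listing as standard and elasticsearch, the names the example job refers to. The check_assets function and the TS_URL variable are hypothetical names introduced for illustration.

```shell
# Sketch: confirm that named asset bundles appear in the asset listing.
# Assumptions: TS_URL points at the local API from this guide, and
# /txt/assets returns a plain-text table that includes asset names.
TS_URL="${TS_URL:-localhost:5678}"

check_assets() {
  # Fetch the listing once, then look for each requested name in it.
  listing="$(curl -s "$TS_URL/txt/assets")"
  missing=0
  for name in "$@"; do
    if printf '%s\n' "$listing" | grep -q "$name"; then
      echo "found: $name"
    else
      echo "missing: $name"
      missing=1
    fi
  done
  # Non-zero exit status if any requested asset was not listed.
  return "$missing"
}
```

Calling check_assets standard elasticsearch prints one line per asset and exits non-zero if either is absent, which makes it easy to chain in front of the job submission.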

Submitting and Starting a Test Job

This example job generates 10,000 records using the standard-assets data generator and writes them to an OpenSearch index named random-data-1. Submit the job to the Teraslice API using the following command:

curl -XPOST 'localhost:5678/v1/jobs' -H "Content-Type: application/json" -d '{
    "name": "data-to-es",
    "lifecycle": "once",
    "workers": 1,
    "assets": [
        "standard",
        "elasticsearch"
    ],
    "operations": [
        {
            "_op": "data_generator",
            "size": 10000
        },
        {
            "_op": "elasticsearch_bulk",
            "size": 10000,
            "index": "random-data-1"
        }
    ]
}'

Check the status of the job execution:

curl localhost:5678/txt/ex

Run the command several times to see the execution status move from initializing to running to completed.
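
Instead of rerunning the command by hand, the check can be wrapped in a small polling loop. This is a sketch under stated assumptions: it searches the whole /txt/ex output for the status word, so it is only reliable on a fresh cluster with a single execution, as in this guide. The wait_for_status function, the TS_URL variable, and the default of 30 attempts spaced 2 seconds apart are hypothetical choices, not Teraslice settings.

```shell
# Sketch: poll the execution listing until a status word appears.
# Assumption: only one execution exists, so any match refers to it.
TS_URL="${TS_URL:-localhost:5678}"

wait_for_status() {
  wanted="$1"          # status to wait for, e.g. completed
  attempts="${2:-30}"  # maximum number of polls (illustrative default)
  delay="${3:-2}"      # seconds between polls (illustrative default)
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl -s "$TS_URL/txt/ex" | grep -q "$wanted"; then
      echo "status reached: $wanted"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out waiting for: $wanted" >&2
  return 1
}
```

For the job above, wait_for_status completed blocks until the execution finishes or the attempts run out, returning a non-zero status on timeout.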

Viewing results in OpenSearch

Once the job completes, query OpenSearch to verify that the documents have been written successfully to the random-data-1 index. Use the following command to view the index information:

curl 'localhost:9200/_cat/indices?v&h=index,status,docs.count,docs.deleted,store.size,pri.store.size'

Results:

index                        status docs.count docs.deleted store.size pri.store.size
teraslice__assets            open            2            0      2.8mb          2.8mb
teraslice__state-2024.11     open            1            0     28.8kb         28.8kb
teraslice__ex                open            1            0     49.1kb         49.1kb
teraslice__jobs              open            1            0      5.6kb          5.6kb
random-data-1                open        10000            0        7mb            7mb
teraslice__analytics-2024.11 open            4            0     23.9kb         23.9kb
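
To check the document count directly rather than reading it from the table, the index can be queried through the _count API. The following is a hedged sketch: it assumes the response is the default compact JSON (no pretty-printing), and it extracts the number with grep and cut to avoid extra dependencies. The doc_count and verify_count functions and the OS_URL variable are hypothetical names introduced for illustration.

```shell
# Sketch: compare an index's document count with an expected value.
# Assumption: OS_URL points at the OpenSearch port mapped in kindConfig.yaml
# and the _count response is compact JSON such as {"count":10000,...}.
OS_URL="${OS_URL:-localhost:9200}"

doc_count() {
  # Pull the number out of the "count" field of the JSON response.
  curl -s "$OS_URL/$1/_count" | grep -o '"count":[0-9]*' | cut -d: -f2
}

verify_count() {
  actual="$(doc_count "$1")"
  if [ "$actual" = "$2" ]; then
    echo "ok: $1 holds $actual documents"
  else
    echo "mismatch: $1 holds ${actual:-unknown} documents, expected $2"
    return 1
  fi
}
```

For the example job, verify_count random-data-1 10000 reports whether all generated records reached the index.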