
File Asset Bundle
A set of Teraslice processors for working with data stored in files on disk. The readers utilize the chunked-file-reader module to break data into records.
Since all the readers in this asset bundle use DataEntities, the slice's file path can be retrieved from each record with a call such as record.getMetadata('path'). More information about DataEntities can be found here.
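As a minimal sketch of the metadata lookup described above, the snippet below counts records per source file path. `RecordLike` and `StubRecord` are hypothetical stand-ins defined here only so the example runs on its own; in a real processor the records would be the DataEntities produced by a reader, and only the `getMetadata('path')` call comes from the text above.

```typescript
// Hypothetical minimal shape: just the getMetadata accessor mentioned above.
interface RecordLike {
  getMetadata(key: string): unknown;
}

// Stand-in record for illustration only; NOT the real DataEntity class.
class StubRecord implements RecordLike {
  readonly data: Record<string, unknown>;
  private readonly metadata: Record<string, unknown>;

  constructor(data: Record<string, unknown>, metadata: Record<string, unknown>) {
    this.data = data;
    this.metadata = metadata;
  }

  getMetadata(key: string): unknown {
    return this.metadata[key];
  }
}

// Count how many records came from each file path.
function countByPath(records: RecordLike[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const record of records) {
    const path = String(record.getMetadata('path'));
    counts.set(path, (counts.get(path) ?? 0) + 1);
  }
  return counts;
}

// Example input: two records from one file, one from another (made-up paths).
const records = [
  new StubRecord({ id: 1 }, { path: '/data/a.json' }),
  new StubRecord({ id: 2 }, { path: '/data/a.json' }),
  new StubRecord({ id: 3 }, { path: '/data/b.json' }),
];
const counts = countByPath(records);
```

Typing against a narrow interface rather than a concrete class keeps the helper usable with any record object that exposes `getMetadata`.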
APIs
- file_reader_api
- file_sender_api
- hdfs_reader_api
- hdfs_sender_api
- s3_reader_api
- s3_sender_api
Operations
- file_exporter
- file_reader
- hdfs_append
- hdfs_reader
- s3_exporter
- s3_reader
Releases
You can find a list of releases, changes, and pre-built asset bundles here.
Getting Started
This asset bundle requires a running Teraslice cluster (see the Teraslice documentation).
# Step 1: make sure you have teraslice-cli installed
yarn global add teraslice-cli
# Step 2: deploy the asset bundle
# usage: teraslice-cli assets deploy <cluster_alias> <asset-name[@version]>
# deploy the latest release to a teraslice cluster
teraslice-cli assets deploy cluster1 terascope/file-assets
# or deploy a specific version to a teraslice cluster
teraslice-cli assets deploy localCluster terascope/file-assets@3.0.5
# or build from source and deploy to a teraslice cluster
teraslice-cli assets deploy cluster2 --build
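To illustrate the `<asset-name[@version]>` argument format used in the commands above, here is a small sketch that splits such a string into its name and optional version. `parseAssetSpec` and `AssetSpec` are hypothetical names introduced for this example; this is not how teraslice-cli itself parses the argument.

```typescript
// Hypothetical result shape for an "<asset-name[@version]>" string.
interface AssetSpec {
  name: string;
  version?: string;
}

// Split on the last "@"; with no "@" present, the asset has no pinned version.
function parseAssetSpec(spec: string): AssetSpec {
  const at = spec.lastIndexOf('@');
  if (at <= 0) {
    return { name: spec };
  }
  return { name: spec.slice(0, at), version: spec.slice(at + 1) };
}

const latest = parseAssetSpec('terascope/file-assets');
const pinned = parseAssetSpec('terascope/file-assets@3.0.5');
```

Here `latest` carries only a name, matching the "deploy the latest release" form, while `pinned` also carries the version `3.0.5`.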
Connectors
Terafoundation connector for S3-compatible clients.
S3 Connector
Configuration:
The S3 connector configuration, in your Teraslice configuration file, includes the following parameters:
Configuration | Description | Type | Notes |
---|---|---|---|
endpoint | Target S3 HTTP endpoint, must be a URL | String | optional, defaults to http://127.0.0.1:80 |
accessKeyId | S3 access key ID | String | required |
secretAccessKey | S3 secret access key | String | required |
region | AWS Region where bucket is located | String | optional, defaults to us-east-1 |
maxRetries | Maximum retry attempts | Number | optional, defaults to 3 |
sslEnabled | Flag to enable/disable SSL communication | Boolean | optional, defaults to true |
caCertificate | A string containing one or more CA certificates | String | optional, defaults to ' ' |
certLocation | DEPRECATED - use caCertificate. Location of the SSL certificate | String | optional, defaults to ' ' |
forcePathStyle | Whether to force path style URLs for S3 objects | Boolean | optional, defaults to false |
bucketEndpoint | Whether to use the bucket name as the endpoint for this request | Boolean | optional, defaults to false |
Terafoundation S3 configuration example:
terafoundation:
    connectors:
        s3:
            default:
                endpoint: "http://localhost:9000"
                accessKeyId: "yourId"
                secretAccessKey: "yourPassword"
                forcePathStyle: true
                sslEnabled: true
                caCertificate: |
                    -----BEGIN CERTIFICATE-----
                    MIICGTCCAZ+gAwIBAgIQCeCTZaz32ci5PhwLBCou8zAKBggqhkjOPQQDAzBOMQs
                    ...
                    DXZDjC5Ty3zfDBeWUA==
                    -----END CERTIFICATE-----
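The way the required and optional parameters in the table combine can be sketched as a small defaults-merging helper. This is only an illustration under stated assumptions: `resolveS3Config`, `S3ConnectorInput`, and `S3_DEFAULTS` are hypothetical names, the defaults are copied from the table above, and the certificate fields (`caCertificate`, `certLocation`) are left out for brevity. It does not reproduce how the connector validates its configuration internally.

```typescript
// Field names and default values are taken from the configuration table above.
interface S3ConnectorConfig {
  endpoint: string;
  accessKeyId: string;
  secretAccessKey: string;
  region: string;
  maxRetries: number;
  sslEnabled: boolean;
  forcePathStyle: boolean;
  bucketEndpoint: boolean;
}

// Credentials are required; every other field is optional.
type S3ConnectorInput =
  Pick<S3ConnectorConfig, 'accessKeyId' | 'secretAccessKey'> & Partial<S3ConnectorConfig>;

const S3_DEFAULTS = {
  endpoint: 'http://127.0.0.1:80',
  region: 'us-east-1',
  maxRetries: 3,
  sslEnabled: true,
  forcePathStyle: false,
  bucketEndpoint: false,
};

// Hypothetical helper: check required fields, then fill in documented defaults.
function resolveS3Config(input: S3ConnectorInput): S3ConnectorConfig {
  if (!input.accessKeyId || !input.secretAccessKey) {
    throw new Error('accessKeyId and secretAccessKey are required');
  }
  const merged: S3ConnectorConfig = { ...S3_DEFAULTS, ...input };
  // The table says the endpoint must be a URL; new URL() throws if it is not.
  new URL(merged.endpoint);
  return merged;
}

// Mirrors the non-certificate values of the YAML example above.
const resolved = resolveS3Config({
  endpoint: 'http://localhost:9000',
  accessKeyId: 'yourId',
  secretAccessKey: 'yourPassword',
  forcePathStyle: true,
  sslEnabled: true,
});
```

With this input, `region`, `maxRetries`, and `bucketEndpoint` fall back to the documented defaults, while `endpoint` and `forcePathStyle` keep the explicitly supplied values.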
Development
Tests
Run the file-assets tests
Requirements:
yarn test
Build
Build a compiled asset bundle to deploy to a teraslice cluster.
Install Teraslice CLI:
yarn global add teraslice-cli
Build the asset bundle:
teraslice-cli assets build
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Please make sure to update tests as appropriate.
License
MIT licensed.