Bulkload data from an XBRL repository, Part 4: Configure AWS S3 Repository


Part 4: Configure AWS S3 Repository

The CellStore ETL Tool can be configured to bulkload XBRL archives from an AWS S3 bucket. Add the tool's configuration to the reportix.properties file:

ETL_<myConfigName>_SOURCE_TYPE=S3
ETL_<myConfigName>_AWS_ACCESS_KEY_ID=KKKKKKKKKKKKKKKKKK
ETL_<myConfigName>_AWS_SECRET_ACCESS_KEY=SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
ETL_<myConfigName>_AWS_REGION=us-east-2
ETL_<myConfigName>_AWS_BUCKET=my.data.bucket.example.com
ETL_<myConfigName>_LOG_TYPE=FILE
ETL_<myConfigName>_LOG_PATH=log/etl_%tF.log

Because the same reportix.properties file can hold several configurations, each configuration needs its own custom name, which replaces the placeholder <myConfigName>.
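
As a rough illustration of how the custom name keeps configurations apart, the following Python sketch groups entries by name. It is not part of the ETL Tool: the function `group_etl_configs`, the simplified key=value parsing, and the second configuration `ARCHIVE` with its bucket are assumptions made for this example.

```python
from collections import defaultdict

# Entry suffixes documented on this page.
KNOWN_SUFFIXES = (
    "SOURCE_TYPE",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "AWS_BUCKET",
    "LOG_TYPE",
    "LOG_PATH",
)


def group_etl_configs(text):
    """Group ETL_<name>_<suffix>=value lines by <name> (simplified parser)."""
    configs = defaultdict(dict)
    for raw in text.splitlines():
        line = raw.strip()
        # Skip blank lines, comments, and lines without a key=value pair.
        if not line or line.startswith(("#", "!")) or "=" not in line:
            continue
        key, value = line.split("=", 1)
        key = key.strip()
        if not key.startswith("ETL_"):
            continue
        for suffix in KNOWN_SUFFIXES:
            if key.endswith("_" + suffix):
                # Everything between "ETL_" and "_<suffix>" is the config name.
                name = key[len("ETL_"):-len("_" + suffix)]
                if name:
                    configs[name][suffix] = value.strip()
                break
    return dict(configs)


# ARCHIVE and its bucket are hypothetical, used only to show two configurations.
sample = """
ETL_MASTER_SOURCE_TYPE=S3
ETL_MASTER_AWS_BUCKET=masterdata.example.com
ETL_ARCHIVE_SOURCE_TYPE=S3
ETL_ARCHIVE_AWS_BUCKET=archive.example.com
"""
configs = group_etl_configs(sample)
print(sorted(configs))
```

Each name maps to its own set of entries, so two S3 sources can sit side by side in one file without clashing.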

| Config Entry | Description |
| --- | --- |
| ETL_<myConfigName>_SOURCE_TYPE | The source type, i.e. where the imported filings are fetched from (S3 for an AWS S3 bucket) |
| ETL_<myConfigName>_AWS_ACCESS_KEY_ID | The AWS access key ID used for authentication (needs at least read permission on the bucket) |
| ETL_<myConfigName>_AWS_SECRET_ACCESS_KEY | The AWS secret access key used for authentication |
| ETL_<myConfigName>_AWS_REGION | The AWS region in which the S3 bucket was created |
| ETL_<myConfigName>_AWS_BUCKET | The AWS S3 bucket from which to fetch filings |
| ETL_<myConfigName>_LOG_TYPE | How the result of processing each filing is logged (currently the only supported value, and the default, is “FILE”) |
| ETL_<myConfigName>_LOG_PATH | Where the result of each processed filing is logged; one log line is added per filing. The default is log/etl_%tF.log, which stores the logs in the server home directory |
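
The `%tF` in the default log path looks like a date conversion in the style of Java's `java.util.Formatter`, where `%tF` yields the ISO 8601 date (YYYY-MM-DD). Assuming the tool expands it that way with the current date, which this page does not confirm, the resulting file name can be sketched as follows. The helper `expand_log_path` is hypothetical.

```python
from datetime import date


def expand_log_path(pattern, day):
    # Assumption: %tF behaves as in java.util.Formatter, i.e. YYYY-MM-DD.
    return pattern.replace("%tF", day.isoformat())


# A fixed date is used so the result is reproducible.
print(expand_log_path("log/etl_%tF.log", date(2024, 1, 15)))
# prints "log/etl_2024-01-15.log"
```

Under that assumption the tool would write one log file per calendar day.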

For example, we can create an S3 source named MASTER:

ETL_MASTER_SOURCE_TYPE=S3
ETL_MASTER_AWS_ACCESS_KEY_ID=AKIAJSL6ZPLEGE6QKD2Q
ETL_MASTER_AWS_SECRET_ACCESS_KEY=UDSRTanRJjGw7zOzZ/NotValid1onAiqXAytestdknp
ETL_MASTER_AWS_REGION=eu-central-1
ETL_MASTER_AWS_BUCKET=masterdata.example.com
ETL_MASTER_LOG_TYPE=FILE
ETL_MASTER_LOG_PATH=log/etl_%tF.log
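
Before starting a bulkload, it can help to check that a named configuration is complete. The sketch below is one way to do that outside the tool. Treating the five non-logging entries as required is an assumption based on the table above, where only the two logging entries document a default. The function `missing_entries` and the placeholder credential values are hypothetical.

```python
# Assumed to be required: the entries for which no default is documented.
REQUIRED_SUFFIXES = (
    "SOURCE_TYPE",
    "AWS_ACCESS_KEY_ID",
    "AWS_SECRET_ACCESS_KEY",
    "AWS_REGION",
    "AWS_BUCKET",
)


def missing_entries(properties, config_name):
    """Return the required keys that are absent or empty for one configuration."""
    missing = []
    for suffix in REQUIRED_SUFFIXES:
        key = f"ETL_{config_name}_{suffix}"
        if not properties.get(key):
            missing.append(key)
    return missing


# Placeholder values; the region entry is left out on purpose.
properties = {
    "ETL_MASTER_SOURCE_TYPE": "S3",
    "ETL_MASTER_AWS_ACCESS_KEY_ID": "KKKKKKKKKKKKKKKKKK",
    "ETL_MASTER_AWS_SECRET_ACCESS_KEY": "SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS",
    "ETL_MASTER_AWS_BUCKET": "masterdata.example.com",
}
print(missing_entries(properties, "MASTER"))
# prints "['ETL_MASTER_AWS_REGION']"
```

An empty list means every entry assumed to be required is present for that configuration name.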

