Bulkload data from an XBRL repository, Part 1: Introduction and Installation
- 1: Introduction and Installation
- 2: Metadata and Storage
- 3: Configure Local File System Repository
- 4: Configure AWS S3 Repository
- 5: Run CellStore ETL Tool
Part 1: Introduction and Installation
Introduction
Single archives containing XBRL instances or Discoverable Taxonomy Sets (DTS) can be imported into CellStore through the Admin UI, the REST API import instances endpoint, or the import DTS endpoint. However, importing large repositories of filings or taxonomies is inconvenient with the low-level REST API or a manual process in the Admin UI.
For this reason, Reportix provides the CellStore ETL Tool, which helps you load large bulks of data (or chunks thereof). We call the physical storage of such bulk data an “XBRL Repository”. An XBRL repository contains archives (zip files), each containing a single XBRL instance or DTS and, optionally, accompanying metadata in a JSON file with the same name.
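To make the repository layout concrete, here is a minimal sketch that walks a repository directory and reports which archives have a metadata file. It assumes the JSON file sits next to the archive and shares its base name (for example `filing.zip` and `filing.json`); that placement and the function name `list_archives` are illustrative assumptions, not something the ETL Tool defines.

```shell
# Sketch only: list_archives is a hypothetical helper, and the layout
# (metadata stored beside the archive as <name>.json) is an assumption.
list_archives() {
  repo_dir="$1"
  for archive in "$repo_dir"/*.zip; do
    # If the glob matched nothing, the literal pattern is returned; skip it.
    [ -e "$archive" ] || continue
    base="${archive%.zip}"
    if [ -f "$base.json" ]; then
      echo "$(basename "$archive") metadata=yes"
    else
      echo "$(basename "$archive") metadata=no"
    fi
  done
}
```

Calling `list_archives /path/to/xbrl-repository` prints one line per archive, which can help spot archives without metadata before a bulk load.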
The CellStore ETL Tool is usually distributed as a compressed archive or through Reportix’ private Docker registry.
Loading the CellStore ETL Tool image
The first step of the installation is loading the CellStore ETL Tool image onto your Docker host.
If installing from a compressed archive:
gunzip -c cellstore-etl-vNEXT.gz | docker load
If installing from Reportix’ private Docker registry, first request access from Reportix Support. Once access has been granted, run:
docker login registry.reportix.com -u username
docker pull registry.reportix.com/cellstore/etl:vNEXT
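After either installation path, it can be useful to confirm that the image is present on the Docker host. The sketch below wraps `docker image inspect`, which exits with a non-zero status when the image reference is unknown locally. The helper name `image_present` is hypothetical, and the tag in the usage comment is the `vNEXT` placeholder from this page. An image loaded from the compressed archive may carry a different name, so check the output of `docker load` in that case.

```shell
# Sketch only: image_present is a hypothetical helper.
# Returns success if the given image reference exists on the local Docker host.
image_present() {
  docker image inspect "$1" >/dev/null 2>&1
}

# Example usage (replace vNEXT with the version you installed):
#   image_present registry.reportix.com/cellstore/etl:vNEXT && echo "image is available"
```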
Configuring the CellStore ETL Tool
After installation, you need to set up the CellStore home directory containing a reportix.properties file, unless you have already set this up for CellStore itself. In that case, we recommend using the same CellStore home directory for the server and the ETL Tool (there is no need to set up a separate home directory for the ETL Tool).
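As a quick sanity check before running the tool, the following sketch verifies that a candidate home directory contains a reportix.properties file. The helper name `check_cellstore_home` and the directory passed to it are illustrative assumptions; the required contents of reportix.properties are not covered here.

```shell
# Sketch only: check_cellstore_home is a hypothetical helper.
# Succeeds if <home_dir>/reportix.properties exists, fails otherwise.
check_cellstore_home() {
  home_dir="$1"
  if [ -f "$home_dir/reportix.properties" ]; then
    echo "ok: $home_dir/reportix.properties found"
  else
    echo "missing: $home_dir/reportix.properties" >&2
    return 1
  fi
}
```

For example, `check_cellstore_home /path/to/cellstore-home` reports whether an existing CellStore server home directory can be reused for the ETL Tool.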