Bulkload data from an XBRL repository, Part 5: Run CellStore ETL Tool
- 1: Introduction and Installation
- 2: Metadata and Storage
- 3: Configure Local File System Repository
- 4: Configure AWS S3 Repository
- 5: Run CellStore ETL Tool
Part 5: Run CellStore ETL Tool
On the host machine, we have placed our reportix.properties file in /var/reportix/cellstore, which is also where we can later find the logs.
Add Only Archives that are not yet in the Database
If you only want to import filings that are not yet loaded into the database, you can use the add-missing action.
Without an explicit instanceId in the metadata, add-missing might not work as expected.
If you have not set the instanceId within a metadata file and have configured DEFAULT_INSTANCE_ID as UUID in your reportix.properties, then the ETL tool will generate a unique identifier. Hence, add-missing will always add a new filing, because a freshly generated UUID will not match one that is already in the database (strictly speaking, the probability of a collision is negligible rather than zero).
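The caveat above can be turned into a quick pre-flight check before running add-missing. The following is a minimal sketch, not part of the ETL tool: the helper name uses_uuid_instance_ids is hypothetical, and it assumes that reportix.properties uses plain KEY=VALUE (or KEY:VALUE) lines, which the text above does not confirm.

```shell
# Hypothetical helper: succeeds when DEFAULT_INSTANCE_ID is set to UUID.
# Assumption: reportix.properties uses KEY=VALUE or KEY:VALUE lines.
uses_uuid_instance_ids() {
  # $1: path to reportix.properties
  grep -Eq '^[[:space:]]*DEFAULT_INSTANCE_ID[[:space:]]*[=:][[:space:]]*UUID[[:space:]]*$' "$1"
}

# Location of the properties file on the host, as described above.
props=/var/reportix/cellstore/reportix.properties

if [ -f "$props" ] && uses_uuid_instance_ids "$props"; then
  echo "warning: DEFAULT_INSTANCE_ID is UUID; filings without an explicit instanceId will be added again by add-missing" >&2
fi
```

If the warning appears, setting an explicit instanceId in each metadata file is what makes add-missing recognize filings that are already loaded.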
With the example configuration from the previous steps we can run the CellStore ETL Tool as follows:
docker run -v /var/reportix/cellstore:/root/cellstore --restart=always --name cellstore-etl \
registry.reportix.com/cellstore/etl:vNEXT etl add-missing --config MYLOCALDIR \
--prefix test/instances/2017
With --prefix test/instances/2017, we load only the filings found in the corresponding subdirectory (whether the repository is on the local file system or in S3).
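For a local file system repository, it can help to preview what a prefix covers before starting a load. This is only a sketch under assumptions: REPO_ROOT is a hypothetical placeholder for the repository root configured in Part 3, list_under_prefix is an illustrative helper, and the prefix is assumed to map directly to a subdirectory below that root.

```shell
# List the files found below a prefix in a local file system repository.
# REPO_ROOT is a hypothetical placeholder; replace it with your repository root.
list_under_prefix() {
  # $1: repository root, $2: prefix relative to the root
  ( cd "$1" 2>/dev/null && [ -d "$2" ] && find "$2" -type f | sort )
}

REPO_ROOT=${REPO_ROOT:-/path/to/repository}
list_under_prefix "$REPO_ROOT" test/instances/2017 \
  || echo "prefix test/instances/2017 not found under $REPO_ROOT" >&2
```

The paths are printed relative to the repository root, so they can be compared directly with the value passed to --prefix.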
Replace all Archives and Add new Filings to the Database
If you also want to replace archives that already exist in the database, you can use the replace action.
This not only adds archives that are not yet loaded into the database, but also overwrites the ones that were loaded earlier.
This can be useful to reset a database to a certain point.
docker run -v /var/reportix/cellstore:/root/cellstore --restart=always --name cellstore-etl \
registry.reportix.com/cellstore/etl:vNEXT etl replace --config MASTER \
--prefix test/dts/2017
With --prefix test/dts/2017, we load only the archives found in the corresponding subdirectory (whether the repository is on the local file system or in S3).
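Both commands on this page name the container cellstore-etl, and Docker refuses to create a container whose name is already in use. If you run one action after the other, the earlier container has to be removed first. The sketch below shows one way to do that; reset_container is a hypothetical helper name, and whether you want to discard the old container (rather than keep it for inspection) is your call.

```shell
# Remove a leftover container so that its name can be reused by the next run.
reset_container() {
  # $1: container name; failures (for example, no such container) are ignored
  docker rm -f "$1" >/dev/null 2>&1 || true
}

# Only act when the docker CLI is available on this machine.
if command -v docker >/dev/null 2>&1; then
  reset_container cellstore-etl
fi
```

After this, either docker run command from this page can be started again under the same name.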