Lesson 1: Introducing Metafacture and requirements for this tutorial
Metafacture is an ETL (extract, transform, load) data processing toolkit with a focus on library metadata. It provides a versatile set of tools for reading, writing and transforming data. It was initially developed by the German National Library (DNB) starting in 2011 and has been maintained by hbz since 2019.
Metafacture can be used as a stand-alone application or as a Java library in other applications. The name “Metafacture” is a portmanteau of the words metadata and manufacture.
In this tutorial we will show you how to use Metafacture to perform simple and advanced data processing tasks. To process data, you can use Metafacture from the command line, as a Java library, or through the Metafacture Playground.
For this introduction we will start with the Playground, since it allows a quick start. The Metafacture Playground is a web interface for testing and sharing Metafacture workflows and requires no local installation.
Starting with Chapter 6, you can switch from the Playground to running Metafacture on your own hardware, but the examples will still be provided in the Playground.
To run Metafacture on your local machine, you need a Linux/Unix Bash shell (available on Linux, macOS and Windows 10 or later) with metafacture-core installed. See Chapter 6 for details.
Getting started with the Metafacture Playground
You will see what a Metafacture workflow looks like in the Playground in the following lessons. For a first overview, you will find an example workflow ‘How to use the Playground’ under the menu item ‘Load examples’. To run this example, just click on ‘Process’ in the menu.
Next lesson: 02 Introduction into Metafacture Flux