Available flux commands (with release 1.2.0)

add-oreaggregation

  • description: Adds ore:Aggregation to a Europeana Data Model stream. The aggregation id is set by emitting literal(‘aggregation_id’, id).
  • signature: StreamReceiver -> StreamReceiver
  • java class: org.metafacture.linkeddata.OreAggregationAdder

add-preamble-epilogue

as-formeta-records

as-lines

as-records

batch-log

batch-reset

calculate-metrics

  • description: Calculates values for various co-occurrence metrics. The expected inputs are triples containing the variable name as subject and the count as object. Marginal counts must appear first, joint counts second. Marginal counts must be written as 1:A, joint counts as 2:A&B.
  • signature: Triple -> Triple
  • java class: org.metafacture.statistics.CooccurrenceMetricCalculator
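
The input convention above (marginal counts as 1:A, joint counts as 2:A&B) can be illustrated with a small Python sketch. It is not Metafacture code. The Jaccard index is used only as an example of a co-occurrence metric; the source does not list which metrics the module computes, and the function name `jaccard` is hypothetical.

```python
def jaccard(counts, a, b):
    """Jaccard index from marginal counts 1:A, 1:B and joint count 2:A&B."""
    joint = counts[f"2:{a}&{b}"]
    union = counts[f"1:{a}"] + counts[f"1:{b}"] - joint
    return joint / union if union else 0.0


# Triples reduced to (subject, object) pairs: marginal counts first, joint counts second.
triples = [("1:A", 10), ("1:B", 5), ("2:A&B", 3)]
counts = dict(triples)
score = jaccard(counts, "A", "B")
```

With these counts the union is 10 + 5 - 3 = 12, so the score is 3 / 12.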

catch-object-exception

catch-stream-exception

change-id

  • description: By default changes the record ID to the value of the ‘_id’ literal (if present). Use the constructor to choose another literal as ID source.
  • options: keepidliteral (boolean), idliteral (String), keeprecordswithoutidliteral (boolean)
  • signature: StreamReceiver -> StreamReceiver
  • example in Playground
  • java class: org.metafacture.mangling.RecordIdChanger
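
A rough Python sketch of the behaviour described above, not the module's implementation. The default values of `keep_id_literal` and `keep_records_without_id_literal`, and the choice of the first matching literal when several exist, are assumptions inferred from the option names.

```python
def change_id(record_id, literals, id_literal="_id",
              keep_id_literal=False, keep_records_without_id_literal=True):
    """Return (new_id, literals), or None if the record is dropped."""
    values = [value for name, value in literals if name == id_literal]
    if not values:
        # No ID literal: keep the record unchanged or drop it (assumed semantics).
        return (record_id, literals) if keep_records_without_id_literal else None
    if keep_id_literal:
        remaining = literals
    else:
        remaining = [(n, v) for n, v in literals if n != id_literal]
    return values[0], remaining


changed = change_id("1", [("_id", "x9"), ("title", "Faust")])
```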

collect-triples

count-triples

  • description: Counts triples
  • options: countpredicate (String), countby [SUBJECT, PREDICATE, OBJECT, ALL]
  • signature: Triple -> Triple
  • java class: org.metafacture.triples.TripleCount

decode-aseq

decode-csv

decode-formeta

decode-html

  • description: Decodes HTML to metadata events. The attrValsAsSubfields option can be used to override the default attribute values to be used as subfields (e.g. by default link rel="canonical" href="http://example.org" becomes link.canonical). It expects an HTTP-style query string specifying as key the attributes whose value should be used as a subfield, and as value the attribute whose value should be the subfield value, e.g. the default contains link.rel=href. To use the HTML element text as the value (instead of another attribute), omit the value of the query-string key-value pair, e.g. title.lang. To add to the defaults, instead of replacing them, start with an &, e.g. &h3.class.
  • options: attrvalsassubfields (String)
  • signature: Reader -> StreamReceiver
  • example in Playground
  • java class: org.metafacture.html.HtmlDecoder
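
The query-string format of attrValsAsSubfields can be sketched in Python as follows. This only illustrates the parsing rules described above and is not the decoder's code. `DEFAULTS` contains just the one default entry named in the description (the full default set is not listed here), and `parse_attr_vals` is a hypothetical helper name.

```python
DEFAULTS = {"link.rel": "href"}  # only the default entry mentioned in the description


def parse_attr_vals(option, defaults=DEFAULTS):
    """Map 'element.attribute' keys to the attribute supplying the subfield value.

    A value of None means: use the element text as the subfield value.
    """
    # A leading '&' adds to the defaults instead of replacing them.
    mapping = dict(defaults) if option.startswith("&") else {}
    for pair in option.split("&"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        mapping[key] = value or None
    return mapping


extended = parse_attr_vals("&h3.class")
```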

decode-json

  • description: Decodes JSON to metadata events. The ‘recordPath’ option can be used to set a JsonPath to extract a path as JSON - or to split the data into multiple JSON documents.
  • options: recordid (String), booleanmarker (String), recordcount (int), arraymarker (String), arrayname (String), recordpath (String), numbermarker (String), allowcomments (boolean)
  • signature: String -> StreamReceiver
  • example in Playground
  • java class: org.metafacture.json.JsonDecoder

decode-mab

decode-marc21

decode-pica

  • description: Parses PICA+ records. The parser only parses single records. A string containing multiple records must be split into individual records before passing it to PicaDecoder.
  • options: trimfieldnames (boolean), normalizedserialization (boolean), ignoremissingidn (boolean), skipemptyfields (boolean), normalizeutf8 (boolean)
  • signature: String -> StreamReceiver
  • example in Playground
  • java class: org.metafacture.biblio.pica.PicaDecoder

decode-string

  • description: Splits a String into several Strings, either by extracting parts that match a regexp or by splitting by a regexp.
  • options: mode [SPLIT, EXTRACT]
  • signature: String -> String
  • java class: org.metafacture.strings.StringDecoder
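
The difference between the two modes can be shown with Python's `re` module. This is a sketch of the idea, not the module's code; the function name `decode_string` and the default mode are assumptions.

```python
import re


def decode_string(text, pattern, mode="SPLIT"):
    """Split text at matches of pattern, or extract the matching parts."""
    if mode == "SPLIT":
        return re.split(pattern, text)
    if mode == "EXTRACT":
        return [match.group(0) for match in re.finditer(pattern, text)]
    raise ValueError(f"unknown mode: {mode}")


parts = decode_string("a,b;c", r"[,;]")
numbers = decode_string("id 12 and 345", r"\d+", mode="EXTRACT")
```

In SPLIT mode the regexp describes the separators; in EXTRACT mode it describes the parts to keep.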

decode-xml

  • description: Reads an XML file and passes the XML events to a receiver. Set totalEntitySizeLimit="0" to allow unlimited XML entities.
  • options: totalentitysizelimit (String)
  • signature: Reader -> XmlReceiver
  • example in Playground
  • java class: org.metafacture.xml.XmlDecoder

decode-yaml

decouple

defer-stream

digest-file

discard-events

  • options: discardlifecycleevents (boolean), discardliteralevents (boolean), discardentityevents (boolean), discardrecordevents (boolean)
  • signature: StreamReceiver -> StreamReceiver
  • java class: org.metafacture.mangling.StreamEventDiscarder

draw-uniform-sample

encode-csv

  • description: Encodes each value in a record as a CSV row.
  • options: includeheader (boolean), noquotes (boolean), separator (String), includerecordid (boolean)
  • signature: StreamReceiver -> String
  • example in Playground
  • java class: org.metafacture.csv.CsvEncoder

encode-formeta

encode-json

encode-literals

  • description: Outputs the name and value of each literal which is received as a string. Name and value are separated by a separator string. The default separator string is a tab. If a literal name is empty, only the value will be output without a separator. The module ignores record and entity events. In particular, this means that literal names are not prefixed by the name of the entity which contains them.
  • options: separator (String)
  • signature: StreamReceiver -> String
  • example in Playground
  • java class: org.metafacture.formatting.StreamLiteralFormatter
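
The formatting rule described above fits in a few lines of Python. This is an illustrative sketch with a hypothetical function name, not the module's code.

```python
def format_literal(name, value, separator="\t"):
    """Join name and value with the separator; an empty name yields the value only."""
    return value if name == "" else f"{name}{separator}{value}"


lines = [format_literal(n, v) for n, v in [("title", "Faust"), ("", "anonymous")]]
```

Because entity events are ignored, a literal `title` inside an entity `book` is output as `title`, not `book.title`.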

encode-marc21

encode-marcxml

  • description: Encodes a stream into MARCXML. If you can’t ensure valid MARC21 (e.g. the leader isn’t correct or not set as one literal) then set the parameter ensureCorrectMarc21Xml to true.
  • options: ensurecorrectmarc21xml (boolean), emitnamespace (boolean), xmlversion (String), formatted (boolean), xmlencoding (String)
  • signature: StreamReceiver -> String
  • example in Playground
  • java class: org.metafacture.biblio.marc21.MarcXmlEncoder

encode-pica

encode-xml

  • description: Encodes a stream as XML. Defaults: rootTag="records", recordTag="record", no attributeMarker.
  • options: recordtag (String), namespacefile (String), xmlheaderversion (String), writexmlheader (boolean), xmlheaderencoding (String), separateroots (boolean), roottag (String), valuetag (String), attributemarker (String), writeroottag (boolean), namespaces (String)
  • signature: StreamReceiver -> String
  • example in Playground
  • java class: org.metafacture.xml.SimpleXmlEncoder

encode-yaml

extract-element

filter

  • description: Filters a stream based on a morph definition. A record is accepted if the morph returns at least one non-empty value.
  • signature: StreamReceiver -> StreamReceiver
  • java class: org.metafacture.metamorph.Filter

filter-duplicate-objects

filter-null-values

filter-records-by-path

filter-strings

filter-triples

  • description: Filters triples. The patterns for subject, predicate and object are disjunctive.
  • options: predicatepattern (String), objectpattern (String), passmatches (boolean), subjectpattern (String)
  • signature: Triple -> Triple
  • example in Playground
  • java class: org.metafacture.triples.TripleFilter
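
"Disjunctive" means that a triple counts as a match as soon as one of the patterns matches. The Python sketch below illustrates this under three assumptions that are not stated in the source: patterns must match the whole value, patterns that are not set never match, and `passmatches=false` inverts the selection.

```python
import re
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])


def filter_triples(triples, subject_pattern=None, predicate_pattern=None,
                   object_pattern=None, pass_matches=True):
    patterns = {"subject": subject_pattern,
                "predicate": predicate_pattern,
                "object": object_pattern}

    def matches(triple):
        # Disjunction: one matching pattern is enough.
        return any(
            pattern is not None
            and re.fullmatch(pattern, getattr(triple, field)) is not None
            for field, pattern in patterns.items()
        )

    return [t for t in triples if matches(t) == pass_matches]


data = [Triple("r1", "name", "Ada"), Triple("r2", "year", "1815"), Triple("r3", "name", "Bob")]
kept = filter_triples(data, subject_pattern="r1", object_pattern=r"\d+")
```

Here `kept` contains the first triple (subject matches) and the second (object matches).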

find-fix-paths

fix

  • description: Applies a fix transformation to the event stream, given as the path to a fix file or the fixes themselves.
  • options: repeatedfieldstoentities (boolean), strictness [PROCESS, RECORD, EXPRESSION], entitymembername (String), strictnesshandlesprocessexceptions (boolean)
  • signature: StreamReceiver -> StreamReceiver
  • java class: org.metafacture.metafix.Metafix

flatten

from-jdom-document

handle-cg-xml

handle-comarcxml

handle-generic-xml

  • description: A generic XML reader. Separates XML data into distinct records with the defined record tag name (default: recordtagname="record"). If no matching record tag is found, the output will be empty. The handler breaks down XML elements with simple string values and optional attributes into entities with a value subfield (name configurable) and additional subfields for each attribute. Record tag and value tag names can be configured. Attributes can get an attributeMarker.
  • options: emitnamespace (boolean), recordtagname (String), attributemarker (String), valuetagname (String)
  • signature: XmlReceiver -> StreamReceiver
  • example in Playground
  • java class: org.metafacture.xml.GenericXmlHandler
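
The breakdown of an element into an entity with a value subfield plus one subfield per attribute can be approximated in Python. This is a simplified sketch, not the handler's code: it only covers elements directly below the record tag, and the default `value_tag_name="value"` and the empty `attribute_marker` are assumptions.

```python
import xml.etree.ElementTree as ET


def element_to_entity(element, value_tag_name="value", attribute_marker=""):
    """Turn <title lang="en">Faust</title> into ("title", {"lang": "en", "value": "Faust"})."""
    entity = {attribute_marker + name: value for name, value in element.attrib.items()}
    entity[value_tag_name] = element.text or ""
    return element.tag, entity


def read_records(xml_text, record_tag_name="record"):
    root = ET.fromstring(xml_text)
    # Elements outside a matching record tag are ignored.
    return [[element_to_entity(child) for child in record]
            for record in root.iter(record_tag_name)]


doc = '<records><record><title lang="en">Faust</title></record><other/></records>'
found = read_records(doc)
nothing = read_records(doc, record_tag_name="item")
```

With a record tag name that does not occur in the document, the result is empty, as described above.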

handle-mabxml

handle-marcxml

  • description: A MARC XML reader. To read MARC data without namespace specification, set option namespace="". To ignore namespace specification, set option ignorenamespace="true".
  • options: namespace (String), ignorenamespace (boolean), attributemarker (String)
  • signature: XmlReceiver -> StreamReceiver
  • example in Playground
  • java class: org.metafacture.biblio.marc21.MarcXmlHandler

handle-picaxml

jscript

json-to-elasticsearch-bulk

lines-to-records

list-fix-paths

  • description: Lists all paths found in the input records. These paths can be used in a Fix to address fields. Options: count (output occurrence frequency of each path, sorted by highest frequency first; default: true), template (for formatting the internal triple structure; default: ${o} | ${s} if count is true, else ${s}), index (output individual repeated subfields and array elements with index numbers instead of ‘*’; default: false).
  • options: template (String), count (boolean), index (boolean)
  • signature: StreamReceiver -> String
  • example in Playground
  • java class: org.metafacture.metafix.ListFixPaths

list-fix-values

  • description: Lists all values found for the given path. The paths can be found using list-fix-paths. Options: count (output occurrence frequency of each value, sorted by highest frequency first; default: true), template (for formatting the internal triple structure; default: ${o} | ${s} if count is true, else ${s}).
  • options: template (String), count (boolean)
  • signature: StreamReceiver -> String
  • example in Playground
  • java class: org.metafacture.metafix.ListFixValues

literal-to-object

log-object

log-stream

log-stream-time

log-time

map-to-stream

match

merge-batch-stream

merge-same-ids

morph

normalize-unicode-stream

  • description: Normalises composed and decomposed Unicode characters.
  • options: normalizationform [NFD, NFC, NFKD, NFKC], normalizevalues (boolean), normalizeids (boolean), normalizekeys (boolean)
  • signature: StreamReceiver -> StreamReceiver
  • java class: org.metafacture.strings.StreamUnicodeNormalizer
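
The four normalization forms are the standard Unicode ones, so their effect can be shown with Python's `unicodedata` module. The wrapper name `normalize` and its default form are assumptions made for this sketch.

```python
import unicodedata


def normalize(text, form="NFC"):
    """Apply one of the Unicode normalization forms NFD, NFC, NFKD, NFKC."""
    return unicodedata.normalize(form, text)


composed = "\u00e9"      # 'é' as a single code point
decomposed = "e\u0301"   # 'e' followed by a combining acute accent

same_after_nfc = normalize(decomposed) == composed
same_after_nfd = normalize(composed, "NFD") == decomposed
```

Both strings render identically but compare as different until they are normalized to the same form, which is why normalizing keys, values or IDs can matter for lookups and deduplication.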

normalize-unicode-string

object-batch-log

object-tee

object-to-literal

  • description: Outputs a record containing the input object as a literal.
  • options: recordid (String), literalname (String)
  • signature: Object -> StreamReceiver
  • java class: org.metafacture.mangling.ObjectToLiteral

open-file

open-http

  • description: Opens an HTTP resource. Supports setting HTTP header fields Accept, Accept-Charset, Accept-Encoding, Content-Encoding and Content-Type, as well as generic headers (separated by \n). Defaults: request method = GET, request url = @- (input data), request body = @- (input data) if request method supports body and input data not already used, Accept header (accept) = */*, Accept-Charset header (acceptcharset) = UTF-8, errorprefix = ERROR: .
  • options: method [DELETE, GET, HEAD, OPTIONS, POST, PUT, TRACE], contentencoding (String), header (String), [deprecated] encoding (String), body (String), acceptcharset (String), acceptencoding (String), url (String), contenttype (String), accept (String), errorprefix (String)
  • signature: String -> Reader
  • example in Playground
  • java class: org.metafacture.io.HttpOpener
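
The generic `header` option takes several header fields separated by \n. The Python sketch below shows how such a string could be split into individual fields. It assumes that each field has the form `name: value`, which the description above does not spell out; `parse_headers` is a hypothetical helper, not part of the module.

```python
def parse_headers(header_option):
    """Split a '\\n'-separated header string into a name -> value mapping."""
    headers = {}
    for field in header_option.split("\n"):
        # Split at the first colon only, so values may contain colons.
        name, separator, value = field.partition(":")
        if separator:
            headers[name.strip()] = value.strip()
    return headers


parsed = parse_headers("User-Agent: demo\nX-Token: abc:def")
```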

open-oaipmh

  • description: Opens an OAI-PMH stream and passes a reader to the receiver. Mandatory arguments are: BASE_URL, DATE_FROM, DATE_UNTIL, METADATA_PREFIX, SET_SPEC.
  • options: setspec (String), datefrom (String), encoding (String), dateuntil (String), metadataprefix (String)
  • signature: String -> Reader
  • example in Playground
  • java class: org.metafacture.biblio.OaiPmhOpener

open-resource

open-tar

pass-through

print

rdf-macros

read-beacon

read-dir

  • description: Reads a directory and emits all filenames found.
  • options: filenamepattern (String), recursive (boolean)
  • signature: String -> String
  • java class: org.metafacture.files.DirReader

read-string

read-triples

record-to-entity

regex-decode

remodel-pica-multiscript

reorder-triple

  • description: Shifts subjectTo, predicateTo and objectTo around.
  • options: subjectfrom [SUBJECT, PREDICATE, OBJECT], objectfrom [SUBJECT, PREDICATE, OBJECT], predicatefrom [SUBJECT, PREDICATE, OBJECT]
  • signature: Triple -> Triple
  • java class: org.metafacture.triples.TripleReorder
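
Read together with the options, each position of the output triple is filled from a chosen position of the input triple. A Python sketch of that reading follows; the defaults, which leave the triple unchanged, are an assumption.

```python
def reorder(triple, subject_from="SUBJECT", predicate_from="PREDICATE", object_from="OBJECT"):
    """Build a new (subject, predicate, object) tuple from the chosen input positions."""
    parts = {"SUBJECT": triple[0], "PREDICATE": triple[1], "OBJECT": triple[2]}
    return (parts[subject_from], parts[predicate_from], parts[object_from])


swapped = reorder(("s", "p", "o"), subject_from="OBJECT", object_from="SUBJECT")
```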

reset-object-batch

retrieve-triple-objects

  • description: Uses the object value of the triple as a URL and emits a new triple in which the object value is replaced with the contents of the resource identified by the URL.
  • options: defaultencoding (String)
  • signature: Triple -> Triple
  • java class: org.metafacture.triples.TripleObjectRetriever

sleep

  • description: Lets the process sleep for a specific amount of time between objects.
  • options: sleeptime (int), timeunit [NANOSECONDS, MICROSECONDS, MILLISECONDS, SECONDS, MINUTES, HOURS, DAYS]
  • signature: Object -> Object
  • java class: org.metafacture.flowcontrol.ObjectSleeper

sort-triples

  • description: Sorts triples. Several options can be combined, e.g. by="object",numeric="true",order="decreasing" will numerically sort the Object of the triples in decreasing order (given that all Objects are indeed of numeric type).
  • options: by [SUBJECT, PREDICATE, OBJECT, ALL], numeric (boolean), order [INCREASING, DECREASING]
  • signature: Triple -> Triple
  • example in Playground
  • java class: org.metafacture.triples.TripleSort
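
The effect of `numeric` is easiest to see by sorting the same triples twice. The following Python sketch mimics the options described above; it is not the module's code, and the default values of `by` and `order` are assumptions.

```python
FIELDS = {"SUBJECT": 0, "PREDICATE": 1, "OBJECT": 2}


def sort_triples(triples, by="SUBJECT", numeric=False, order="INCREASING"):
    def key(triple):
        if by == "ALL":
            return tuple(triple)
        value = triple[FIELDS[by]]
        # Numeric sorting requires every compared value to parse as a number.
        return float(value) if numeric else value

    return sorted(triples, key=key, reverse=(order == "DECREASING"))


counts = [("a", "count", "9"), ("b", "count", "10"), ("c", "count", "2")]
numeric_desc = sort_triples(counts, by="OBJECT", numeric=True, order="DECREASING")
text_desc = sort_triples(counts, by="OBJECT", order="DECREASING")
```

Sorted numerically, the objects come out as 10, 9, 2; sorted as text, "9" sorts before "2" and "10".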

split-lines

split-xml-elements

  • description: Splits elements (e.g. defining single records) residing in one XML document into multiple single XML documents.
  • options: elementname (String), xmldeclaration (String), toplevelelement (String)
  • signature: XmlReceiver -> StreamReceiver
  • java class: org.metafacture.xml.XmlElementSplitter

stream-count

stream-tee

  • description: Replicates an event stream to an arbitrary number of stream receivers.
  • signature: StreamReceiver -> StreamReceiver
  • java class: org.metafacture.plumbing.StreamTee

stream-to-triples

  • description: Emits the literals which are received as triples such that the name and value become the predicate and the object of the triple. The record id containing the literal becomes the subject. If ‘redirect’ is true, the value of the subject is determined by using either the value of a literal named ‘_id’, or for individual literals by prefixing their name with ‘{to:ID}’. Set ‘recordPredicate’ to encode a complete record in one triple. The value of ‘recordPredicate’ is used as the predicate of the triple. If ‘recordPredicate’ is set, no {to:ID}NAME-style redirects are possible.
  • options: redirect (boolean), recordpredicate (String)
  • signature: StreamReceiver -> Triple
  • example in Playground
  • java class: org.metafacture.triples.StreamToTriples
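
The mapping from literals to triples, including the two redirect mechanisms, can be sketched in Python. This is an interpretation of the description above, not the module's code. In particular, dropping the ‘_id’ literal from the output when `redirect` is true is an assumption, and the `recordPredicate` mode is left out.

```python
import re

REDIRECT = re.compile(r"\{to:([^}]+)\}(.+)")


def record_to_triples(record_id, literals, redirect=False):
    subject = record_id
    if redirect:
        # A literal named '_id' overrides the subject for the whole record.
        for name, value in literals:
            if name == "_id":
                subject = value
    triples = []
    for name, value in literals:
        if redirect and name == "_id":
            continue
        match = REDIRECT.fullmatch(name) if redirect else None
        if match:
            # '{to:ID}NAME' redirects a single literal to subject ID.
            triples.append((match.group(1), match.group(2), value))
        else:
            triples.append((subject, name, value))
    return triples


literals = [("_id", "x1"), ("name", "Ada"), ("{to:x2}name", "Bob")]
plain = record_to_triples("r1", literals)
redirected = record_to_triples("r1", literals, redirect=True)
```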

stream-to-xml

  • description: Encodes a stream as XML. Defaults: rootTag="records", recordTag="record", no attributeMarker.
  • options: recordtag (String), namespacefile (String), xmlheaderversion (String), writexmlheader (boolean), xmlheaderencoding (String), separateroots (boolean), roottag (String), valuetag (String), attributemarker (String), writeroottag (boolean), namespaces (String)
  • signature: StreamReceiver -> String
  • java class: org.metafacture.xml.SimpleXmlEncoder

string-list-map-to-stream

template

  • description: Builds a String from a template and an Object. Provide the template in brackets. ${o} marks the place where the object is to be inserted. If the object is an instance of Triple, ${s}, ${p} and ${o} are used instead.
  • signature: Object -> String
  • example in Playground
  • java class: org.metafacture.formatting.ObjectTemplate
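
The placeholder substitution can be sketched in Python as follows. The `Triple` type and the function name `apply_template` are stand-ins for illustration, not Metafacture classes.

```python
import re
from collections import namedtuple

Triple = namedtuple("Triple", ["subject", "predicate", "object"])


def apply_template(template, obj):
    """Fill ${o} with the object, or ${s}, ${p}, ${o} with the parts of a triple."""
    if isinstance(obj, Triple):
        values = {"s": obj.subject, "p": obj.predicate, "o": obj.object}
    else:
        values = {"o": obj}
    # Placeholders without a value (e.g. ${s} for a plain object) are left as they are.
    return re.sub(r"\$\{([spo])\}",
                  lambda m: str(values.get(m.group(1), m.group(0))),
                  template)


line = apply_template("${o} | ${s}", Triple("title", "count", "3"))
wrapped = apply_template("<${o}>", "x")
```

The template `${o} | ${s}` is the one quoted as default for list-fix-paths and list-fix-values when count is true.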

thread-object-tee

to-jdom-document

triples-to-stream

validate-json

  • description: Validate JSON against a given schema, send only valid input to the receiver. Pass the schema location to validate against. Write valid and/or invalid output to locations specified with writeValid and writeInvalid. Set the JSON key for the record ID value with idKey (for logging output, defaults to id).
  • options: idkey (String), writeinvalid (String), writevalid (String)
  • signature: String -> String
  • java class: org.metafacture.json.JsonValidator

wait-for-inputs

write

  • description: Writes objects to stdout or a file.
  • arguments: [stdout, PATH]
  • options: appendiffileexists (boolean), footer (String), header (String), encoding (String), compression [NONE, AUTO, BZIP2, GZIP, PACK200, XZ], separator (String)
  • signature: Object -> Void
  • java class: org.metafacture.io.ObjectWriter

write-files

  • description: Writes objects to one (or more) file(s).
  • options: appendiffileexists (boolean), footer (String), header (String), encoding (String), compression [NONE, AUTO, BZIP2, GZIP, PACK200, XZ], separator (String)
  • signature: Object -> Void
  • java class: org.metafacture.io.ObjectFileWriter

write-triple-objects

  • description: Writes the object value of the triple into a file. The filename is constructed from subject and predicate. Please note: This module does not check if the filename constructed from subject and predicate stays within baseDir. THIS MODULE SHOULD NOT BE USED IN ENVIRONMENTS IN WHICH THE VALUES OF SUBJECT AND PREDICATE ARE PROVIDED BY AN UNTRUSTED SOURCE!
  • options: encoding (String)
  • signature: Triple -> Void
  • java class: org.metafacture.triples.TripleObjectWriter
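
The warning above concerns path traversal: values such as `../../etc` can move the target outside baseDir. The Python sketch below shows the kind of check the module does not perform. It assumes, for illustration only, that the filename is built as baseDir/subject/predicate; the exact construction is not given in the description, and `safe_target` is a hypothetical helper.

```python
import os


def safe_target(base_dir, subject, predicate):
    """Return the target path, or raise ValueError if it would leave base_dir."""
    base = os.path.realpath(base_dir)
    target = os.path.realpath(os.path.join(base, subject, predicate))
    if os.path.commonpath([base, target]) != base:
        raise ValueError(f"target escapes base directory: {target}")
    return target


inside = safe_target("out", "rec1", "title")
```

A call such as `safe_target("out", "../../etc", "passwd")` raises `ValueError`, whereas the module itself would attempt to write the file.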

write-triples

write-xml-files

  • description: Writes the XML into the filesystem. The filename is constructed from the XPath given as ‘property’. Variables are: target (determines the output directory), property (the element in the XML entity; constitutes the main part of the file’s name), startIndex (a subfolder will be extracted out of the filename; this marks the index’s beginning), stopIndex (a subfolder will be extracted out of the filename; this marks the index’s end).
  • options: endindex (int), startindex (int), property (String), filesuffix (String), encoding (String), compression (String), target (String)
  • signature: StreamReceiver -> Void
  • java class: org.metafacture.xml.XmlFilenameWriter

xml-tee